Out-of-Band (BMC based) Data Center Monitoring DMTF Redfish API Integration with Nagios
HPC Center Planning and Operations
State of the Practice
Time: Monday, November 12th, 11:30am - 11:50am
Description: Nagios is an industry standard for HPC infrastructure monitoring, covering hosts and their associated hardware components, networks, storage, services, and applications. However, traditional Nagios has significant issues: 1) it requires human intervention to define and maintain remote host configurations in Nagios Core; 2) it requires the Nagios Remote Plugin Executor (NRPE) on the Nagios server and on each monitored remote host; 3) it mandates the Nagios Service Check Acceptor (NSCA) on each monitored remote host; and 4) it requires check-specific agents (e.g., SNMP) on each monitored remote host. To address these issues, we have integrated the Distributed Management Task Force (DMTF) Redfish API with Nagios Core. The Redfish API is an open industry standard specification and schema designed to meet end users' expectations for simple, modern, and secure management of scalable platform hardware. It is essentially an out-of-band protocol implemented in the baseboard management controller (BMC) of the system. The Redfish API supports network-based auto-discovery, which is instrumental in automating the configuration of remote hosts. Because the Nagios plugins communicate directly with the BMC, no agent or configuration is needed on the remote hosts. Redfish API integration with Nagios is potentially a major paradigm shift in Nagios-based monitoring in terms of 1) simplifying communication between the Nagios server and monitored hosts, and 2) eliminating the computational cost and complexity of running Nagios native protocols (NRPE and NSCA) and other agents (SNMP) on the monitored hosts.
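
To illustrate the agentless approach described above, here is a minimal sketch of a Nagios plugin that queries a BMC's Redfish service directly and maps the reported health onto the standard Nagios exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). It is not the presenters' implementation. The script name check_redfish.py, the function names, the use of HTTP Basic authentication, and the preference for HealthRollup over Health are all assumptions made for illustration; the resource path (for example /redfish/v1/Systems/1) varies by vendor and must be supplied by the caller.

```python
"""Hypothetical check_redfish.py: agentless Nagios check against a BMC."""
import base64
import json
import ssl
import urllib.request

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}

# Redfish Status.Health values mapped to Nagios states.
HEALTH_TO_NAGIOS = {"OK": OK, "Warning": WARNING, "Critical": CRITICAL}


def evaluate(resource):
    """Turn a Redfish resource (parsed JSON) into (exit_code, status_line)."""
    status = resource.get("Status") or {}
    # Assumption: prefer the rolled-up health of the resource and its children.
    health = status.get("HealthRollup") or status.get("Health")
    code = HEALTH_TO_NAGIOS.get(health, UNKNOWN)
    name = resource.get("Name", "resource")
    state = status.get("State")
    return code, f"REDFISH {LABELS[code]} - {name}: Health={health}, State={state}"


def fetch(bmc, path, user, password, verify_tls=True):
    """GET a Redfish resource from the BMC over HTTPS using Basic auth."""
    ctx = ssl.create_default_context()
    if not verify_tls:
        # Many BMCs ship self-signed certificates; disable checks only knowingly.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        f"https://{bmc}{path}",
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request, context=ctx, timeout=10) as response:
        return json.load(response)


def main(argv):
    """Plugin entry point: argv is [script, bmc, path, user, password]."""
    if len(argv) != 5:
        print("REDFISH UNKNOWN - usage: check_redfish.py BMC PATH USER PASSWORD")
        return UNKNOWN
    try:
        resource = fetch(*argv[1:5])
    except Exception as exc:  # network, TLS, auth, or JSON errors
        print(f"REDFISH UNKNOWN - {exc}")
        return UNKNOWN
    code, message = evaluate(resource)
    print(message)
    return code

# When installed as a plugin, finish the script with: sys.exit(main(sys.argv))
```

Because the check runs entirely on the Nagios server and talks to the BMC over HTTPS, nothing in this sketch depends on NRPE, NSCA, or an SNMP agent being present on the monitored host.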