NRPE stands for Nagios Remote Plugin Executor.
It's a service running on monitored hosts. An Icinga server can connect to it and tell it to locally execute pre-defined commands.
This way Icinga can monitor things that cannot be checked from the outside and instead require running something on the host itself.
Not to be confused with NSCA / passive checks.
For security reasons, passing command arguments from the remote side is not allowed. All commands are predefined.
They can be found in /etc/nagios/nrpe.d/ on the monitored hosts. On the Icinga server side, they are referenced using check_nrpe.
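As an illustration of what those predefined commands look like, here is a minimal Python sketch that extracts them from configuration text. It assumes the usual NRPE syntax of one `command[name]=command line` definition per line. The sample command names and plugin arguments are hypothetical and are not taken from any real host.

```python
import re

# Matches NRPE command definitions of the form: command[name]=command line
COMMAND_RE = re.compile(r"^\s*command\[([^\]]+)\]\s*=\s*(.+?)\s*$")


def parse_nrpe_commands(text):
    """Return {command_name: command_line} for each definition in the text."""
    commands = {}
    for line in text.splitlines():
        # Skip comment lines
        if line.lstrip().startswith("#"):
            continue
        match = COMMAND_RE.match(line)
        if match:
            commands[match.group(1)] = match.group(2)
    return commands


# Hypothetical file content, for illustration only
SAMPLE = """\
# illustrative example
command[check_example_disk]=/usr/lib/nagios/plugins/check_disk -w 10% -c 5%
command[check_example_procs]=/usr/lib/nagios/plugins/check_procs -w 1:1 -C cron
"""

if __name__ == "__main__":
    for name, cmdline in parse_nrpe_commands(SAMPLE).items():
        print(name, "->", cmdline)
```

The command name in square brackets is what the Icinga side passes to check_nrpe. The command line after the equals sign is fixed on the monitored host and cannot be altered remotely.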
The service is called nagios-nrpe-server, so the status check is: systemctl status nagios-nrpe-server.
Sometimes a nagios-nrpe-server process gets killed because of an out-of-memory (OOM) condition or for some other reason. This manifests as ALL the checks executed via NRPE failing with CRIT, with an extra "could not connect to NRPE"-like message. Restarting the service should make them all recover.
To restart it: systemctl restart nagios-nrpe-server.
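To distinguish a dead NRPE daemon from individual failing checks, one option is to test whether the NRPE port accepts TCP connections at all. The following is a rough sketch, not an established procedure. It assumes the daemon listens on the standard NRPE port 5666, and the function name `nrpe_port_open` is made up for this example. A successful connection only shows that something is listening. It does not prove that NRPE is answering checks correctly.

```python
import socket
import sys

NRPE_DEFAULT_PORT = 5666  # standard NRPE port; assumed not overridden locally


def nrpe_port_open(host, port=NRPE_DEFAULT_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures
        return False


if __name__ == "__main__":
    host = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    state = "reachable" if nrpe_port_open(host) else "NOT reachable"
    print(f"NRPE port on {host} is {state}")
```

If the port is not reachable while the host itself is up, checking the service status and restarting it on the monitored host is the likely next step.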