... | ... | @@ -89,6 +89,41 @@ authorized_for_full_command_resolution=user1,foo,bar,<new user> |
authorized_for_configuration_information=user1,foo,bar,<new user>
```
## Pager playbook

### What is this alert anyway?
Say you receive a mysterious alert and have no idea what it is
about. Take, for example, [tpo/tpa/team#40795](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40795):

    09:35:23 <nsa> tor-nagios: [gettor-01] application service - gettor status is CRITICAL: 2: b[AUTHENTICATIONFAILED] Invalid credentials (Failure)
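As a first step, it can help to split such a notification into its parts, since the host and service names are what you will look for in the web interface. The following is only a sketch in Python: the regular expression is an assumption inferred from the single example above, not a documented format, and `parse_alert` is a hypothetical helper.

```python
import re

# Pattern inferred from the one example notification above; this is an
# assumption, not a documented format of the IRC bot.
ALERT_RE = re.compile(
    r"\[(?P<host>[^\]]+)\]\s+"                     # e.g. [gettor-01]
    r"(?P<service>.+?)\s+is\s+"                    # service description
    r"(?P<state>OK|WARNING|CRITICAL|UNKNOWN):\s*"  # Nagios state name
    r"(?P<output>.*)$"                             # plugin output
)

def parse_alert(line):
    """Return the alert fields as a dict, or None if the line does not match."""
    match = ALERT_RE.search(line)
    return match.groupdict() if match else None

alert = parse_alert(
    "09:35:23 <nsa> tor-nagios: [gettor-01] application service - gettor status "
    "is CRITICAL: 2: b[AUTHENTICATIONFAILED] Invalid credentials (Failure)"
)
print(alert["host"], "/", alert["service"], "/", alert["state"])
```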
To figure out what triggered this error, follow this procedure:

 1. log into the Nagios web interface at https://nagios.torproject.org

 2. find the broken service, for example by listing all [unhandled
    problems](https://nagios.torproject.org/cgi-bin/icinga/status.cgi?allunhandledproblems)

 3. click on the service name to see its details

 4. find the "executed command" field and click on "Command Expander"

 5. this shows the "Raw commandline" that Nagios runs to perform the
    check; in this case, it is an NRPE check that calls
    `tor_application_service` on the other end

 6. if it's an NRPE check, log in to the remote host and run the
    command there; otherwise, the command is run on the Nagios host

In this case, the error can be reproduced with:

    root@gettor-01:~# /usr/lib/nagios/plugins/dsa-check-statusfile /srv/gettor.torproject.org/check/status
    2: b'[AUTHENTICATIONFAILED] Invalid credentials (Failure)'
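When reproducing a check by hand, the exit status matters as much as the printed message: by the standard Nagios plugin convention, exit codes 0, 1, 2 and 3 mean OK, WARNING, CRITICAL and UNKNOWN. The sketch below, in Python, runs a command and reports both. `run_check` is a hypothetical helper, and the demo invocation is a stand-in that merely imitates a failing plugin; it is not the actual gettor check.

```python
import subprocess
import sys

# Standard Nagios plugin exit codes.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}

def run_check(argv, timeout=60):
    """Run a plugin command line and return (state, output)."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    # Treat any exit code outside 0-3 as UNKNOWN.
    state = STATES.get(proc.returncode, "UNKNOWN")
    return state, proc.stdout.strip()

# Stand-in for a failing plugin, so the sketch runs anywhere. On the
# affected host, argv would instead be the command line shown by the
# "Command Expander".
fake_plugin = [
    sys.executable,
    "-c",
    "import sys; print('2: Invalid credentials'); sys.exit(2)",
]
state, output = run_check(fake_plugin)
print(f"{state}: {output}")
```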
In this case, the status file appears to be under the control of
the service administrator, who should be contacted for follow-up.

# Reference

## Design