Verified Commit 54ea56d4 authored by anarcat

document how to trace check calls (used in team#40795)

parent 80b07876
@@ -89,6 +89,41 @@ authorized_for_full_command_resolution=user1,foo,bar,<new user>
authorized_for_configuration_information=user1,foo,bar,<new user>
```
## Pager playbook
### What is this alert anyways?
Say you receive a mysterious alert and you have no idea what it's
about. Take, for example, [tpo/tpa/team#40795](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40795):

    09:35:23 <nsa> tor-nagios: [gettor-01] application service - gettor status is CRITICAL: 2: b[AUTHENTICATIONFAILED] Invalid credentials (Failure)

To figure out what triggered this error, follow this procedure:
1. log into the Nagios web interface at https://nagios.torproject.org
2. find the broken service, for example by listing all [unhandled
   problems](https://nagios.torproject.org/cgi-bin/icinga/status.cgi?allunhandledproblems)
3. click on the actual service name to see details
4. find the "executed command" field and click on "Command Expander"
5. this will show you the "Raw commandline" that Nagios runs to do
   this check; in this case it is an NRPE check that calls
   `tor_application_service` on the other end
6. if it's an NRPE check, log into the remote host and run the command
   there; otherwise, the command is run on the Nagios host
In this case, the error can be reproduced with:

    root@gettor-01:~# /usr/lib/nagios/plugins/dsa-check-statusfile /srv/gettor.torproject.org/check/status
    2: b'[AUTHENTICATIONFAILED] Invalid credentials (Failure)'

Here, the status file appears to be under the control of the service
administrator, who should be contacted for follow-up.
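
When reproducing a check by hand, it can help to look at the exit status as well as the printed message, since Nagios plugins conventionally report their state through the exit status (0 is OK, 1 is WARNING, 2 is CRITICAL, 3 is UNKNOWN). The following is a minimal sketch of a wrapper that does this; the function names `nagios_state` and `run_check` are hypothetical helpers made up for this example, not part of Nagios or of the TPA tooling.

```shell
#!/bin/sh
# Hypothetical helper: map a plugin exit status to the conventional
# Nagios state name (0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN).
nagios_state() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "UNKNOWN (non-standard exit status $1)" ;;
    esac
}

# Hypothetical helper: run a check command exactly as given, let its
# own output through, then print the state derived from its exit status.
run_check() {
    "$@"
    status=$?
    echo "state: $(nagios_state "$status") (exit status $status)"
    return "$status"
}
```

With those functions loaded in a shell on the remote host, the check from the example above could be run as `run_check /usr/lib/nagios/plugins/dsa-check-statusfile /srv/gettor.torproject.org/check/status`, which prints the plugin output followed by the derived state.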
# Reference
## Design