Check OnionPerf instances from Nagios

There are a few things that we can check; some are easier than others.

  • Is the host up and the webserver running? (this is easy with built-in checks)
  • Is the tgen server running on the Internet? (this is easy with built-in checks)
  • Is the analyze task running? (needs a plugin)
  • Is the tgen server running on an Onion service? (needs a plugin)

For monitoring the Onion service, I'm looking at reusable plugins, so there are two tests: one checks how old the descriptor is, and a second actually tries connecting to the service. The first test is affected (but not blocked) by legacy/trac#28269, and both are blocked by https://github.com/robgjansen/onionperf/issues/42.
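As a rough illustration of the descriptor-age test, here is a minimal sketch of just the decision logic. It assumes the plugin has already obtained the descriptor's publication time somehow; how to fetch that is not covered here. The function name `classify_descriptor_age` and the 3 hour and 6 hour thresholds are hypothetical placeholders, not values from any existing plugin. The exit codes are the standard Nagios plugin return codes.

```python
from datetime import datetime, timedelta, timezone

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_descriptor_age(published, now,
                            warn=timedelta(hours=3),
                            crit=timedelta(hours=6)):
    """Map a descriptor's age to a (exit_code, message) pair.

    `published` is the descriptor's publication time, or None if no
    descriptor could be found. The default thresholds are assumptions
    chosen for illustration only.
    """
    if published is None:
        return UNKNOWN, "UNKNOWN - no descriptor found"
    age = now - published
    if age >= crit:
        return CRITICAL, f"CRITICAL - descriptor is {age} old"
    if age >= warn:
        return WARNING, f"WARNING - descriptor is {age} old"
    return OK, f"OK - descriptor is {age} old"


if __name__ == "__main__":
    # Example run with a made-up publication time 4 hours in the past.
    now = datetime.now(timezone.utc)
    code, message = classify_descriptor_age(now - timedelta(hours=4), now)
    print(message)
    raise SystemExit(code)
```

A real plugin would print the message on a single line and exit with the returned code, which is what Nagios uses to decide the service state.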

As a workaround for monitoring the Onion service, which really is the bit that is breaking, we can instead monitor the analysis of timeouts from Tor Metrics' CSV files.
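The workaround could look something like the sketch below. It assumes the CSV has `date`, `source`, `server`, `timeouts` and `requests` columns, with comment lines starting with `#`, and that Onion service measurements are marked with `server` set to `onion`; check this against the actual file before relying on it. The function name `check_timeouts`, the thresholds, and the `op-xx` source in the sample data are all made up for illustration.

```python
import csv
import io

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_timeouts(csv_text, source, server="onion", warn=0.05, crit=0.2):
    """Return (exit_code, message) for the latest day's timeout ratio.

    Column names and the warn/crit ratios are assumptions, not taken
    from the Tor Metrics specification.
    """
    # Skip comment lines so DictReader sees the header row first.
    lines = (l for l in io.StringIO(csv_text) if not l.startswith("#"))
    rows = [r for r in csv.DictReader(lines)
            if r["source"] == source and r["server"] == server]
    if not rows:
        return UNKNOWN, f"UNKNOWN - no data for {source}/{server}"
    # ISO dates sort correctly as strings, so max() picks the newest row.
    latest = max(rows, key=lambda r: r["date"])
    requests = int(latest["requests"])
    if requests == 0:
        return UNKNOWN, f"UNKNOWN - no requests on {latest['date']}"
    ratio = int(latest["timeouts"]) / requests
    text = f"{ratio:.1%} timeouts on {latest['date']}"
    if ratio >= crit:
        return CRITICAL, f"CRITICAL - {text}"
    if ratio >= warn:
        return WARNING, f"WARNING - {text}"
    return OK, f"OK - {text}"


# Made-up sample data in the assumed layout.
SAMPLE = (
    "# sample data, not real measurements\n"
    "date,source,server,timeouts,failures,requests\n"
    "2020-01-01,op-xx,onion,1,0,100\n"
    "2020-01-02,op-xx,onion,30,0,100\n"
    "2020-01-02,op-xx,public,0,0,100\n"
)

if __name__ == "__main__":
    code, message = check_timeouts(SAMPLE, "op-xx")
    print(message)
    raise SystemExit(code)
```

Only the most recent day is checked, so one bad day in the past does not keep the check red; whether that is the right window is a judgment call.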