Coordinate Onionperf monitoring alerts
Per @hiro's suggestion on IRC, this is a ticket to review the Monit configuration we have for the onionperf instances, so that we avoid duplicating checks.
At the moment, Monit checks every 5 minutes:
- whether the `onionperf measure` process exists
- whether the `tgen client` process exists
- whether the `tgen server` process exists
- whether the tor and tgen log files are older than 10 minutes
- whether the disk space is more than 80% used
- whether the instances can reach each other on port 8080
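
To make it easier to compare against whatever Prometheus already covers, here is a rough Python sketch of what the checks above amount to. It is not the actual Monit configuration: the function names, the `/proc`-based process matching (Linux only), and the idea that the process command lines contain strings like `onionperf measure` are all assumptions for illustration. Only the thresholds (10 minutes, 80%, port 8080) come from the list above.

```python
import os
import shutil
import socket
import time

# Thresholds taken from the list of Monit checks in this ticket.
LOG_MAX_AGE_SECONDS = 10 * 60
DISK_USAGE_LIMIT = 0.80
PEER_PORT = 8080


def process_running(pattern):
    """Return True if any command line in /proc contains `pattern`.

    Assumption: Linux, and the process is identifiable by a substring
    of its command line (e.g. "onionperf measure").
    """
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/cmdline", "rb") as f:
                cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
        except OSError:
            continue  # process exited or is not readable
        if pattern in cmdline:
            return True
    return False


def log_is_stale(path, max_age=LOG_MAX_AGE_SECONDS, now=None):
    """Return True if the file at `path` was last modified more than `max_age` seconds ago."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path) > max_age


def disk_over_limit(path="/", limit=DISK_USAGE_LIMIT):
    """Return True if the filesystem holding `path` is more than `limit` full."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total > limit


def port_reachable(host, port=PEER_PORT, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Each function maps to one kind of check, so a replacement only needs an equivalent for each of the four: process presence, file age, disk usage, and TCP reachability between instances.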
@hiro, do we have equivalent checks in Prometheus? Do you think we could move to Prometheus and stop using Monit?
Edited by Ana Custura