check-01.torproject.org (aka check.torproject.org) has load issues
check-01 has been warning about load for a long time. nagios has been reporting CRITICAL load for a few hours now, but the alert has actually been flapping for far longer:
Sat. 21:01 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
Sat. 20:16 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
Fri. 22:09 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
Fri. 21:39 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
Wed. 16:22 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
Wed. 14:37 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
July 16 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
July 16 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
July 12 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
July 12 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
July 09 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
July 09 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
July 02 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
June 29 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
June 29 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
June 15 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
June 15 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
June 13 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
June 09 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
May 31 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
May 04 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
April 29 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
April 29 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
April 28 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
April 28 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
April 26 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
April 18 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
April 18 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
April 13 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
April 13 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
April 12 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
April 11 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
April 09 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
April 08 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
April 08 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
March 23 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
March 23 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
March 22 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
March 22 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
March 21 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)
that's a notmuch search for "check-01 load CRITICAL" in my inbox. there are of course many more notifications for WARNING events.
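to get a rough sense of how the flapping is spread over time, the notification list above can be tallied per month. this is only a minimal sketch: the function name `tally_alerts` and the "recent" bucket (for lines stamped with a weekday instead of a month) are made up for illustration, and it assumes the lines look exactly like the ones pasted above.

```python
from collections import Counter

MONTHS = {
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
}

def tally_alerts(lines):
    """Count CRITICAL notifications per month.

    Lines that start with a weekday (e.g. "Sat. 21:01") carry no month,
    so they are grouped under the hypothetical "recent" bucket.
    """
    counts = Counter()
    for line in lines:
        if "is CRITICAL" not in line:
            continue  # skip blank lines and anything that is not an alert
        first = line.split(maxsplit=1)[0]
        counts[first if first in MONTHS else "recent"] += 1
    return counts

# small sample copied from the list above
sample = [
    "Sat. 21:01 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)",
    "July 16 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)",
    "July 12 [1/1] nagios@hetzner-hel1-01.torproject.org ** PROBLEM Service Alert: check-01/load is CRITICAL ** (nagios rapports tor)",
]
for bucket, count in tally_alerts(sample).items():
    print(bucket, count)
```

feeding it the full list instead of the three-line sample gives one count per month plus the "recent" bucket, which is easier to compare against the grafana graphs than the raw notification list.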
according to grafana, things have been particularly bad since june:
https://grafana.torproject.org/d/Z7T7Cfemz/node-exporter-full?orgId=1&var-job=node&var-node=check-01.torproject.org&var-port=9100&from=now-1y&to=now
we can see the "user" CPU usage (in blue) maxing out in april, then a brief pause in may, and it has been basically maxed out since june. memory-wise, there seems to be somewhat less pressure: little swap is used, but then again most of the memory is used by apps, which is suboptimal.