document OutdatedLibraries false positives (prometheus-alerts#20), authored by anarcat
@@ -215,6 +215,26 @@ To check the entire fleet, run this command in [Fabric](howto/fabric):

    fab fleet.pending-restarts

Note that a false alarm occurs regularly here because of the lag
between `needrestart` running after upgrades (from a `dpkg`
post-invoke hook) and the metrics update (from a timer that runs
daily and 2 minutes after boot).
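
The lag described above boils down to a timestamp comparison: the metrics can be wrong whenever the last upgrade is newer than the last metrics run. Here is a minimal sketch of that reasoning. The function name `metrics_are_stale` and its two epoch-second arguments are assumptions made for illustration, not part of the TPA tooling, and how to obtain those timestamps on a real host is deliberately left out.

```shell
#!/bin/sh
# Hypothetical helper, not part of the TPA tooling: succeeds (returns 0)
# when the last package upgrade is newer than the last metrics run,
# i.e. when the exported metrics may lag behind needrestart's state.
metrics_are_stale() {
    last_upgrade=$1   # epoch seconds of the last dpkg run (assumed input)
    last_metrics=$2   # epoch seconds of the last metrics run (assumed input)
    [ "$last_upgrade" -gt "$last_metrics" ]
}

# Example with made-up timestamps: upgrade at t=2000, metrics at t=1000.
if metrics_are_stale 2000 1000; then
    echo "metrics may be stale"
fi
```

With the values reversed (metrics refreshed after the upgrade), the function fails and nothing is printed.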

If a host is showing up in an alert and the above Fabric task says:

    INFO: no host found requiring a restart

then the timer might not have run recently enough. You can diagnose
that with:

    systemctl status tpa-needrestart-prometheus-metrics.timer tpa-needrestart-prometheus-metrics.service

And, normally, fix it with:

    systemctl start tpa-needrestart-prometheus-metrics.service

See [issue `prometheus-alerts#20`](https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/-/issues/20), which tracks getting rid of that false positive.
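
The diagnose-then-fix sequence can also be wrapped in a small shell function. This is only a sketch: the function name `refresh_metrics` and the `SYSTEMCTL` override variable are hypothetical and exist only to make the example easy to dry-run. The unit names are the ones used in this section, and `list-timers` is used here instead of `status` simply to show when the timer last fired.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of the TPA tooling.
# SYSTEMCTL can be overridden (e.g. SYSTEMCTL=echo) for a dry run.
SYSTEMCTL=${SYSTEMCTL:-systemctl}
UNIT=tpa-needrestart-prometheus-metrics

refresh_metrics() {
    # Show when the timer last fired and when it is next due.
    "$SYSTEMCTL" list-timers "$UNIT.timer"
    # Regenerate the metrics right away instead of waiting for the timer.
    "$SYSTEMCTL" start "$UNIT.service"
}
```

Calling `refresh_metrics` as root on the affected host should refresh the metrics. Setting `SYSTEMCTL=echo` first only prints the commands that would be run.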
Packages are blocked from upgrades when they cause significant
breakage during an upgrade run, enough to cause an outage and/or
require significant recovery work. This is done through Puppet, in the
......