check systemd units for failures in Prometheus
Add Prometheus metrics to get a warning when a systemd unit fails. This can be done with the node exporter's node_systemd_unit_state metric, but it requires the systemd collector, which is disabled by default, to be enabled in the node exporter's command-line flags (--collector.systemd).
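As a rough sketch of what such a warning could look like, here is a Prometheus alerting rule built on node_systemd_unit_state. The group name, alert name, 5m delay, and severity label are placeholders, not anything decided in this ticket, and the rule assumes the node exporter already runs with --collector.systemd.

```yaml
# Hypothetical rule file, e.g. loaded through rule_files in prometheus.yml.
groups:
  - name: systemd            # placeholder group name
    rules:
      - alert: SystemdUnitFailed   # placeholder alert name
        # The systemd collector exports one series per unit and per state;
        # the series with state="failed" is 1 while the unit is failed.
        expr: node_systemd_unit_state{state="failed"} == 1
        for: 5m              # assumed grace period, tune as needed
        labels:
          severity: warning  # assumed label convention
        annotations:
          summary: "systemd unit {{ $labels.name }} failed on {{ $labels.instance }}"
```

The "for" clause is there only to avoid firing on units that fail and get restarted right away; whether that is desirable here is an open question.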
Watch out for a cardinality explosion from the detailed per-unit stats; this probably calls for recording rules to drop or aggregate those. This caused an outage (out of disk, #41070 (closed)) in the past.
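One possible shape for the aggregation, as a sketch only: a recording rule that collapses the per-unit series into a single count of failed units per instance. The group name and the recorded metric name are made up for illustration.

```yaml
# Hypothetical recording rule; names below are illustrative assumptions.
groups:
  - name: systemd_aggregates
    rules:
      # One series per instance instead of one per unit and state.
      # sum() over the 0/1 state="failed" series yields the number of
      # failed units, and 0 when none have failed.
      - record: instance:node_systemd_unit_state_failed:sum
        expr: sum by (instance) (node_systemd_unit_state{state="failed"})
```

Note that a recording rule only adds an aggregated series; the raw per-unit series are still ingested and stored. Actually dropping them would have to happen at scrape time, for example with metric_relabel_configs in the scrape configuration, so the disk usage concern likely needs that as well.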
This is the equivalent of NRPE's systemctl is-system-running check.
Spun out of #41639 because it was found to be more complicated than just adding an alert, and higher priority than other checks in #41791 (closed).