Open
Milestone
Oct 1, 2024–Dec 31, 2024
TPA-RFC-33-B: Prometheus server merge, more exporters
Quote from TPA-RFC-33:
In this phase, we integrate more exporters and services in the infrastructure, which includes merging the second Prometheus server for the service admins.
We may retire the existing servers and build two new servers instead, but the more likely outcome is to progressively integrate the targets and alerting rules from prometheus2 into prometheus1 and then eventually retire prometheus2, rebuilding a copy of prometheus1 in its place.
Here are the tasks required for this phase:
- LDAP web password addition (userdir-ldap-cgi#1)
- new authentication deployment on prometheus1 (team#41636)
- cleanup prometheus-alerts: add CI check for team label and regroup alerts/targets by team (prometheus-alerts#15)
- prometheus2 merged into prometheus1 (team#41637)
- priority B metrics and alerts deployment (team#41639)
- self-monitoring: Prometheus scraping Alertmanager, dead man's switch in Karma (team#41641)
- inhibitions (team#41642 (closed))
- once prometheus1 has all the data from prometheus2, retire the latter (team#41638)
- autonomous delivery (team#41644)
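As an illustration of the "CI check for team label" task, here is a minimal sketch of what such a check could look like. It is not the check implemented in prometheus-alerts#15. It assumes a rule file has already been parsed from YAML into a dictionary following the standard Prometheus `groups` / `rules` layout; the function names and the sample rules are hypothetical.

```python
from collections import defaultdict


def alerts_missing_team(rule_file: dict) -> list[str]:
    """Return the names of alerting rules that lack a non-empty `team` label."""
    missing = []
    for group in rule_file.get("groups", []):
        for rule in group.get("rules", []):
            if "alert" not in rule:
                continue  # recording rules have no alert name; skip them
            team = (rule.get("labels") or {}).get("team")
            if not team:
                missing.append(rule["alert"])
    return missing


def alerts_by_team(rule_file: dict) -> dict[str, list[str]]:
    """Group alert names by their `team` label, ignoring unlabeled alerts."""
    grouped = defaultdict(list)
    for group in rule_file.get("groups", []):
        for rule in group.get("rules", []):
            team = (rule.get("labels") or {}).get("team")
            if "alert" in rule and team:
                grouped[team].append(rule["alert"])
    return dict(grouped)


# Hypothetical sample data, standing in for a parsed rule file.
EXAMPLE = {
    "groups": [
        {
            "name": "example",
            "rules": [
                {"alert": "JobDown", "expr": "up == 0", "labels": {"team": "TPA"}},
                {"alert": "DiskFull", "expr": "node_filesystem_avail_bytes < 1e9"},
                {"record": "job:up:sum", "expr": "sum by (job) (up)"},
            ],
        }
    ]
}

if __name__ == "__main__":
    missing = alerts_missing_team(EXAMPLE)
    for name in missing:
        print(f"alert {name} has no team label")
    # A non-zero exit status would make a CI job fail.
    raise SystemExit(1 if missing else 0)
```

Run as a script in a CI job, this exits with a non-zero status when any alert lacks a team label. `alerts_by_team` shows one way the same data could drive the regrouping of alerts by team.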
We hope to continue with this work promptly following phase A, in October 2024.
Follows %TPA-RFC-33-A: emergency Icinga retirement and is followed by %TPA-RFC-33-C: Prometheus high availability, long term metrics, other exporters.
See also the kanban board.