prom: fill more sections (#41655)
authored Oct 07, 2024 by anarcat
service/prometheus.md
View page @ 3879b80f
...
...
@@ -2526,15 +2526,13 @@ No major issue resolved so far is worth mentioning here.
## Maintainers
The Prometheus services have been set up and are managed by anarcat
inside TPA. The internal Prometheus server is mostly used by TPA staff
to diagnose issues. The external Prometheus server is used by various
TPO teams for their own monitoring needs.
inside TPA.
## Users
<!-- TODO who the main users are, how they use the service. possibly reuse -->
<!-- the Personas section in the RFC, if available. -->
<!-- see overlap with above -->
The internal Prometheus server is mostly used by TPA staff
to diagnose issues. The external Prometheus server is used by various
TPO teams for their own monitoring needs.
## Upstream
...
...
@@ -2589,10 +2587,14 @@ policies.
## Tests
Prometheus doesn't have specific tests, but there *is* a test suite in
the upstream Prometheus Puppet module.
The `prometheus-alerts.git` repository has tests that run in GitLab
CI, see the [Testing alerts section](#testing-alerts) on how to write those.
When doing major upgrades, the [Karma dashboard][] should be visited
to make sure it works correctly.
TODO: merge with alertmanager test stuff
There is a test suite in the upstream Prometheus Puppet module as
well, but it's not part of our CI.
## Logs
...
...
@@ -2810,7 +2812,8 @@ Near the end of 2024, Icinga was replaced by Prometheus and
Alertmanager, as part of [TPA-RFC-33][].
TODO: document a little bit how the actual migration went, along with
the three stages and milestones
the three stages and milestones. see overlap with Proposed solutions
above.
Before Icinga was retired, we performed an audit of the notifications
sent from Icinga about our services ([#41791][]) to see if we're
...
...
...
...