prom: fix link references, again (authored by anarcat)
@@ -15,20 +15,20 @@ layer on top (see [Grafana][]).
## Training course plan
- Where can I find documentation? In the wiki, in [Prometheus](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus) and
[Grafana](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/grafana)
- Where can I find documentation? In the wiki, in [Prometheus service
page][] (this page) but also the [Grafana service page][]
- Where do I reach the different web sites for the monitoring service?
See the [web dashboards section](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#web-dashboards)
See the [web dashboards section][]
- Where do I watch for alerts? Join the `#tor-alerts` IRC channel! See
also [how to access alerting history](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#checking-alert-history)
also [how to access alerting history][]
- How can we use silences to prevent some alerts from firing? See
[Silencing an alert in advance](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#silencing-an-alert-in-advance) and following
- [Architecture overview](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#design)
- [Alerting philosophy](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#alerting-philosophy)
- [Adding metrics](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#adding-metrics-to-applications)
- [How to add alerts](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#writing-an-alert)
- [Queries cheat sheet](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#queries-cheat-sheet)
- [Alert debugging](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#alert-debugging):
[Silencing an alert in advance][] and following
- [Architecture overview][]
- [Alerting philosophy][]
- [Adding metrics][]
- [How to add alerts][]
- [Queries cheat sheet][]
- [Alert debugging][]:
- Alert unit tests
- Alert routing tests
- Ensuring the tags required for routing are there
@@ -38,6 +38,18 @@ layer on top (see [Grafana][]).
- %"TPA-RFC-33-B: Prometheus server merge, more exporters"
- %"TPA-RFC-33-C: Prometheus high availability, long term metrics, other exporters"
[Alert debugging]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#alert-debugging
[Queries cheat sheet]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#queries-cheat-sheet
[How to add alerts]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#writing-an-alert
[Adding metrics]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#adding-metrics-to-applications
[Alerting philosophy]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#alerting-philosophy
[Architecture overview]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#design
[Silencing an alert in advance]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#silencing-an-alert-in-advance
[how to access alerting history]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#checking-alert-history
[web dashboards section]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#web-dashboards
[Grafana service page]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/grafana
[Prometheus service page]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus
## Web dashboards
The main Prometheus web interface is available at:
@@ -400,9 +412,11 @@ blackbox exporter to the target at the moment the Prometheus server is scraping
the exporter.
The blackbox exporter is rather peculiar and counter-intuitive; see
the [how to debug the blackbox exporter](#debugging-blackbox-exporter) for
the [how to debug the blackbox exporter][] for
more information.
[how to debug the blackbox exporter]: #debugging-blackbox-exporter
#### Scrape jobs
From Prometheus's point of view, two pieces of information are needed:
@@ -501,9 +515,9 @@ Prometheus targets, except that they define what the blackbox exporter will try
to reach. The targets can be `hostname:port` pairs or URLs, depending on the
type of check being defined.
See [documentation for targets in the
repository](https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/-/blob/main/targets.d/README.md)
for more details.
See [documentation for targets in the repository][] for more details.
[documentation for targets in the repository]: https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/-/blob/main/targets.d/README.md
## Writing an alert
@@ -527,7 +541,9 @@ Prometheus query that should evaluate to "true" (non-zero) for the
alert to fire.
Here is, for example, the first alert in the [`rules.d/tpa_node.rules`
file](https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/-/blob/21d67a21ce9926b2eeef0e14b04bb317fb5c94c0/rules.d/tpa_node.rules):
file][]:
[`rules.d/tpa_node.rules` file]: https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/-/blob/21d67a21ce9926b2eeef0e14b04bb317fb5c94c0/rules.d/tpa_node.rules
```
- alert: JobDown
@@ -672,7 +688,7 @@ built-in functions][].
[Prometheus template reference]: https://prometheus.io/docs/prometheus/latest/configuration/template_reference/
[Alertmanager template reference]: https://prometheus.io/docs/alerting/latest/notifications/
[limited set of built-in functions]: https://pkg.go.dev/text/template#hdr-Functions
[Limited set of built-in functions]: https://pkg.go.dev/text/template#hdr-Functions
[Golang templates]: https://pkg.go.dev/text/template
### Writing a playbook
@@ -840,7 +856,6 @@ space left, to avoid warning about normal write spikes.
[metrics in your application]: #adding-metrics-to-applications
[scraped by Prometheus]: #adding-scrape-targets
[Alerting philosophy]: #alerting-philosophy
[alerting rule]: https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/
[recording rules documentation]: https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/#recording-rules
[aggregation operators]: https://prometheus.io/docs/prometheus/latest/querying/operators/#aggregation-operators
@@ -1024,9 +1039,11 @@ below.
If you can't access the dashboard at all or if the above seems too
complicated, [Grafana][] can be used as a debugging tool for metrics
as well. In the [Explore](https://grafana.torproject.org/explore) section, you can input Prometheus
as well. In the [Explore][] section, you can input Prometheus
metrics, with auto-completion, and inspect the output directly.
[Explore]: https://grafana.torproject.org/explore
There's also the [Grafana availability dashboard][], see the [Alerting
dashboards][] section for details.