Commit 173589f0 (verified), authored by anarcat

prom: fix link references, again

parent 297ee9ed
+36 −19
@@ -15,20 +15,20 @@ layer on top (see [Grafana][]).

 ## Training course plan

-- Where can I find documentation? In the wiki, in [Prometheus](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus) and
-  [Grafana](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/grafana)
+- Where can I find documentation? In the wiki, in [Prometheus service
+  page][] (this page) but also the [Grafana service page][]
 - Where do I reach the different web sites for the monitoring service?
-  See the [web dashboards section](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#web-dashboards)
+  See the [web dashboards section][]
 - Where do i watch for alerts? Join the `#tor-alerts` IRC channel! See
-  also [how to access alerting history](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#checking-alert-history)
+  also [how to access alerting history][]
 - How can we use silences to prevent some alerts from firing? See
-  [Silencing an alert in advance](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#silencing-an-alert-in-advance) and following
-- [Architecture overview](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#design)
-- [Alerting philosophy](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#alerting-philosophy)
-- [Adding metrics](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#adding-metrics-to-applications)
-- [How to add alerts](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#writing-an-alert)
-- [Queries cheat sheet](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#queries-cheat-sheet)
-- [Alert debugging](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#alert-debugging):
+  [Silencing an alert in advance][] and following
+- [Architecture overview][]
+- [Alerting philosophy][]
+- [Adding metrics][]
+- [How to add alerts][]
+- [Queries cheat sheet][]
+- [Alert debugging][]:
   - Alert unit tests
   - Alert routing tests
   - Ensuring the tags required for routing are there
@@ -38,6 +38,18 @@ layer on top (see [Grafana][]).
   - %"TPA-RFC-33-B: Prometheus server merge, more exporters"
   - %"TPA-RFC-33-C: Prometheus high availability, long term metrics, other exporters"

+ [Alert debugging]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#alert-debugging
+ [Queries cheat sheet]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#queries-cheat-sheet
+ [How to add alerts]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#writing-an-alert
+ [Adding metrics]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#adding-metrics-to-applications
+ [Alerting philosophy]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#alerting-philosophy
+ [Architecture overview]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#design
+ [Silencing an alert in advance]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#silencing-an-alert-in-advance
+ [how to access alerting history]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#checking-alert-history
+ [web dashboards section]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus#web-dashboards
+ [Grafana service page]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/grafana
+ [Prometheus service page]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/prometheus
+
 ## Web dashboards

 The main Prometheus web interface is available at:
@@ -400,9 +412,11 @@ blackbox exporter to the target at the moment the Prometheus server is scraping
 the exporter.

 The blackbox exporter is rather peculiar and counter-intuitive, see
-the [how to debug the blackbox exporter](#debugging-blackbox-exporter) for
+the [how to debug the blackbox exporter][] for
 more information.

+ [how to debug the blackbox exporter]: #debugging-blackbox-exporter
+
 #### Scrape jobs

 In Prometheus's point of view, two information are needed:
@@ -501,9 +515,9 @@ Prometheus targets, except that they define what the blackbox exporter will try
 to reach. The targets can be `hostname:port` pairs or URLs, depending on the
 nature of the type of check being defined.

-See [documentation for targets in the
-repository](https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/-/blob/main/targets.d/README.md)
-for more details
+See [documentation for targets in the repository][] for more details

+ [documentation for targets in the repository]: https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/-/blob/main/targets.d/README.md
+
 ## Writing an alert

@@ -527,7 +541,9 @@ Prometheus query that should evaluate to "true" (non-zero) for the
 alert to fire.

 Here is, for example, the first alert in the [`rules.d/tpa_node.rules`
-file](https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/-/blob/21d67a21ce9926b2eeef0e14b04bb317fb5c94c0/rules.d/tpa_node.rules):
+file][]:

+ [`rules.d/tpa_node.rules` file]: https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/-/blob/21d67a21ce9926b2eeef0e14b04bb317fb5c94c0/rules.d/tpa_node.rules
+
 ```
   - alert: JobDown
@@ -672,7 +688,7 @@ built-in functions][].

  [Prometheus template reference]: https://prometheus.io/docs/prometheus/latest/configuration/template_reference/
  [Alertmanager template reference]: https://prometheus.io/docs/alerting/latest/notifications/
- [limited set of built-in functions]: https://pkg.go.dev/text/template#hdr-Functions
+ [Limited set of built-in functions]: https://pkg.go.dev/text/template#hdr-Functions
  [Golang templates]: https://pkg.go.dev/text/template

 ### Writing a playbook
@@ -840,7 +856,6 @@ space left, to avoid warning about normal write spikes.

 [metrics in your application]: #adding-metrics-to-applications
 [scraped by Prometheus]: #adding-scrape-targets
-[Alerting philosophy]: #alerting-philosophy
 [alerting rule]: https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/
 [recording rules documentation]: https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/#recording-rules
 [aggregation operators]: https://prometheus.io/docs/prometheus/latest/querying/operators/#aggregation-operators
@@ -1024,9 +1039,11 @@ below.

 If you can't access the dashboard at all or if the above seems too
 complicated, [Grafana][] can be used as a debugging tool for metrics
-as well. In the [Explore](https://grafana.torproject.org/explore) section, you can input Prometheus
+as well. In the [Explore][] section, you can input Prometheus
 metrics, with auto-completion, and inspect the output directly.

+ [Explore]: https://grafana.torproject.org/explore
+
 There's also the [Grafana availability dashboard][], see the [Alerting
 dashboards][] section for details.