expand on the labels examples (#41868), authored by anarcat
@@ -834,7 +834,7 @@ exporter, the Prometheus scrape configuration, or alerting rule:
| `severity` | `warning` or `critical` | `warning` | `warning` | `warning` |
| `instance` | `host:port` | `web-fsn-01.torproject.org:9100` | `bacula-director-01.torproject.org:9133` | `localhost:9115` |
| `alias` | `host` | `web-fsn-01.torproject.org` | `web-fsn-01.torproject.org` | `web-fsn-01.torproject.org` |
| `target` | target used by blackbox | _not produced_ | _not_produced_ | `www.torproject.org` |
| `target` | target used by blackbox | _not produced_ | _not produced_ | `www.torproject.org` |
[List of team names]: https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/-/blob/main/.pint.hcl?ref_type=heads#L13-21
@@ -842,24 +842,49 @@ Some notes about the lines of the table above:
- `team`: which group to contact for this alert, which affects how
alerts get routed. See [List of team names][]
- `severity`: affects alert routing. Use `warning` unless the alert absolutely
needs immediate attention. [TPA-RFC-33][] defines the [alert levels][] as:
- `warning` (new): non-urgent condition, requiring investigation and
fixing, but not immediately, no user-visible impact; example: server needs
to be rebooted
- `critical`: serious condition with disruptive user-visible impact
which requires prompt response; example: donation site returns 500 errors
- `instance`: host name and port that prometheus used for scraping.
- `alias`: FQDN of the host concerned by the scraped metrics. For example, for a
blackbox check, this would be the host that serves an HTTPS website we're
getting information about. For backups, this would be the machine that is
getting backed up.
- `instance`: host name and port that Prometheus used for scraping.
For example, for the node exporter it is port 9100 on the monitored
host, but for other exporters, it might be another host running the
exporter.
As another example, for the blackbox exporter, it is port 9115 on
the host running the blackbox exporter (`localhost` by default, but
there's also a blackbox exporter running to monitor the Redis tunnel
on the [donate service](service/donate)).
For backups, the exporter runs on the Bacula director, so the
instance is `bacula-director-01.torproject.org:9133`.
- `alias`: FQDN of the host concerned by the scraped metrics.
For example, for a blackbox check, this would be the host that
serves an HTTPS website we're getting information about. For
backups, this would be the FQDN of the machine that is getting
backed up.
This is *not* the same as "`instance` without a `port`", as this
does *not* point to the exporter.
- `target`: in the case of a blackbox alert, the actual target being checked.
Can be for example the full URL, or the smtp host name and port, etc.
- Note that for URLs, we prefer to rely on the blackbox module to determine the
scheme that's used for http/https checks, so we recommend to set the target
without the scheme prefix (e.g. no `https://` prefix). This lets us link
https alerts to http ones in alert inhibitions.
Can be for example the full URL, or the SMTP host name and port, etc.
Note that for URLs, we rely on the blackbox module to determine the
scheme that's used for HTTP/HTTPS checks, so we set the target
without the scheme prefix (e.g. no `https://` prefix). This lets us
link HTTPS alerts to HTTP ones in alert inhibitions.
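
As an illustration of how these labels can end up on a blackbox probe, here
is a minimal sketch of a scrape configuration. It is not the actual TPA
configuration: the job name, the `http_2xx` module name and the static
`alias` value are assumptions, chosen only to reproduce the last column of
the table above.

```yaml
scrape_configs:
  # Hypothetical job name, for illustration only.
  - job_name: blackbox_https_example
    metrics_path: /probe
    params:
      # Assumed module name; the module decides how the target is probed.
      module: [http_2xx]
    static_configs:
      - targets:
          # Target without a scheme prefix, as recommended above.
          - www.torproject.org
        labels:
          # Host concerned by the check, not the exporter.
          alias: web-fsn-01.torproject.org
    relabel_configs:
      # Pass the listed target to the exporter as the ?target= parameter.
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the probed target as a regular label on the metrics.
      - source_labels: [__param_target]
        target_label: target
      # Scrape the blackbox exporter itself instead of the target.
      - target_label: __address__
        replacement: localhost:9115
```

With such a configuration, `instance` would be `localhost:9115`, because
Prometheus fills it from the final `__address__` when nothing else sets it,
while `target` keeps the probed name and `alias` names the host serving the
site.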
### Annotations