add example of vector matching, authored by anarcat
@@ -993,12 +993,21 @@ day:
sum_over_time(ALERTS{alertname="MemFullSoon"}[1d:1s])
[HTTP Status code associated with blackbox probe failures][]
sort((probe_success{job="blackbox_https_200"} < 1) + on (alias) group_right probe_http_status_code)
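The latter is an example of [vector matching](https://prometheus.io/docs/prometheus/latest/querying/operators/#vector-matching), which allows you to
"join" multiple metrics together, in this case failed probes
(`probe_success < 1`) with their status code (`probe_http_status_code`).
A failed probe has a value of 0, so adding the two leaves the status
code as the result. `on (alias)` matches series on the `alias` label
only, and `group_right` allows several status code series to match a
single failed probe.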
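
To make the matching semantics more concrete, here is a rough Python
sketch of what `+ on (alias) group_right` does to two instant vectors.
This is not how Prometheus implements it: the function
`add_on_group_right`, the sample host names and the sample values are
all made up for illustration.

```python
def add_on_group_right(left, right, on):
    """Roughly emulate `left + on (<on>) group_right right`.

    Each vector is a list of (labels, value) pairs, labels being a dict.
    """
    # Index the "one" side (left) by the matching labels. As in PromQL,
    # two left-hand series with the same matching labels are an error.
    index = {}
    for labels, value in left:
        key = tuple(labels.get(name, "") for name in on)
        if key in index:
            raise ValueError(f"duplicate series on the 'one' side: {key}")
        index[key] = value

    # Each series on the "many" side (right) keeps its own labels and is
    # dropped when nothing on the left matches.
    result = []
    for labels, value in right:
        key = tuple(labels.get(name, "") for name in on)
        if key in index:
            result.append((labels, index[key] + value))
    return result


# Hypothetical sample data: one failed probe (value 0) and two status codes.
failed_probes = [
    ({"alias": "web-01", "job": "blackbox_https_200"}, 0.0),
]
status_codes = [
    ({"alias": "web-01", "instance": "https://a.example/"}, 503.0),
    ({"alias": "web-02", "instance": "https://b.example/"}, 200.0),
]

joined = add_on_group_right(failed_probes, status_codes, on=["alias"])
# sort() in PromQL orders by ascending sample value.
joined.sort(key=lambda pair: pair[1])
print(joined)
```

With this sample data, only the `web-01` series survives the join, and
it carries the status code as its value because the failed probe
contributes 0 to the sum.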
[availability dashboard]: https://grafana.torproject.org/d/adwbl8mxnaneoc/availability?var-alertstate=All
[Currently firing alerts]: https://prometheus.torproject.org/graph?g0.expr=ALERTS{alertstate%3D"firing"}
[Unreachable hosts]: https://prometheus.torproject.org/graph?g0.expr=up{job%3D"node"}+!%3D+1
[How much time was the given service (`node` job, in this case) `up` in the past period (`30d`)]: https://prometheus.torproject.org/graph?g0.expr=avg(avg_over_time(up{job%3D"node"}[30d]))
[How many hosts are online at any given point in time]: https://prometheus.torproject.org/graph?g0.expr=sum(count(up%3D=1))/sum(count(up))+by+(alias)
[How long did an alert fire over a given period of time]: https://prometheus.torproject.org/graph?g0.expr=sum_over_time(ALERTS{alertname%3D"MemFullSoon"}[1d:1s])
[HTTP Status code associated with blackbox probe failures]: https://prometheus.torproject.org/classic/graph?g0.range_input=1h&g0.expr=sort%28%28probe_success%7Bjob%3D%22blackbox_https_200%22%7D+%3C+1%29+%2B+on+%28alias%29+group_right+probe_http_status_code%29&g0.tab=1
### Inventory