although those are limited enough that we use a separate graphing
layer on top (see [[Grafana]]).
+[Prometheus]: https://prometheus.io/
Basic design
------------
...
...
@@ -17,12 +19,16 @@ The Prometheus web interface is available at:
<https://prometheus.torproject.org>
A simple query you can try is to pick any metric in the list and click
-`Execute`. For example, [this link](https://prometheus1.torproject.org/graph?g0.range_input=2w&g0.expr=node_load5&g0.tab=0) will show the 5-minute load
+`Execute`. For example, [this link][] will show the 5-minute load
over the last two weeks for the known servers.
Here you can see, from the [Prometheus overview documentation](https://prometheus.io/docs/introduction/overview/) the
drawing of Prometheus' architecture, showing the push gateway and
exporters adding metrics, service discovery through file_sd and
...
...
@@ -30,13 +36,15 @@ Kubernetes, alerts pushed to the Alertmanager and the various UIs
pulling from Prometheus" />
As you can see, Prometheus is somewhat tailored towards
-[Kubernetes](https://kubernetes.io/) but it can be used without it. We're deploying it with
+[Kubernetes][] but it can be used without it. We're deploying it with
the `file_sd` discovery mechanism, where Puppet collects all exporters
into the central server, which then scrapes those exporters every
`scrape_interval` (by default 15 seconds). The architecture graph also
shows the Alertmanager which could be used to (eventually) replace our
Nagios deployment.
+[Kubernetes]: https://kubernetes.io/
It does not show that Prometheus can federate to multiple instances
and the Alertmanager can be configured with High availability.
...
...
@@ -125,9 +133,15 @@ upstream Puppet module to install Prometheus using backported Debian
packages. The monitoring server itself is defined in
`roles::monitoring`.
-The [Prometheus Puppet module](https://github.com/voxpupuli/puppet-prometheus/) was patched to [allow scrape job
-collection](https://github.com/voxpupuli/puppet-prometheus/pull/304) and [use of Debian packages for installation](https://github.com/voxpupuli/puppet-prometheus/pull/303). Much
-of the initial Prometheus configuration was also documented in [ticket
-#29681](https://trac.torproject.org/projects/tor/ticket/29681) and especially [ticket #29388](https://trac.torproject.org/projects/tor/ticket/29388) which investigates
+The [Prometheus Puppet module][] was patched to [allow scrape job
+collection][] and [use of Debian packages for installation][]. Much of
+the initial Prometheus configuration was also documented in
+[ticket #29681][] and especially [ticket #29388][] which investigates
storage requirements and possible alternatives for data retention