... | ... | @@ -170,13 +170,38 @@ Basically, Prometheus is similar to Munin in many ways: |
|
|
without sending duplicate alerts - `munin-limits` can only run on a
|
|
|
single server
|
|
|
|
|
|
## Push metrics to the Pushgateway
|
|
|
|
|
|
The [Pushgateway][] is set up on the secondary Prometheus server
|
|
|
(`prometheus2`). Note that you might not need to use the Pushgateway,
|
|
|
see the [article about pushing metrics](https://prometheus.io/docs/practices/pushing/) before going down this route.
|
|
|
|
|
|
The Pushgateway is fairly particular: it listens on port 9091 and gets
|
|
|
data through a fairly simple [curl-friendly commandline](https://github.com/prometheus/pushgateway#command-line)
|
|
|
[API](https://github.com/prometheus/pushgateway#api). We have found that, once installed, this command just "does
|
|
|
the right thing", more or less:
|
|
|
|
|
|
echo 'some_metrics{foo="bar"} 3.14' | curl --data-binary @- http://localhost:9091/metrics/job/jobtest/instance/instancetest
|
|
|
|
|
|
To confirm the data was ingested by the Pushgateway, this can be
|
|
|
done:
|
|
|
|
|
|
curl localhost:9091/metrics | head
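The push command above can be wrapped in a small helper so the job and instance parts of the URL are not retyped each time. This is only a sketch: the `PUSHGATEWAY_URL` variable and the `pushgateway_path` and `push_metric` function names are assumptions made for illustration, not something deployed on `prometheus2`.

```shell
#!/bin/sh
# Sketch only: PUSHGATEWAY_URL and both function names are hypothetical.
PUSHGATEWAY_URL="${PUSHGATEWAY_URL:-http://localhost:9091}"

# Build the push URL for a job/instance grouping key.
# $1 = job name, $2 = instance name
pushgateway_path() {
    printf '%s/metrics/job/%s/instance/%s\n' "$PUSHGATEWAY_URL" "$1" "$2"
}

# Send one sample as a single "name value" line.
# $1 = job, $2 = instance, $3 = metric name (with optional labels), $4 = value
push_metric() {
    printf '%s %s\n' "$3" "$4" |
        curl --silent --show-error --fail --data-binary @- \
            "$(pushgateway_path "$1" "$2")"
}
```

With those functions loaded, `push_metric jobtest instancetest 'some_metrics{foo="bar"}' 3.14` should be equivalent to the one-liner above. A pushed group stays in the Pushgateway until it is removed, which can be done with `curl -X DELETE "$(pushgateway_path jobtest instancetest)"`.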
|
|
|
|
|
|
The Pushgateway is scraped, like other Prometheus jobs, every minute,
|
|
|
with metrics kept for a year, at the time of writing. This is
|
|
|
configured, inside Puppet, in `profile::prometheus::server::external`.
|
|
|
|
|
|
Note that it's [not possible to push timestamps](https://github.com/prometheus/pushgateway#about-timestamps) into the
|
|
|
Pushgateway, so it cannot be used to ingest historical data.
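Since a sample cannot carry its own timestamp, a common workaround is to push the time of the last successful run as the metric's value instead. The sketch below illustrates that idea for a batch job. The metric names, the `example_job` job name and the `render_job_metrics` and `run_and_push` functions are all hypothetical, and nothing here is taken from the actual setup.

```shell
#!/bin/sh
# Sketch only: metric names, job name and function names are hypothetical.
PUSHGATEWAY_URL="${PUSHGATEWAY_URL:-http://localhost:9091}"

# Render the payload: completion time and duration of one run.
# $1 = Unix time of completion, $2 = duration in seconds
render_job_metrics() {
    printf '# TYPE example_job_last_success_timestamp_seconds gauge\n'
    printf 'example_job_last_success_timestamp_seconds %s\n' "$1"
    printf '# TYPE example_job_duration_seconds gauge\n'
    printf 'example_job_duration_seconds %s\n' "$2"
}

# Run a command and push its metrics only if it succeeded.
# "$@" = the command to run
run_and_push() {
    start=$(date +%s)
    "$@" || return 1
    end=$(date +%s)
    render_job_metrics "$end" "$((end - start))" |
        curl --silent --show-error --fail --data-binary @- \
            "$PUSHGATEWAY_URL/metrics/job/example_job"
}
```

A call such as `run_and_push /usr/local/bin/some-batch-job` (a placeholder path) would then leave the completion time in Prometheus as a plain gauge, which a query can compare against `time()` to see how long ago the job last succeeded.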
|
|
|
|
|
|
# Reference
|
|
|
|
|
|
## Installation
|
|
|
|
|
|
### Puppet implementation
|
|
|
|
|
|
Every node is configured as a `node-exporter` through the
|
|
|
Every TPA server is configured as a `node-exporter` through the
|
|
|
`roles::monitored` that is included everywhere. The role might
|
|
|
eventually be expanded to cover alerting and other monitoring
|
|
|
resources as well. This role, in turn, includes the
|
... | ... | @@ -202,6 +227,21 @@ policies. |
|
|
[allow scrape job collection]: https://github.com/voxpupuli/puppet-prometheus/pull/304
|
|
|
[Prometheus Puppet module]: https://github.com/voxpupuli/puppet-prometheus/
|
|
|
|
|
|
### Pushgateway
|
|
|
|
|
|
The [Pushgateway][] was configured on the external Prometheus server
|
|
|
to allow the metrics people to push their data into Prometheus
|
|
|
without having to write a Prometheus exporter inside Collector.
|
|
|
|
|
|
[Pushgateway]: https://github.com/prometheus/pushgateway
|
|
|
|
|
|
This was done directly inside the
|
|
|
`profile::prometheus::server::external` class, but could be moved to a
|
|
|
separate profile if it needs to be deployed internally. It is assumed
|
|
|
that the gateway script will run directly on `prometheus2` to avoid
|
|
|
setting up authentication and/or firewall rules, but this could be
|
|
|
changed.
|
|
|
|
|
|
### Manual node configuration
|
|
|
|
|
|
External services can be monitored by Prometheus, as long as they
|
... | ... | @@ -321,6 +361,14 @@ Nagios deployment. |
|
|
It does not show that Prometheus can federate to multiple instances
|
|
|
and that the Alertmanager can be configured for high availability.
|
|
|
|
|
|
## Pushgateway
|
|
|
|
|
|
The [Pushgateway][] is a separate server from the main Prometheus
|
|
|
server that is designed to "hold" onto metrics for ephemeral jobs that
|
|
|
would otherwise not be around long enough for Prometheus to scrape their
|
|
|
metrics. We use it as a workaround to bridge Metrics data with
|
|
|
Prometheus/Grafana.
|
|
|
|
|
|
## Issues
|
|
|
|
|
|
There is no issue tracker specifically for this project. [File][new-ticket] or
|
... | ... | |