howto/prometheus.md +59 −13
@@ -72,8 +72,11 @@ Once you have an exporter endpoint (say at
     curl http://example.com:9090/metrics
 
 This should return a number of metrics that change (or not) at each
-call. From there on, provide that endpoint to the sysadmins, which
-will follow the next procedure to add the metric to Prometheus.
+call.
+
+From there on, provide that endpoint to the sysadmins (or someone with
+access to the external monitoring server), which will follow the
+procedure below to add the metric to Prometheus.
 
 Once the exporter is hooked into Prometheus, you can browse the
 metrics directly at: <https://prometheus.torproject.org>. Graphs
@@ -82,7 +85,41 @@ those need to be created and committed into git by sysadmins to
 persist, see the [anarcat dashboard directory](https://gitlab.com/anarcat/grafana-dashboards) for more
 information.
 
-## Adding metrics for admins
+## Adding targets on the external server
+
+Alerts and scrape targets on the external server are managed through a
+Git repository called [prometheus-alerts](https://gitlab.torproject.org/tpo/tpa/prometheus-alerts). To
+add a scrape target:
+
+1. clone the repository
+
+        git clone https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/
+        cd prometheus-alerts
+
+2. assuming you're adding a node exporter, to add the target:
+
+        cat > targets.d/node_myproject.yaml <<EOF
+        # scrape the external node exporters for project Foo
+        ---
+        - targets:
+          - targetone.example.com
+          - targettwo.example.com
+        EOF
+
+3. add, commit, and push:
+
+        git checkout -b myproject
+        git add targets.d
+        git commit -m"add node exporter targets for my project"
+        git push origin -u myproject
+
+The last push command should show you the URL where you can submit
+your merge request.
+
+After being merged, the changes should propagate within [4 to 6 hours](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/puppet/#cron-and-scheduling).
+See also the [targets.d documentation in the git repository](https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/-/tree/main/targets.d).
+
+## Adding targets on the internal server
 
 TODO: talk about `scrape_jobs` for in-puppet configurations.
 
@@ -118,11 +155,12 @@ Alerting Overview](https://prometheus.io/docs/alerting/latest/overview/) but I h
 have instead been following [this tutorial](https://ashish.one/blogs/setup-alertmanager/) which was quite
 helpful.
 
-### Adding alerts
+### Adding alerts in Puppet
 
-The Alertmanager is currently managed through Puppet, in
-`profile::prometheus::server::external`. An alerting rule is defined
-like:
+The Alertmanager can be (but currently isn't, on the external server)
+managed through Puppet, in `profile::prometheus::server::external`.
+
+An alerting rule, in Puppet, is defined like:
 
     {
       'name' => 'bridgestrap',
@@ -146,13 +184,21 @@ like:
       ],
     },
 
-The key part of the alert is the `expr` setting which is a PromQL
-expression that, when evaluated to "true" for more than `5m` (the
-`for` settings), will fire an error at the Alertmanager. Also note
-the `team` label which will route the message to the right team.
-Those routes are defined later, in the `routes` and `receivers` settings.
+Note that we might want to move those to Hiera so that we could use
+YAML code directly, which would better match the syntax of the actual
+alerting rules.
+
+### Adding alerts through Git, on the external server
 
+The external server pulls a [git repository](https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/)
+for alerting and targets regularly. Alerts can be added through that
+repository by adding a file in the `rules.d` directory, see the [rules.d](https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/-/tree/main/rules.d)
+directory for more documentation on that.
+
+Note that alerts (probably?) do not take effect until a sysadmin
+reloads Prometheus.
+
-Note that those might move to separate files and/or Hiera later on.
+TODO: confirm how rules are deployed.
 
 ### Adding alert recipients
 