anarcat · 817986e9
--- a/howto/prometheus.md
+++ b/howto/prometheus.md
@@ -72,8 +72,11 @@ Once you have an exporter endpoint (say at
    curl http://example.com:9090/metrics

 This should return a number of metrics that change (or not) at each
-call. From there on, provide that endpoint to the sysadmins, which
-will follow the next procedure to add the metric to Prometheus.
+call.
+
+From there, provide that endpoint to the sysadmins (or someone with
+access to the external monitoring server), who will follow the
+procedure below to add the metric to Prometheus.

 Once the exporter is hooked into Prometheus, you can browse the
 metrics directly at: <https://prometheus.torproject.org>. Graphs
@@ -82,7 +85,41 @@ those need to be created and committed into git by sysadmins to
 persist, see the [anarcat dashboard directory](https://gitlab.com/anarcat/grafana-dashboards) for more
 information.

-## Adding metrics for admins
+## Adding targets on the external server
+
+Alerts and scrape targets on the external server are managed through a
+Git repository called [prometheus-alerts](https://gitlab.torproject.org/tpo/tpa/prometheus-alerts). To add a scrape target:
+
+ 1. clone the repository
+
+        git clone https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/
+        cd prometheus-alerts
+
+ 2. add the target, assuming here that you're adding a node exporter:
+
+        cat > targets.d/node_myproject.yaml <<EOF
+        # scrape the external node exporters for project Foo
+        ---
+        - targets:
+          - targetone.example.com
+          - targettwo.example.com
+        EOF
+
+ 3. add, commit, and push:
+
+        git checkout -b myproject
+        git add targets.d
+        git commit -m"add node exporter targets for my project"
+        git push origin -u myproject
+
+The last push command should show you the URL where you can submit
+your merge request.
+
+After being merged, the changes should propagate within [4 to 6
+hours](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/puppet/#cron-and-scheduling).
+
+See also the [targets.d documentation in the git repository](https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/-/tree/main/targets.d).
+
+## Adding targets on the internal server

 TODO: talk about `scrape_jobs` for in-puppet configurations.

@@ -118,11 +155,12 @@ Alerting Overview](https://prometheus.io/docs/alerting/latest/overview/) but I h
 have instead been following [this tutorial](https://ashish.one/blogs/setup-alertmanager/) which was quite
 helpful.

-### Adding alerts
+### Adding alerts in Puppet

-The Alertmanager is currently managed through Puppet, in
-`profile::prometheus::server::external`. An alerting rule is defined
-like:
+The Alertmanager can be (but currently isn't, on the external server)
+managed through Puppet, in `profile::prometheus::server::external`.
+
+An alerting rule, in Puppet, is defined like:

        {
          'name' => 'bridgestrap',
@@ -146,13 +184,21 @@ like:
          ],
        },

-The key part of the alert is the `expr` setting which is a PromQL
-expression that, when evaluated to "true" for more than `5m` (the
-`for` settings), will fire an error at the Alertmanager. Also note
-the `team` label which will route the message to the right team. Those
-routes are defined later, in the `routes` and `receivers` settings.
+Note that we might want to move those to Hiera so that we could use
+YAML code directly, which would better match the syntax of the actual
+alerting rules.
+
+### Adding alerts through Git, on the external server
+
+The external server regularly pulls a [git repository](https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/) for alerting and
+targets. Alerts can be added through that repository by
+adding a file in the `rules.d` directory; see the [rules.d](https://gitlab.torproject.org/tpo/tpa/prometheus-alerts/-/tree/main/rules.d) directory
+for more documentation on that.
+
+Note that alerts (probably?) do not take effect until a sysadmin
+reloads Prometheus.

-Note that those might move to separate files and/or Hiera later on.
+TODO: confirm how rules are deployed.

 ### Adding alert recipients