Verified commit 87cf79f9, authored by anarcat

GitLab CI runner monitoring documentation (team#41042)

parent 06c207d9
......@@ -694,6 +694,10 @@ runner](https://gitlab.com/gitlab-org/gitlab-runner/-/issues) and a [project pag
* [kept artifacts cannot be unkept](https://gitlab.com/gitlab-org/gitlab/-/issues/289954)
* GitLab doesn't track [wait times for jobs](https://gitlab.com/groups/gitlab-org/-/epics/10630); we approximate this
by tracking queue size and runner-specific metrics such as
concurrency limit hits
## Monitoring and testing
To test a runner, it can be registered only with a project, to run
......@@ -717,18 +721,18 @@ example, `ci_pending_builds` shows the size of the queue,
etc. Those are visible in the [GitLab grafana dashboard](https://grafana.torproject.org/d/QrDJktiMz/gitlab-omnibus),
particularly in [this view](https://grafana.torproject.org/d/QrDJktiMz/gitlab-omnibus?orgId=1&refresh=1m&var-node=gitlab-02.torproject.org).
Other metrics might become available in the future: for example,
runners can export their own Prometheus metrics, but currently do
not. They are, naturally, monitored through the `node-exporter` like
all other TPO servers, however.
We may eventually monitor GitLab runners directly; they can be
configured to expose metrics through a Prometheus exporter. The Puppet
module supports this through the `gitlab_ci_runner::metrics_server`
variable, but we would need to hook it into our server as well. See
also [the upstream documentation](https://docs.gitlab.com/runner/monitoring/README.html). Right now it feels the existing
"node"-level and the GitLab-level monitoring in Prometheus is
sufficient.
Runners are, naturally, monitored through the `node-exporter` like
all other TPO servers.
Runners also expose metrics through a built-in Prometheus exporter on
a predefined port, accessible only by the Prometheus server. The
Puppet module supports this through the
`gitlab_ci_runner::metrics_server` variable, but we use our own
setup for now; see [issue 41042](https://gitlab.torproject.org/tpo/tpa/team/-/issues/41042). See also [the upstream
documentation](https://docs.gitlab.com/runner/monitoring/README.html) about self-monitoring.
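
As a rough illustration of what the runner exporter returns, the sketch
below parses the Prometheus text exposition format and computes how
many job slots are in use. It is not part of our deployment. The
`METRICS_URL` (host and port), the sample payload, and the runner label
value are assumptions made for the example, and the exact metric names
exported depend on the runner version, so check the output of a real
runner before relying on them.

```python
import urllib.request

# Hypothetical endpoint: the real host and port depend on how the
# runner's metrics listener is configured.
METRICS_URL = "http://localhost:9252/metrics"

# Illustrative payload in the Prometheus text exposition format; the
# values and the runner label are made up for this example.
SAMPLE = """\
# HELP gitlab_runner_concurrent The current value of concurrent setting
# TYPE gitlab_runner_concurrent gauge
gitlab_runner_concurrent 4
# TYPE gitlab_runner_jobs gauge
gitlab_runner_jobs{runner="abc123",state="running"} 3
"""


def fetch_metrics(url=METRICS_URL, timeout=5):
    """Fetch the raw metrics text from an exporter endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_metrics(text):
    """Map each sample (name plus labels) to its float value.

    Assumes samples carry no trailing timestamp, so the value is the
    last space-separated field on the line.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


def metric_total(samples, metric):
    """Sum all samples of one metric across its label sets."""
    return sum(
        value
        for name, value in samples.items()
        if name.split("{", 1)[0] == metric
    )


if __name__ == "__main__":
    samples = parse_metrics(SAMPLE)  # or parse_metrics(fetch_metrics())
    running = metric_total(samples, "gitlab_runner_jobs")
    limit = metric_total(samples, "gitlab_runner_concurrent")
    print(f"{running:.0f} of {limit:.0f} job slots in use")
```

A runner that stays at its concurrency limit while the queue grows is
the kind of signal the dashboards are meant to surface.
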
CI metrics are aggregated in the [GitLab CI Overview Grafana
dashboard](https://grafana.torproject.org/d/fd0b2fb2-88d0-4f85-bc86-16164c083b51/gitlab-ci-overview?orgId=1).
## Backups
......