Verified Commit f2d9e8eb authored by anarcat

document gitlab CI monitoring better

parent c7de797f
......@@ -701,15 +701,28 @@ runner](https://gitlab.com/gitlab-org/gitlab-runner/-/issues) and a [project pag
by tracking queue size and with runner-specific metrics like
concurrency limit hits
## Monitoring and testing
## Monitoring and metrics
CI metrics are aggregated in the [GitLab CI Overview Grafana
dashboard](https://grafana.torproject.org/d/fd0b2fb2-88d0-4f85-bc86-16164c083b51/gitlab-ci-overview?orgId=1). It features multiple exporter sources:
1. the GitLab rails exporter which gives us the queue size
2. the GitLab runner exporters, which show how many jobs are running in
parallel (see [the upstream documentation](https://docs.gitlab.com/runner/monitoring/README.html))
3. a homemade exporter that queries the GitLab database to extract
queue wait times
4. and finally the node exporter to show memory usage, load and disk
usage
Note that not all runners registered on GitLab are directly managed by
TPA, so they might not show up in our dashboards.
## Tests
To test a runner, register it with a single project only, so that only
non-critical jobs run against it. See the [installation section](#Installation) for
details on the setup.
Monitoring is otherwise done through Prometheus, on an as-needed basis,
see the [logs and metrics](#logs-and-metrics) section below.
## Logs and metrics
GitLab runners send logs to `syslog` and `systemd`. They contain minimal
......@@ -718,22 +731,6 @@ Docker image URLs, which do contain usernames. Those end up in
`/var/log/daemon.log`, which gets rotated daily, with a one-week
retention.
The GitLab instance exports a set of metrics to monitor CI. For
example, `ci_pending_builds` shows the size of the queue,
`ci_running_builds` shows the number of currently running builds,
etc. Those are visible in the [GitLab Grafana dashboard](https://grafana.torproject.org/d/QrDJktiMz/gitlab-omnibus),
particularly in [this view](https://grafana.torproject.org/d/QrDJktiMz/gitlab-omnibus?orgId=1&refresh=1m&var-node=gitlab-02.torproject.org).
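As a rough illustration, metrics such as `ci_pending_builds` can also be read outside of Grafana through the standard Prometheus HTTP API (`/api/v1/query`). The sketch below is only an assumption about how one might do that: the server URL `prometheus.example.org`, the helper names, and the sample response are all hypothetical and not taken from the TPA setup.

```python
import json
from urllib.parse import urlencode

# Hypothetical Prometheus base URL (assumption); replace with the real server.
PROMETHEUS_URL = "https://prometheus.example.org"


def build_query_url(metric, base_url=PROMETHEUS_URL):
    """Return the instant-query URL for a metric on the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?{urlencode({'query': metric})}"


def sum_samples(payload):
    """Sum the values of all series in an instant-query JSON response."""
    data = json.loads(payload)
    if data.get("status") != "success":
        raise ValueError("Prometheus query failed")
    # Each series carries a [timestamp, "value"] pair; the value is a string.
    return sum(float(series["value"][1]) for series in data["data"]["result"])


# Made-up response, for illustration only; labels and values are hypothetical.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "ci_pending_builds", "example": "a"},
             "value": [1700000000, "4"]},
            {"metric": {"__name__": "ci_pending_builds", "example": "b"},
             "value": [1700000000, "3"]},
        ],
    },
})

if __name__ == "__main__":
    print(build_query_url("ci_pending_builds"))
    print(sum_samples(SAMPLE_RESPONSE))
```

Against a live server, the body returned by `urllib.request.urlopen(build_query_url("ci_pending_builds"))` could be passed to `sum_samples()` in place of the sample payload, assuming the caller is allowed to reach the Prometheus server.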
Runners are, naturally, monitored through the `node-exporter` like
all other TPO servers.
Runners also expose metrics through a built-in Prometheus exporter on
a predefined port, accessible only by the Prometheus server. See also
[the upstream documentation](https://docs.gitlab.com/runner/monitoring/README.html) about self-monitoring.
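The runner exporter serves its metrics in the Prometheus text exposition format. The following minimal sketch shows how such output could be parsed by hand when debugging; the metric names and values in the sample are hypothetical placeholders, not actual runner metrics, and the parser assumes lines carry no trailing timestamp.

```python
def parse_metrics(text):
    """Parse Prometheus text exposition lines into a {series: value} dict.

    Simplified on purpose: comment lines are skipped and each sample line
    is assumed to end with its value (no optional timestamp).
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated token on the line.
        name, _, value = line.rpartition(" ")
        samples[name] = float(value)
    return samples


# Made-up exporter output, for illustration only (hypothetical metric names).
SAMPLE = """\
# HELP example_concurrent_limit Hypothetical concurrency limit
# TYPE example_concurrent_limit gauge
example_concurrent_limit 4
# TYPE example_jobs_running gauge
example_jobs_running{state="running"} 3
"""

if __name__ == "__main__":
    for series, value in parse_metrics(SAMPLE).items():
        print(series, value)
```

Since the exporter port is only reachable from the Prometheus server, a check like this would have to run from there or against a saved copy of the output.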
CI metrics are aggregated in the [GitLab CI Overview Grafana
dashboard](https://grafana.torproject.org/d/fd0b2fb2-88d0-4f85-bc86-16164c083b51/gitlab-ci-overview?orgId=1).
## Backups
This service requires no backups: all configuration should be
......