service/ci.md +16 −12

@@ -694,6 +694,10 @@ runner](https://gitlab.com/gitlab-org/gitlab-runner/-/issues) and a [project pag
* [kept artifacts cannot be unkept](https://gitlab.com/gitlab-org/gitlab/-/issues/289954)
* GitLab doesn't track [wait times for jobs](https://gitlab.com/groups/gitlab-org/-/epics/10630), we approximate this by tracking queue size and with runner-specific metrics like concurrency limit hits

## Monitoring and testing

To test a runner, it can be registered only with a project, to run

@@ -717,18 +721,18 @@ example, `ci_pending_builds` shows the size of the queue, etc.
Those are visible in the [GitLab grafana dashboard](https://grafana.torproject.org/d/QrDJktiMz/gitlab-omnibus), particularly in [this view](https://grafana.torproject.org/d/QrDJktiMz/gitlab-omnibus?orgId=1&refresh=1m&var-node=gitlab-02.torproject.org).

Other metrics might become available in the future: for example, runners can export their own Prometheus metrics, but currently do not. They are, naturally, monitored through the `node-exporter` like all other TPO servers, however.

We may eventually monitor GitLab runners directly; they can be configured to expose metrics through a Prometheus exporter. The Puppet module supports this through the `gitlab_ci_runner::metrics_server` variable, but we would need to hook it into our server as well. See also [the upstream documentation](https://docs.gitlab.com/runner/monitoring/README.html).

Right now it feels the existing "node"-level and the GitLab-level monitoring in Prometheus is sufficient.

Runners are, naturally, monitored through the `node-exporter` like all other TPO servers.

Runners also expose metrics through a built-in Prometheus exporter on a predefined port, accessible only by the Prometheus server. The Puppet module supports this through the `gitlab_ci_runner::metrics_server` variable, but we have rolled our own thing for now, see [issue 41042](https://gitlab.torproject.org/tpo/tpa/team/-/issues/41042). See also [the upstream documentation](https://docs.gitlab.com/runner/monitoring/README.html) about self-monitoring.

CI metrics are aggregated in the [GitLab CI Overview Grafana dashboard](https://grafana.torproject.org/d/fd0b2fb2-88d0-4f85-bc86-16164c083b51/gitlab-ci-overview?orgId=1).

## Backups
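The monitoring hunk in this diff uses `ci_pending_builds` as a proxy for queue size. As a rough illustration only, the Python sketch below sums every sample of one metric from a block of Prometheus text-format output. The function name `sum_metric`, the sample text, and the `shared_runners` label are hypothetical and not taken from the actual GitLab or runner output. How the text is fetched is deliberately left out, since the exporter is described as reachable only by the Prometheus server.

```python
import re

# One sample line in the Prometheus text format:
#   metric_name{label="value",...} value [timestamp]
SAMPLE_RE = re.compile(
    r'^([A-Za-z_:][A-Za-z0-9_:]*)(\{.*\})?\s+(\S+)(?:\s+-?\d+)?$'
)


def sum_metric(exposition_text: str, metric_name: str) -> float:
    """Sum all samples of `metric_name`, across every label combination."""
    total = 0.0
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        if match and match.group(1) == metric_name:
            total += float(match.group(3))
    return total


# Hypothetical sample output; the label name is an assumption for illustration.
SAMPLE = """\
# HELP ci_pending_builds Illustrative help text
# TYPE ci_pending_builds gauge
ci_pending_builds{shared_runners="yes"} 3
ci_pending_builds{shared_runners="no"} 2
other_metric 7
"""

if __name__ == "__main__":
    print(sum_metric(SAMPLE, "ci_pending_builds"))
```

In practice the same aggregation would more likely be a Prometheus query behind the Grafana dashboard; the sketch only shows what "tracking queue size" amounts to on the raw metric text.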