add tons more ci docs, drop references to registry (authored by anarcat)
......@@ -25,10 +25,8 @@ documents frequent questions we might get about the work.
The [GitLab CI quickstart][] should get you started here. Note that
there are some "shared runners" you can already use, and which should
be available to all projects.
TODO: do runners have time limits? should we document how to enable
the shared runners in a project?
be available to all projects. So your main task here is basically to
[write a `.gitlab-ci.yml` file](https://docs.gitlab.com/ee/ci/quick_start/README.html#create-a-gitlab-ciyml-file).
# How-to
......@@ -39,16 +37,27 @@ the shared runners in a project?
There might be too many jobs in the queue. You can monitor the queue
in our [Grafana dashboard](https://grafana.torproject.org/d/QrDJktiMz/gitlab-omnibus).
## Building docker images
TODO: document how to build docker images from GitLab CI. Maybe with
podman or buildah? see below.
## Enabling/disabling runners
If a runner is misbehaving, it might be worth "pausing" it while we
investigate, so that jobs don't all fail on that runner. For this,
head for the [runner admin interface](https://gitlab.torproject.org/admin/runners) and hit the "pause" button on
the runner.
## Registering more runners
## Image security
Anyone can run their own personal runner in their own infrastructure
and register it inside a project on our GitLab instance. For this
you need to first [install a runner](https://docs.gitlab.com/runner/install/) and [register it in
GitLab](https://docs.gitlab.com/runner/register/). But we already have shared runners; if they are not
sufficient, it might be best to request a new one from TPA.
TODO: document how to create and use more secure Docker images. For
example, most images run as root: try to make images run as a regular
user.
## Converting a Jenkins job
Upstream has [generic documentation on how to migrate from Jenkins](https://docs.gitlab.com/ee/ci/migration/jenkins.html)
which could be useful for us. We have yet to write a more complete
guide on how to migrate jobs to GitLab CI.
## Pager playbook
......@@ -93,8 +102,7 @@ cluster, using this command:
ci-runner-01.torproject.org
The `profile::gitlab_runner` Puppet class deploys the GitLab runner
code and hooks it into GitLab. It uses the
[gitlab_ci_runner](https://forge.puppet.com/modules/puppet/gitlab_ci_runner)
code and hooks it into GitLab. It uses the [gitlab_ci_runner](https://forge.puppet.com/modules/puppet/gitlab_ci_runner)
module from Voxpupuli to avoid reinventing the wheel. But before
enabling it on the instance, the following operations need to be
performed:
......@@ -175,15 +183,121 @@ not be fully available.
## Design
TODO: expand on GitLab CI's design and architecture, following [this
checklist](https://bluesock.org/~willkg/blog/dev/auditing_projects.html). See also the [Jenkins section](#jenkins) below for the same
thing about Jenkins.
The CI service is currently provided by [Jenkins][], but we are
looking at replacing this with GitLab CI in the [2021
roadmap](roadmap/2021). This section therefore mostly documents how the new
GitLab CI service is built. See the [Jenkins section](#jenkins) below for more
information about the old Jenkins service.
### GitLab CI architecture
GitLab CI sits somewhat outside of the main GitLab architecture, in
that it is not featured prominently in the [GitLab architecture
documentation](https://docs.gitlab.com/ee/development/architecture.html). In practice, it is a core component of GitLab in
that the continuous integration and deployment features of GitLab have
become a key feature and selling point for the project.
GitLab CI works by scheduling "pipelines" which are made of one or
many "jobs", defined in a project's git repository (the
[`.gitlab-ci.yml`](https://docs.gitlab.com/ee/ci/yaml/) file). Those jobs then get picked up by one of
many "runners". Those runners are separate processes, usually running
on a different host than the main GitLab server.
They regularly poll the central GitLab for jobs and execute those
inside an "[executor](https://docs.gitlab.com/runner/executors/README.html)". We currently support only "Docker" as an
executor but are working on different ones, like a custom "podman"
(for more trusted runners, see below) or KVM executor (for foreign
platforms like macOS or Windows).
In practice, the runner does the following:
1. it fetches the git repository of the project
2. it runs a sequence of shell commands on the project inside the
executor (e.g. inside a Docker container) with [specific
environment variables](https://docs.gitlab.com/ee/ci/variables/README.html#gitlab-cicd-environment-variables) populated from the project's settings
3. it collects artifacts and logs and uploads those back to the main
GitLab server
The jobs are therefore affected by the `.gitlab-ci.yml` file but also
the configuration of each project. It's a simple yet powerful design.
### Types of runners
There are three types of runners:
* **shared**: "shared" across all projects, they will pick up any
job from any project
* **group**: those are restricted to run jobs only within a
specific group
* **project**: those will only run jobs within a specific project
In addition, jobs can be targeted at specific runners by assigning
them a "tag".
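
As a rough illustration of the three runner types above, here is a minimal Python sketch of the scoping rule. It is not GitLab's actual implementation: the `Runner` class, the `in_scope` helper and every project path other than `tpo/tpa/gitlab` are hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Runner:
    """Hypothetical model of a runner, for illustration only."""
    kind: str        # "shared", "group" or "project"
    scope: str = ""  # group path or project path; unused for shared runners


def in_scope(runner: Runner, project_path: str) -> bool:
    """Return True if the runner may serve jobs from project_path."""
    if runner.kind == "shared":
        # Shared runners pick up jobs from any project.
        return True
    if runner.kind == "group":
        # Group runners serve only projects located below their group.
        return project_path.startswith(runner.scope.rstrip("/") + "/")
    if runner.kind == "project":
        # Project runners serve exactly one project.
        return project_path == runner.scope
    raise ValueError(f"unknown runner kind: {runner.kind!r}")
```

For example, `in_scope(Runner("group", "tpo/tpa"), "tpo/tpa/gitlab")` is true, while the same runner is out of scope for a project in another group.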
### Runner tags
Whether a runner will pick a job depends on a few things:
* if it is a "shared", "group" or "project" runner (above)
* if it has a tag matching the [`tags` field in the configuration](https://docs.gitlab.com/ee/ci/yaml/#tags)
We currently use the following tags:
Some things to look into:
* **architecture**: `amd64`, for example, runs on the normal 64-bit
Intel/AMD architecture; new tags like this may be introduced when
other architectures are supported
* **OS**: `linux` is usually implicit but other tags might eventually
be added for other operating systems
* **executor** type: `docker`, `KVM`, etc. `docker` runners are the
typical ones; `KVM` runners are possibly more powerful and can, for
example, run Docker-inside-Docker (DinD)
* **memory** size: `64GB`, `32GB`, `4GB`, etc.
* `privileged`: those containers have actual root access and should
explicitly be able to run `DinD`
* `interactive web terminal`: supports [interactively debugging
jobs](https://docs.gitlab.com/ee/ci/interactive_web_terminal/)
* `fdroid`: provided as a courtesy by the [F-Droid project](https://f-droid.org/)
Use tags in your configuration only if your job can be fulfilled by
only some of those runners. For example, only specify a memory tag if
your job requires a lot of memory.
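
The tag matching described in this section can be sketched in Python as follows. This is a simplified model, not the GitLab source code: the `tags_match` function is a hypothetical helper, and the `run_untagged` parameter is an assumption standing in for the runner's "run untagged jobs" setting. The idea is that a job is eligible only if every tag it requests is present on the runner.

```python
from __future__ import annotations


def tags_match(job_tags: set[str], runner_tags: set[str],
               run_untagged: bool = True) -> bool:
    """Return True if a runner with runner_tags may pick a job with job_tags."""
    if not job_tags:
        # Jobs without tags are only picked up by runners that accept them.
        return run_untagged
    # Every tag requested by the job must be offered by the runner.
    return job_tags <= runner_tags
```

With a runner tagged `amd64`, `linux` and `docker`, a job asking only for `amd64` matches, while a job asking for `amd64` and `privileged` does not. This is why over-specifying tags shrinks the pool of runners able to take the job.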
### Upstream release schedules
GitLab CI is an integral part of GitLab itself and gets released along
with the core releases. GitLab runner is a [separate software
project](https://gitlab.com/gitlab-org/gitlab-runner) but usually gets released alongside GitLab.
### Security
TODO: Some things to look into:
* https://docs.gitlab.com/ee/user/project/new_ci_build_permissions_model.html
* https://docs.gitlab.com/runner/security/
We do not currently trust GitLab runners for security purposes: at
most we trust them to correctly report errors in test suites, but we do
not trust them with compiling and publishing artifacts, so they have a
low value in our trust chain. This might eventually change.
### Image, volume and container storage and caching
GitLab runner creates quite a few containers, volumes and images in
the course of its regular work. Those tend to pile up, unless they get
cleaned. [Upstream suggests](https://docs.gitlab.com/runner/executors/docker.html#clearing-docker-cache) a [fairly naive shell script](https://gitlab.com/gitlab-org/gitlab-runner/blob/master/packaging/root/usr/share/gitlab-runner/clear-docker-cache) to do
this cleanup, but it has a number of issues:
1. it is noisy ([patched locally with this MR](https://gitlab.com/gitlab-org/gitlab-runner/-/merge_requests/2711))
2. it might be too aggressive
So we only run it weekly, and instead run a more "gentle" `docker
system prune` command to clean up orphaned resources after 3 days.
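
As a sketch of what the gentler cleanup could look like, the following Python helper builds a `docker system prune` command line limited to resources older than a given age. The exact flags used in our deployment are not documented here, so the `--filter until=...` expression and the `prune_command` helper are assumptions made for illustration.

```python
from __future__ import annotations


def prune_command(max_age_days: int) -> list[str]:
    """Build a docker command that prunes unused resources older than max_age_days."""
    hours = max_age_days * 24
    # --force skips the confirmation prompt, which is needed for unattended runs;
    # the "until" filter restricts pruning to resources older than the given age.
    return ["docker", "system", "prune", "--force",
            "--filter", f"until={hours}h"]


# Show the command for the 3-day retention mentioned above (assumed mapping).
print(" ".join(prune_command(3)))
```

The resulting list could be passed to `subprocess.run(..., check=True)` from a scheduled job; building the command separately keeps the retention period easy to review and test.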
Also note that documentation on this inside GitLab runner is
inconsistent at best, see [this other MR](https://gitlab.com/gitlab-org/gitlab-runner/-/merge_requests/2711) and [this issue](https://gitlab.com/gitlab-org/gitlab-runner-docker-cleanup/-/issues/21).
### Rootless containers
TODO: consider podman for running containers more securely, and
possibly also to build container images inside GitLab CI, which would
otherwise require docker-in-docker (DinD), unsupported by
......@@ -193,9 +307,33 @@ upstream. some ideas here:
* https://github.com/containers/podman/issues/7982
* https://github.com/jonasbb/podman-gitlab-runner
### Current services
GitLab CI, at TPO, currently runs the following services:
* continuous integration: mostly testing after commit
This is currently used by many teams and is quickly becoming a
critical service.
### Possible services
It could eventually also run those services:
* web page hosting through GitLab pages or the existing static site
system. This is a requirement to replace Jenkins
* continuous deployment: applications and services could be deployed
directly from GitLab CI/CD, for example through a Kubernetes
cluster or just with plain Docker
* artifact publication: tarballs, binaries and Docker images could be
built by GitLab runners and published on the GitLab server (or
elsewhere). This is a requirement to replace Jenkins
## Issues
[File][] or [search][] for issues in the [GitLab issue tracker][search].
[File][] or [search][] for issues in our [GitLab issue
tracker][search]. Upstream also has an [issue tracker for GitLab
runner](https://gitlab.com/gitlab-org/gitlab-runner/-/issues) and a [project page](https://gitlab.com/gitlab-org/gitlab-runner).
[File]: https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/new
[search]: https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues