diff --git a/howto/gitlab.md b/howto/gitlab.md
index 488749637d3b975844d1ffa2b37a54c1454e9e22..917bf88f609482782d85b50a99e94cf160afd3cf 100644
--- a/howto/gitlab.md
+++ b/howto/gitlab.md
@@ -110,6 +110,38 @@ which look like a styled, curly, and closing quotation mark `â€`.
 All CI documentation resides in a different document see [service/ci](service/ci).
 
+## Container registry operations
+
+### Logging in
+
+To upload content to the registry, you first need to log in. This can
+be done with the `login` command:
+
+    podman login
+
+This will ask you for your GitLab username and a password, for which
+you should use a [personal access token](https://gitlab.torproject.org/-/profile/personal_access_tokens).
+
+### Uploading an image
+
+Assuming you already have an image built (below we have it labeled
+with `containers.torproject.org/anarcat/test/airsonic-test`), you can
+upload it with:
+
+    podman push containers.torproject.org/anarcat/test/airsonic-test containers.torproject.org/anarcat/test
+
+Notice the two arguments: the first is the label of the image to
+upload and the second is *where* to upload it, or "destination". The
+destination is made of two parts, the first component is the host name
+of the container registry (in our case `containers.torproject.org`)
+and the second part is the path to the project to upload into (in our
+case [`anarcat/test`](https://gitlab.torproject.org/anarcat/test)).
+
+The uploaded container image should appear under Deploy -> Container
+Registry in your project. In the above case, it is in:
+
+<https://gitlab.torproject.org/anarcat/test/container_registry/4>
+
 ## Email interactions
 
 You can interact with GitLab by email too.
@@ -615,6 +647,10 @@ To limit this to `job.log`, of course, you can do:
 
     find -name "job.log" -mtime +14 -print0 | du --files0-from=- -c -h | tee find-mtime+14-joblog-du.log
 
+If we run out of space on the object storage because of the GitLab
+registry, consider [purging untagged manifests](https://docs.gitlab.com/ee/administration/packages/container_registry.html#removing-untagged-manifests-and-unreferenced-layers) by tweaking the
+cron job defined in `profile::gitlab::app` in Puppet.
+
 ### Incoming email routing
 
 Incoming email get routed through either eugeni or the submission
@@ -697,6 +733,108 @@ Delivered mail 641c797273ba1_86be948d03829@gitlab-02.mail (7.2ms)
 
 [issue 139]: https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/139
 
+### GitLab registry troubleshooting
+
+If something goes wrong with the GitLab Registry feature, you should first
+look at the logs in:
+
+    tail -f /var/log/gitlab/registry/current /var/log/gitlab/nginx/gitlab_registry_*.log /var/log/gitlab/gitlab-rails/production.log
+
+The first one might be the one with more relevant information, but is
+the hardest to parse, as it's this weird "date {JSONBLOB}" format that
+no human or machine can parse.
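The sample registry lines in this section look like a leading timestamp followed by `key=value` pairs, some of them quoted, rather than JSON. A minimal Python sketch for pulling the fields out, assuming that layout holds for every line; `parse_registry_line` and `errors_only` are hypothetical helper names, not part of GitLab or the registry:

```python
import re

# One key=value pair; the value is either a double-quoted string (with
# backslash escapes) or a run of non-space characters, possibly empty
# (as in `config_http_host=`).
PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S*)')


def parse_registry_line(line):
    """Split one log line into (leading timestamp, dict of fields)."""
    stamp, _, rest = line.strip().partition(" ")
    fields = {}
    for key, value in PAIR.findall(rest):
        # Drop the surrounding quotes, leave escapes such as \n untouched.
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        fields[key] = value
    return stamp, fields


def errors_only(lines):
    """Yield (timestamp, fields) for lines logged at error or fatal level."""
    for line in lines:
        stamp, fields = parse_registry_line(line)
        if fields.get("level") in ("error", "fatal"):
            yield stamp, fields
```

Feeding it the lines of `/var/log/gitlab/registry/current` and printing `fields.get("detail")` for each hit would, under the same assumption, surface messages like `SignatureDoesNotMatch` without the surrounding noise.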
+
+You can restart *just* the registry with:
+
+    gitlab-ctl restart registry
+
+A misconfiguration of the object storage backend will look like this
+when uploading a container:
+
+    Error: trying to reuse blob sha256:61581d479298c795fa3cfe95419a5cec510085ec0d040306f69e491a598e7707 at destination: pinging container registry containers.torproject.org: invalid status code from registry 503 (Service Unavailable)
+
+The registry logs might have something like this:
+
+```
+2023-07-18_21:45:26.21751 time="2023-07-18T21:45:26.217Z" level=info msg="router info" config_http_addr="127.0.0.1:5000" config_http_host= config_http_net= config_http_prefix= config_http_relative_urls=true correlation_id=01H5NFE6E94A566P4EZG2ZMFMT go_version=go1.19.8 method=HEAD path="/v2/anarcat/test/blobs/sha256:61581d479298c795fa3cfe95419a5cec510085ec0d040306f69e491a598e7707" root_repo=anarcat router=gorilla/mux vars_digest="sha256:61581d479298c795fa3cfe95419a5cec510085ec0d040306f69e491a598e7707" vars_name=anarcat/test version=v3.76.0-gitlab
+2023-07-18_21:45:26.21774 time="2023-07-18T21:45:26.217Z" level=info msg="authorized request" auth_project_paths="[anarcat/test]" auth_user_name=anarcat auth_user_type=personal_access_token correlation_id=01H5NFE6E94A566P4EZG2ZMFMT go_version=go1.19.8 root_repo=anarcat vars_digest="sha256:61581d479298c795fa3cfe95419a5cec510085ec0d040306f69e491a598e7707" vars_name=anarcat/test version=v3.76.0-gitlab
+2023-07-18_21:45:26.30401 time="2023-07-18T21:45:26.303Z" level=error msg="unknown error" auth_project_paths="[anarcat/test]" auth_user_name=anarcat auth_user_type=personal_access_token code=UNKNOWN correlation_id=01H5NFE6CZBE49BZ6KBK4EHSJ1 detail="SignatureDoesNotMatch: The request signature we calculated does not match the signature you provided. Check your key and signing method.\n\tstatus code: 403, request id: 17731468F69A0F79, host id: dd9025bab4ad464b049177c95eb6ebf374d3b3fd1af9251148b658df7ac2e3e8" error="unknown: unknown error" go_version=go1.19.8 host=containers.torproject.org method=HEAD remote_addr=64.18.183.94 root_repo=anarcat uri="/v2/anarcat/test/blobs/sha256:a55f9a4279c12800590169f7782b956e5c06ec88ec99c020dd111a7a1dcc7eac" user_agent="containers/5.23.1 (github.com/containers/image)" vars_digest="sha256:a55f9
+```
+
+If you suspect the object storage backend to be the problem, you
+should try to communicate with the MinIO server by configuring the
+`rclone` client on the GitLab server and trying to manipulate the
+server. Look for the access token in `/etc/gitlab/gitlab.rb` and use
+it to configure `rclone` like this:
+
+    rclone config create minio s3 provider Minio endpoint https://minio.torproject.org:9000/ region dallas access_key_id gitlab-registry secret_access_key REDACTED
+
+Then you can list the registry bucket:
+
+    rclone ls minio:gitlab-registry/
+
+See how to [Use rclone as an object storage client](service/minio#use-rclone-as-an-object-storage-client) for more ideas.
+
+This may reproduce the same error seen in the registry logs:
+
+    SignatureDoesNotMatch: The request signature we calculated does not match the signature you provided. Check your key and signing method.
+
+That is either due to an incorrect access key or bucket. An error that
+was made during the original setup was to treat `gitlab/registry` as a
+bucket, while it's a subdirectory... This was fixed by switching to
+`gitlab-registry` as a bucket name. Another error we had was to use
+`endpoint` instead of `regionendpoint`.
+
+Another tweak that was done was to set a region in MinIO. Before the
+right region was set and matching in the configuration, we had this
+error in the registry logs:
+
+    2023-07-18_21:04:57.46099 time="2023-07-18T21:04:57.460Z" level=fatal msg="configuring application: 1 error occurred:\n\t* validating region provided: dallas\n\n"
+
+As a last resort, you can revert to the [filesystem storage](https://docs.gitlab.com/ee/administration/packages/container_registry.html#use-file-system)
+by commenting out the `storage => { ... 's3' ... }` block in
+`profile::gitlab::app` and adding a line in the `gitlab_rails` blob
+like:
+
+    registry_path => '/var/opt/gitlab/gitlab-rails/shared/registry',
+
+Note that this is a risky operation, as you might end up with a "split
+brain" where some images are on the filesystem, and some on object
+storage. Warning users with a maintenance announcement on the GitLab
+site might be wise.
+
+In the same section, you can [disable the registry by default](https://docs.gitlab.com/ee/administration/packages/container_registry.html#disable-container-registry-for-new-projects-site-wide) on
+all projects with:
+
+    gitlab_default_projects_features_container_registry => false,
+
+... or [disable it site-wide](https://docs.gitlab.com/ee/administration/packages/container_registry.html#disable-container-registry-site-wide) with:
+
+    registry => {
+      enable => false
+      # [...]
+    }
+
+Note that the `registry` configuration is stored inside the Docker
+Registry `config.yml` file as a single line that looks like JSON. You
+*may* think it's garbled and the reason why things don't work, but it
+isn't, that is valid YAML, just harder to parse. Blame `gitlab-ctl`'s
+Chef cookbook on that... A non-mangled version of the working config
+would look like:
+
+```
+storage:
+  s3:
+    accesskey: gitlab-registry
+    secretkey: REDACTED
+    region: dallas
+    regionendpoint: https://minio.torproject.org:9000/
+    bucket: gitlab-registry
+```
+
+Another option that was explored while setting up the registry is
+enabling the [debug server](https://docs.gitlab.com/ee/administration/packages/container_registry.html#enable-the-registry-debug-server).
+
 ## Disaster recovery
 
 In case the entire GitLab machine is destroyed, a new server should be
@@ -1058,6 +1196,66 @@ Puppet class. The following GitLab settings were added:
 The virtual host for the `pages.torproject.net` domain was configured through the `profile::gitlab::web` class.
 
+### GitLab registry
+
+The GitLab registry was set up first by deploying an object storage
+server (see [minio](service/minio)). An access key was created with:
+
+    mc admin user svcacct add admin gitlab --access-key gitlab-registry
+
+... and the secret key stored in Trocla.
+
+Then the config was injected in the `profile::gitlab::app` class,
+mostly inline. The registry itself is configured through the
+`profile::gitlab::registry` class, so that it could possibly be moved
+onto its own host.
+
+That configuration was filled with many perils, partly documented in
+[tpo/tpa/gitlab#89](https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/89). One challenge was to get everything working at
+once. The software itself is the [Docker Registry](https://docs.docker.com/registry/) [shipped with
+GitLab Omnibus](https://docs.gitlab.com/ee/administration/packages/container_registry.html), and it's configured through Puppet, which passes
+the value to the `/etc/gitlab/gitlab.rb` file which *then* writes the
+final configuration into `/var/opt/gitlab/registry/config.yml`.
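To check what actually ended up in that final file, a small helper can re-indent it for reading. This is a minimal sketch, assuming the rendered file is JSON-compatible (the troubleshooting section describes it as a single line that looks like JSON); if it is not, a YAML parser would be needed instead. The function name `show_storage` is hypothetical:

```python
import json


def show_storage(config_text):
    """Return the `storage` section re-indented, with any secret key masked."""
    # Assumption: the single-line config parses as JSON.
    config = json.loads(config_text)
    storage = config.get("storage", {})
    for driver in storage.values():
        # Mask credentials so the output is safe to paste into a ticket.
        if isinstance(driver, dict) and "secretkey" in driver:
            driver["secretkey"] = "REDACTED"
    return json.dumps(storage, indent=2, sort_keys=True)
```

Passing it the contents of `/var/opt/gitlab/registry/config.yml` makes it easier to compare the `regionendpoint`, `region` and `bucket` values against what Puppet was meant to write.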
+
+We take the [separate bucket](https://docs.gitlab.com/ee/administration/object_storage.html#use-separate-buckets) approach in that each service using
+object storage has its own bucket assigned. This required a special
+policy to be applied to the `gitlab` MinIO user:
+
+    {
+      "Version": "2012-10-17",
+      "Statement": [
+        {
+          "Action": [
+            "s3:*"
+          ],
+          "Effect": "Allow",
+          "Resource": [
+            "arn:aws:s3:::gitlab*"
+          ],
+          "Sid": "BucketAccessForUser"
+        }
+      ]
+    }
+
+That is the policy called `gitlab-star-bucket-policy` which grants
+access to all buckets prefixed with `gitlab` (as opposed to only the
+`gitlab` bucket itself).
+
+It might be possible to manage the Docker registry software and
+configuration directly from Puppet, with the Debian package, but that
+configuration is actually [deprecated since 15.8 and unsupported in
+GitLab 16](https://docs.gitlab.com/ee/administration/packages/container_registry.html#use-an-external-container-registry-with-gitlab-as-an-auth-endpoint). I explained our rationale on why this could be
+interesting in the [relevant upstream issue](https://gitlab.com/gitlab-org/container-registry/-/issues/958#note_1476978502).
+
+We have created a `registry` user on the host because that's what
+GitLab expects, but it might be possible to use a different, less
+generic username by following [this guide](https://docs.gitlab.com/omnibus/settings/configuration.html#specify-numeric-user-and-group-identifiers).
+
+A cron job runs every Saturday to clean up unreferenced
+layers. [Untagged manifests](https://docs.gitlab.com/ee/administration/packages/container_registry.html#removing-untagged-manifests-and-unreferenced-layers) are *not* purged even if invisible, as
+we feel purging them could result in needless double-uploads. If we do
+run out of disk space on images, that is a policy we could implement.
+
 ## SLA
 
 <!-- this describes an acceptable level of service for this service -->