[GitLab](https://gitlab.com/) is *a web-based DevOps lifecycle tool that provides a
Git-repository manager providing wiki, issue-tracking and continuous
integration/continuous deployment pipeline features, using an
open-source license, developed by GitLab Inc* ([Wikipedia](https://en.wikipedia.org/wiki/GitLab)). Tor
uses GitLab mainly for issue tracking, wiki hosting and code review
for now, at <https://gitlab.torproject.org>, after migrating from
[howto/trac](howto/trac).

Note that continuous integration is documented separately, in [the CI page](service/ci).

[[_TOC_]]

# Tutorial

<!-- simple, brainless step-by-step instructions requiring little or -->
<!-- no technical background -->

## How to get an account?
You might already *have* an account! If you were active on Trac, your
account was migrated with the same username and email address as Trac,
unless you have an LDAP account, in which case that was used. So head
over to the [password reset page](https://gitlab.torproject.org/users/password/new) to get access to your account.

If your account was *not* migrated, send a mail to
<gitlab-admin@torproject.org> to request a new one.

If you did not have an account in Trac and want a new account, you
should request a new one at <https://gitlab.onionize.space/>.
## How to report an issue in Tor software?
You first need to figure out which project the issue resides in. The
[project list][] is a good place to get started. Here are a few quick
links for popular projects:
[project list]: https://gitlab.torproject.org/tpo
 * [core tor](https://gitlab.torproject.org/tpo/core/tor): [issues](https://gitlab.torproject.org/tpo/core/tor/-/issues), [new issue](https://gitlab.torproject.org/tpo/core/tor/-/issues/new)
 * [Tor Browser](https://gitlab.torproject.org/tpo/applications/tor-browser): [issues](https://gitlab.torproject.org/tpo/applications/tor-browser/-/issues), [new issue](https://gitlab.torproject.org/tpo/applications/tor-browser/-/issues/new)
 * [gitlab](https://gitlab.torproject.org/tpo/tpa/gitlab): [issues](https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues), [new issue](https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/new)

If you do not have a GitLab account or can't figure it out for any
reason, you can also use the mailing lists. The
<tor-dev@lists.torproject.org> mailing list is the best for now.

## How to report an issue in the bugtracker itself?

If you have access to GitLab, you can [file a new issue][File] after
you have [searched the GitLab project for similar bugs][search]. 

If you do *not* have access to GitLab, you can email
<gitlab-admin@torproject.org>.

### Note about confidential issues

Note that you can mark issues as "confidential", which will make them
private to the members of the project the issue is reported on (the
"developers" group and above, specifically).

Keep in mind, however, that issue information can still be leaked in
cleartext. For example, GitLab [sends email
notifications in cleartext for private issues](https://gitlab.com/gitlab-org/gitlab/-/issues/5816), a known upstream
issue. (We have [decided we cannot fix this ourselves in GitLab for
now](https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/23).) Some repositories might also have "web hooks" that notify
IRC bots in cleartext as well, although at the time of writing all
projects are correctly configured.

## How to contribute code?

As with reporting an issue, you first need to figure out which project you
are working on in the GitLab [project list][]. Then, if you are not
familiar with merge requests, you should read the [merge requests
introduction](https://gitlab.torproject.org/help/user/project/merge_requests/getting_started.md) in the GitLab documentation. If you are unfamiliar
with merge requests but familiar with GitHub's pull requests, those
are similar.

Note that we do not necessarily use merge requests in all teams yet,
and Gitolite still has the canonical version of the code. See [issue
36][] for a followup on this.

[issue 36]: https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/36

Also note that different teams might have different workflows. If a
team has a special workflow that diverges from the one here, it should
be documented here. Those are the workflows we know about:

 * [Network Team](https://gitlab.torproject.org/tpo/core/tor/-/wikis/NetworkTeam/GitlabReviews)
 * [Web Team](https://gitlab.torproject.org/tpo/web/community/-/wikis/Git-flow-and-merge-requests)
 * Bridge DB: merge requests

If you do not have access to GitLab, please use one of the mailing
lists: <tor-dev@lists.torproject.org> would be best.

## How to quote a comment in a reply?

The "Reply" button only creates a new comment without any quoted text
by default. The current workaround is to highlight the text you want
to quote and then press the `r` key. See also the [other
keyboard shortcuts](https://docs.gitlab.com/ee/user/shortcuts.html).

Alternatively, you can copy-paste the text in question in the comment
form, select the pasted text, and hit the `Insert a quote` button,
which looks like a styled, curly, closing quotation mark `”`.

## Continuous Integration (CI)

All CI documentation resides in a different document, see
[service/ci](service/ci).

## Email interactions

You can interact with GitLab by email too.

### Creating a new issue

Clicking on the project issues gives a link at the bottom of the page,
which says "Email a new issue to this project".

That link should go into the "To" field of your email. The email
subject becomes the title of the issue and the body the
description. You can use shortcuts in the body, like `/assign @foo`,
`/estimate 1d`, etc.

See [the upstream docs for more details](https://docs.gitlab.com/ee/user/project/issues/managing_issues.html#new-issue-via-url-with-prefilled-fields).

### Commenting on an issue

If you just reply to the particular comment notification you received
by email, as you would reply to an email in a thread, that comment
will show up in the issue.

You need to have email notifications enabled for this to work,
naturally.
You can also add a new comment to any issue by copy-pasting the
issue-specific email address in the right sidebar (labeled "Issue
email", [introduced in GitLab 13.8](https://gitlab.com/gitlab-org/gitlab/-/issues/18816)).
This also works with shortcuts like `/estimate 1d` or `/spend
-1h`. Note: for those you won't get notification emails back, though,
while for others like `/assign @foo` you would.
See [the upstream docs for more details](https://docs.gitlab.com/ee/administration/reply_by_email.html).

### Quick status updates by email
There are a bunch of [quick actions](https://gitlab.torproject.org/help/user/project/quick_actions.md) available which are handy to
update an issue. As mentioned above they can be sent by email as well,
both within a comment (be it as a reply to a previous one or in a new
one) or just instead of it. So, for example, if you want to update the
amount of time spent on ticket $foo by one hour, find any notification
email for that issue and reply to it by replacing any quoted text with
`/spend 1h`.
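
As a rough illustration of the reply described above, the following
Python sketch builds such a message from a notification email. It
assumes the notification carries the usual `Reply-To` and `Message-ID`
headers; the function name is hypothetical, and actually sending the
message (for example with `smtplib`) is left out.

```python
from email.message import EmailMessage


def quick_action_reply(notification, sender, action):
    """Build a reply to a GitLab notification whose body is one quick action.

    Hypothetical helper, for illustration only.
    """
    reply = EmailMessage()
    reply["From"] = sender
    # Reply where the notification asks us to; fall back to its sender.
    reply["To"] = str(notification.get("Reply-To", notification["From"]))
    subject = str(notification.get("Subject", ""))
    if not subject.lower().startswith("re:"):
        subject = "Re: " + subject
    reply["Subject"] = subject
    # Thread the reply under the notification it answers.
    reply["In-Reply-To"] = str(notification["Message-ID"])
    # No quoted text: the quick action is the whole body.
    reply.set_content(action + "\n")
    return reply
```

Calling `quick_action_reply(notification, "you@example.org", "/spend 1h")`
returns a message ready to be handed to your usual mail submission tool.
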
## How to migrate a Git repository from legacy to GitLab?

See the [git documentation for this procedure](howto/git#how-to-migrate-a-git-repository-from-legacy-to-gitlab).
## How to mirror a Git repository from legacy to GitLab?

See the [git documentation for this procedure](howto/git#how-to-migrate-a-git-repository-from-legacy-to-gitlab).

## How to mirror a Git repository from GitLab to GitHub

Some repositories are mirrored to the [torproject organization on
GitHub](https://github.com/torproject). This section explains how that works and how to create a
new mirror from GitLab. In this example, we're going to mirror the
[tor browser manual](https://gitlab.torproject.org/tpo/web/manual).

 1. head to the "Mirroring repositories" section of the
    [settings/repository](https://gitlab.torproject.org/tpo/web/manual/-/settings/repository) part of the project

 2. as a Git repository URL, enter:
 
        ssh://git@github.com/torproject/tb-manual.git

 3. click "detect host keys"

 4. choose "SSH" as the "Authentication method"

 5. don't check any of the boxes, click "Mirror repository"

 6. the page will reload and show the mirror in the list of "Mirrored
    repositories". click the little "paperclip" icon which says "Copy
    SSH public key"

 7. head over to https://github.com and authenticate with
    `torproject-pusher`, password is in `external-services-git`, in
    the password manager

 8. go to [settings/keys](https://github.com/settings/keys) and hit [new SSH key](https://github.com/settings/ssh/new)

 9. paste the public key in the bigger field, as a title, use the URL
    of the repository, for example:
    
        Title: https://gitlab.torproject.org/tpo/web/manual mirror key
        Key: ssh-rsa AAAA[...]

 10. click "Add SSH key"

 11. to speed up the process, you can [import the repository in
     GitHub](https://github.com/new/import), otherwise create a [new repository](https://github.com/new). in *both*
     cases make sure you change the namespace from the default
     (`torproject-pusher`, which is incorrect) to the `torproject`
     namespace (which is correct)

 12. then hit the "reload" button in the repositories mirror list

If there is an error, it will show up as a little red "Error"
button. Hovering your mouse over the button will show you the error.

## How to find the right emoji?

It's possible to add "reaction emojis" to comments and issues and
merge requests in GitLab. Just hit the little smiley face and a dialog
will pop up. You can then browse through the list and pick the right
emoji for how you feel about the comment, but remember to be nice!

It's possible you get lost in the list. You can type the name of the
emoji to restrict your search, but be warned that some emojis have
[particular, non-standard names](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/29057) that might not be immediately
obvious. For example, `🎉`, `U+1F389 PARTY POPPER`, is found as
`tada` in the list! See [this upstream issue for more details](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/29057).

<a id="hooking-up-a-project-with-the-bots" />

## Publishing notifications on IRC

By default, new projects do *not* have notifications set up in
`#tor-bots` like all the others. To enable them, you need to configure a
"Webhook", in the `Settings -> Webhooks` section of the project. The
URL should be:

    https://kgb-bot.torproject.org/webhook/

... and you should select the notifications you wish to see in
`#tor-bots`. You can also enable notifications to other channels by
adding more parameters to the URL, like (say)
`?channel=tor-foo`. Important note: do not try to put the `#` in
the channel name, or if you do, URL-encode it (e.g. like `%23tor-foo`),
otherwise this will silently fail to change the target channel. Other
parameters are documented in the [KGB documentation](https://salsa.debian.org/kgb-team/kgb/-/wikis/usage).
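
To avoid the `#` pitfall described above, the URL can be built
programmatically. This is a minimal Python sketch; the function name is
made up for illustration, and only the `channel` parameter mentioned
above is handled.

```python
from urllib.parse import urlencode

KGB_WEBHOOK = "https://kgb-bot.torproject.org/webhook/"


def kgb_webhook_url(channel=None):
    """Return the webhook URL, optionally targeting another channel."""
    if channel is None:
        return KGB_WEBHOOK
    # urlencode() percent-encodes a leading "#" as "%23", so the channel
    # name is not mistaken for a URL fragment.
    return KGB_WEBHOOK + "?" + urlencode({"channel": channel})
```

Both `kgb_webhook_url("tor-foo")` and `kgb_webhook_url("#tor-foo")`
produce a URL that is safe to paste in the webhook form.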

Note that GitLab admins might be able to configure [system-wide
hooks](https://gitlab.torproject.org/help/system_hooks/system_hooks) in [the admin section](https://gitlab.torproject.org/admin/hooks), although it's not entirely clear
how those relate to the per-project hooks, so those have not been
enabled. Furthermore, it is possible for GitLab admins with root
access to enable webhooks on *all* projects, with the [webhook rake
task](https://docs.gitlab.com/ee/raketasks/web_hooks.html#webhooks). For example, running this on the GitLab server (currently
`gitlab-02`) will enable the above hook on all repositories:

    sudo gitlab-rake gitlab:web_hook:add URL='https://kgb-bot.torproject.org/webhook/'

Note that by default, the rake task only enables `Push` events. You
need the following patch to enable others:

    modified   lib/tasks/gitlab/web_hook.rake
    @@ -10,7 +10,19 @@ namespace :gitlab do
           puts "Adding webhook '#{web_hook_url}' to:"
           projects.find_each(batch_size: 1000) do |project|
             print "- #{project.name} ... "
    -        web_hook = project.hooks.new(url: web_hook_url)
    +        web_hook = project.hooks.new(
    +          url: web_hook_url,
    +          push_events: true,
    +          issues_events: true,
    +          confidential_issues_events: false,
    +          merge_requests_events: true,
    +          tag_push_events: true,
    +          note_events: true,
    +          confidential_note_events: false,
    +          job_events: true,
    +          pipeline_events: true,
    +          wiki_page_events: true,
    +        )
             if web_hook.save
               puts "added".color(:green)
             else

See also the [upstream issue](https://gitlab.com/gitlab-org/gitlab/-/issues/17966) and [our GitLab issue 7](https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/7) for
details.

You can also remove a given hook from all repos with:

    sudo gitlab-rake gitlab:web_hook:rm URL='https://kgb-bot.torproject.org/webhook/'

And, finally, list all hooks with:

    sudo gitlab-rake gitlab:web_hook:list

## Setting up two-factor authentication (2FA)

We strongly recommend you enable two-factor authentication on
GitLab. This is [well documented in the GitLab manual](https://gitlab.torproject.org/help/user/profile/account/two_factor_authentication.md#two-factor-authentication), but basically:

 1. first, pick a 2FA "app" (and optionally a hardware token) if you
    don't have one already

 2. head to your [account settings](https://gitlab.torproject.org/profile/account)

 3. register your 2FA app and save the recovery codes somewhere. if
    you need to enter a URL by hand, you can scan the qrcode with your
    phone or create one by following this format:

        otpauth://totp/$ACCOUNT?secret=$KEY&issuer=gitlab.torproject.org

    where...

      * `$ACCOUNT` is the `Account` field in the 2FA form
      * `$KEY` is the `Key` field in the 2FA form, without spaces

 4. register the 2FA hardware token if available

GitLab requires a 2FA "app" even if you intend to use a hardware
token. The 2FA "app" must implement the TOTP protocol, for example the
[Google Authenticator](https://play.google.com/store/apps/details?id=com.google.android.apps.authenticator2) or a free alternative (for example [free OTP
plus](https://github.com/helloworld1/FreeOTPPlus/), see also this [list from the Nextcloud project](https://github.com/nextcloud/twofactor_totp#readme)). The
hardware token must implement the U2F protocol, which is supported by
security tokens like the [YubiKey](https://en.wikipedia.org/wiki/YubiKey), [Nitrokey](https://www.nitrokey.com/), or similar.
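
If you do need to build the `otpauth://` URL from step 3 by hand, the
following Python sketch follows the format shown there. The function
name is hypothetical, and percent-encoding the account is an extra
precaution that the format above does not spell out.

```python
from urllib.parse import quote


def otpauth_url(account, key, issuer="gitlab.torproject.org"):
    """Build a TOTP provisioning URL from the fields of the 2FA form."""
    # The key must be given without spaces, so drop any it contains.
    secret = key.replace(" ", "")
    return "otpauth://totp/{}?secret={}&issuer={}".format(
        quote(account, safe="@"), secret, quote(issuer)
    )
```

The resulting string can be fed to a QR code generator or typed into
the 2FA app. Treat it like a password: it contains the shared secret.
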
## Deleting sensitive attachments

If a user uploaded a secret attachment by mistake, just deleting the
issue is not sufficient: it turns out that doesn't remove the
attachments from disk!

To fix this, ask a sysadmin to find the file in the
`/var/opt/gitlab/gitlab-rails/uploads/` directory. Assuming the
attachment URL is:

<https://gitlab.torproject.org/anarcat/test/uploads/7dca7746b5576f6c6ec34bb62200ba3a/openvpn_5.png>

There should be a "hashed" directory and a hashed filename in there,
which looks something like:

    ./@hashed/08/5b/085b2a38876eeddc33e3fbf612912d3d52a45c37cee95cf42cd3099d0a3fd8cb/7dca7746b5576f6c6ec34bb62200ba3a/openvpn_5.png

The second directory (`7dca7746b5576f6c6ec34bb62200ba3a` above) is the
one visible in the attachment URL. The last part is the actual
attachment filename, but since those can overlap between issues, it's
safer to look for the hash. So to find the above attachment, you
should use:

    find /var/opt/gitlab/gitlab-rails/uploads/ -name 7dca7746b5576f6c6ec34bb62200ba3a

And delete the file in there. Since `find` matches the hash
*directory*, the removal needs `rm -r`. The following should do the trick:

    find /var/opt/gitlab/gitlab-rails/uploads/ -name 7dca7746b5576f6c6ec34bb62200ba3a | sed 's/^/rm -r /' > delete.sh

Verify `delete.sh` and run it if happy.
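
To avoid copy-paste mistakes when extracting the hash from the
attachment URL, a small helper can do it. This Python sketch assumes
the URL layout shown above (`.../uploads/<hash>/<filename>`); the
function name is made up for illustration.

```python
from urllib.parse import urlparse


def upload_hash(attachment_url):
    """Return the directory name that identifies an upload on disk."""
    parts = urlparse(attachment_url).path.strip("/").split("/")
    # Expected layout: <namespace>/<project>/uploads/<hash>/<filename>
    if len(parts) < 3 or parts[-3] != "uploads":
        raise ValueError("not an attachment URL: " + attachment_url)
    return parts[-2]
```

The returned value is what should be passed to `find -name`.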

Note that GitLab is working on an [attachment manager](https://gitlab.com/gitlab-org/gitlab/-/issues/16229) that should
allow web operators to delete old files, but it's unclear how or when
this will be implemented, if ever.

## Publishing GitLab pages

GitLab features a way to publish websites directly from the continuous
integration pipelines, called [GitLab pages](https://docs.gitlab.com/ee/user/project/pages/). Complete
documentation on how to publish such pages is better served by the
official documentation, but creating a `.gitlab-ci.yml` should get you
rolling. For example, this will publish a `hugo` site:

    image: registry.gitlab.com/pages/hugo/hugo_extended:0.65.3
    pages:
      script:
        - hugo
      artifacts:
        paths:
          - public
      only:
        - main

If `.gitlab-ci.yml` already contains a job in the `build` stage that
generates the required artifacts in the `public` directory, then
including the `pages-deploy.yml` CI template should be sufficient:

    include:
      - project: tpo/tpa/ci-templates
        file: pages-deploy.yml

GitLab pages are published under the `*.pages.torproject.net` wildcard
domain. There are two types of projects hosted at the TPO GitLab:
sub-group projects, usually under the `tpo/` super-group, and user
projects, for example `anarcat/myproject`. You can also publish a page
specifically for a user. The URLs will look something like this:

| Type of GitLab page | Name of the project created in GitLab | Website URL                                          |
|---------------------|---------------------------------------|------------------------------------------------------|
| User pages          | `username.pages.torproject.net`       | `https://username.pages.torproject.net`              |
| User projects       | `username/projectname`                | `https://username.pages.torproject.net/projectname`  |
| Group projects      | `tpo/group/projectname`               | `https://tpo.pages.torproject.net/group/projectname` |
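
The mapping in the table above can be expressed as a short function.
This is only a sketch derived from the three rows of the table, with a
hypothetical function name; it does not query GitLab.

```python
PAGES_DOMAIN = "pages.torproject.net"


def pages_url(project_path):
    """Map a GitLab project path to its expected GitLab Pages URL."""
    namespace, _, rest = project_path.strip("/").partition("/")
    base = "https://{}.{}".format(namespace, PAGES_DOMAIN)
    # A project named "<namespace>.pages.torproject.net" is served at the root.
    if rest == "{}.{}".format(namespace, PAGES_DOMAIN):
        return base
    return base + "/" + rest
```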

## Accepting merge requests on wikis

Wiki permissions are not great, but there's a workaround: accept merge
requests for a git replica of the wiki.

This documentation was [moved to the documentation section](service/documentation#accepting-merge-requests-on-wikis).
## Renaming a branch globally

While `git` supports renaming branches locally with the `git branch
--move $to_name` command, this doesn't actually rename the remote
branch. That process is more involved.
Changing the name of a default branch both locally and on remotes can
be partially automated with the use of [anarcat's branch rename
script](https://gitlab.com/anarcat/scripts/-/blob/main/git-branch-rename-remote). The script basically renames the branch locally, pushes
the new branch and deletes the old one, with special handling of
GitLab remotes, where it "un-protects" and "re-protects" the branch.
You should run the script with an account that has "Maintainer" or
"Owner" access to GitLab, so that it can do the above GitLab API
changes. You will then need to provide an [access token](https://gitlab.torproject.org/-/profile/personal_access_tokens) through
the `GITLAB_PRIVATE_TOKEN` environment variable, which should have the
scope `api`.

So, for example, this will rename the `master` branch to `main` on the
local and remote repositories:

    GITLAB_PRIVATE_TOKEN=REDACTED git-branch-rename-remote

If you want to rename another branch or remote, you can specify those
on the commandline as well. For example, this will rename the
`develop` branch to `dev` on the `gitlab` remote:

    GITLAB_PRIVATE_TOKEN=REDACTED git-branch-rename-remote --remote gitlab --from-branch develop --to-branch dev

The command can also be used to fix *other* repositories so that they
correctly rename their local branch too. In that case, the GitLab
repository is already up to date, so there is no need for an access
token.

Other users can then just run this command, which will rename `master` to
`main` on the local repository, including remote tracking branches:

    git-branch-rename-remote

Obviously, users without any extra data in their local repository can
just destroy their local repository and clone a new one to get the
correct configuration.
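
For reference, the plain `git` part of what the script automates
(rename locally, push the new branch, delete the old one) can be
sketched as a list of commands. This is not the script's actual
implementation: the function name and the default `origin` remote are
assumptions, and the GitLab-specific branch protection handling is left
out. Note that a remote may refuse to delete the old branch while it is
still the default branch.

```python
def rename_commands(old="master", new="main", remote="origin"):
    """List the git commands for a manual branch rename, in order."""
    return [
        # rename the branch in the local repository
        ["git", "branch", "--move", old, new],
        # publish the new name and track it
        ["git", "push", "--set-upstream", remote, new],
        # remove the old name from the remote
        ["git", "push", remote, "--delete", old],
    ]
```

Each entry can be passed to `subprocess.run(command, check=True)` from
inside the repository.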

Keep in mind that there may be a few extra steps and considerations to
make when changing the name of a heavily used branch, detailed below.

A merge request that is open against the modified branch may be
bricked as a result of deleting the old branch name from the GitLab
remote. To avoid this, after creating and pushing the new branch name,
edit each merge request to target the new branch name **before**
deleting the old branch.

Many GitLab repositories are mirrored or maintained manually on
Gitolite (`git-rw.torproject.org`) and [Gitweb](https://gitweb.torproject.org). The `ssh` step for
the above automation script will fail for Gitolite and these steps
need to be done manually by a sysadmin. [Open a TPA ticket](https://gitlab.torproject.org/tpo/tpa/team/-/issues/new) with a
list of the Gitolite repositories you would like to update and a
sysadmin will perform the following magic:

    cd /srv/git.torproject.org/repositories/
    for repo in $list; do
        git -C "$repo" symbolic-ref HEAD "refs/heads/$to_branch"
    done

This will update Gitolite, but it won't update Gitweb until the
repositories have been pushed to. To update Gitweb immediately, ask
your friendly sysadmin to run the above command on the Gitweb server
as well.

If your repository relies on Transifex for translations, make sure to
update the Transifex config to pull from the new branch. To do so,
[open a l10n ticket](https://gitlab.torproject.org/tpo/community/l10n/-/issues/new?issue%5Bassignee_id%5D=&issue%5Bmilestone_id%5D=) with the new branch name changes.

## Find the project associated with a project ID

Sometimes you'll find a numeric project ID instead of a human-readable
one. For example, you can see on the [arti project](https://gitlab.torproject.org/tpo/core/arti) that it says:

    Project ID: 647

So you can easily find the project ID of a project right on the
project's front page. But what if you only have the ID and need to
find what project it represents? You can talk with the API, with a URL
like:

    https://gitlab.torproject.org/api/v4/projects/<PROJECT_ID>

For example, this is how I found the above arti project from the
`Project ID 647`:

```
$ curl -s 'https://gitlab.torproject.org/api/v4/projects/647' | jq .web_url
"https://gitlab.torproject.org/tpo/core/arti"
```
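
The same lookup can be scripted. This is a minimal sketch using only
the Python standard library; the function names are made up for
illustration, and it assumes the project is public, since no
authentication token is sent.

```python
import json
from urllib.request import urlopen

API = "https://gitlab.torproject.org/api/v4"


def project_api_url(project_id):
    """Return the API URL describing the given numeric project ID."""
    return "{}/projects/{}".format(API, int(project_id))


def project_web_url(project_id):
    """Fetch the project record and return its web_url (needs network access)."""
    with urlopen(project_api_url(project_id)) as response:
        return json.load(response)["web_url"]
```

`project_web_url(647)` performs the same request as the `curl` example
above and returns the `web_url` field of the response.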

## Connect to the PostgreSQL server

The GitLab Omnibus setup is special: it ships its own embedded
PostgreSQL server (!), which means the regular `sudo -u postgres psql`
command doesn't work.

To get access to the PostgreSQL server, you need to [follow the
upstream instructions](https://docs.gitlab.com/omnibus/maintenance/#starting-a-postgresql-superuser-psql-session) which are, at the time of writing:

    sudo gitlab-psql -d gitlabhq_production

This actually expands to the following command:

    sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql -p 5432 -h /var/opt/gitlab/postgresql -d gitlabhq_production -d gitlabhq_production

An emergency dump, therefore, could be taken with:

    cd /tmp ; sudo -u gitlab-psql /opt/gitlab/embedded/bin/pg_dump -p 5432 -h /var/opt/gitlab/postgresql -d gitlabhq_production -d gitlabhq_production | pv -s 2G > /srv/gitlab-backup/db/db-$(date '+%Y-%m-%d').sql

Yes, that is silly. See also [issue 20][].

## Pager playbook

<!-- information about common errors from the monitoring system and -->
<!-- how to deal with them. this should be easy to follow: think of -->
<!-- your future self, in a stressful situation, tired and hungry. -->

 * Grafana Dashboards:
   * [GitLab overview](https://grafana.torproject.org/d/QrDJktiMz/gitlab-omnibus)
   * [Gitaly](https://grafana.torproject.org/d/x6Z50y-iz/gitlab-gitaly)

TODO: document how to handle common problems in GitLab

### Troubleshooting

Upstream recommends running this command to self-test a GitLab
instance:

    sudo gitlab-rake gitlab:check SANITIZE=true

This command also shows general info about the GitLab instance:

    sudo gitlab-rake gitlab:env:info

It is especially useful to find on-disk files and package versions.

### GitLab pages not found

If you're looking for a way to track GitLab pages errors, know that the
webserver logs are in `/var/log/nginx/gitlab_pages_access`, but that
only proxies requests for the GitLab Pages engine, whose (JSON!) logs
live in `/var/log/gitlab/gitlab-pages/current`.

If you get a `"error":"domain does not exist"` problem, make sure the
entire *pipeline* actually succeeds. Typically, the "pages:deploy" job
can fail with:

    Artifacts for pages are too large

In that case, you need to go into the Admin Area -> Settings ->
Preferences -> Pages and bump the size limit. It defaults to 100MB and
we bumped it to 1024MB at the time of writing. Note that GitLab CI/CD
also has a similar setting which might (or might not?) affect such
problems.
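
Since the GitLab Pages logs are JSON, they can be filtered with a few
lines of Python. This sketch assumes one JSON object per line with an
`error` field, as in the message quoted above; the function name is
hypothetical and lines that fail to parse are skipped.

```python
import json


def pages_errors(lines, needle="domain does not exist"):
    """Yield parsed log entries whose "error" field contains needle."""
    for line in lines:
        try:
            entry = json.loads(line)
        except ValueError:
            continue  # skip lines that are not JSON
        if isinstance(entry, dict) and needle in str(entry.get("error", "")):
            yield entry
```

For example, open `/var/log/gitlab/gitlab-pages/current` and pass the
file object to `pages_errors()` to print only the matching entries.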

### PostgreSQL debugging

The PostgreSQL configuration in GitLab is [particular][issue 20]. See the
[connect to the PostgreSQL server](#connect-to-the-postgresql-server) section above on how to connect
to it.

### Disk full on GitLab server

If the main GitLab server is running out of space (as opposed to
runners, see [Runner disk fills up](service/ci#runner-disk-fills-up) for that scenario), then it's
projects that are taking up space. We've typically had trouble with
artifacts taking up space, for example (tpo/tpa/team#40615,
tpo/tpa/team#40517).

You can see the largest disk users in the GitLab admin area in
[Overview -> Projects -> Sort by: Largest repository](https://gitlab.torproject.org/admin/projects?sort=storage_size_desc). 

Note that, although it's unlikely, it's technically possible that an
archived project takes up space, so make sure you check the "Show
archived projects" option in the "Sort by" drop down.

In the past, we have worked around that problem by reducing the
default artifact retention period from 4 to 2 weeks
(tpo/tpa/team#40516), but that obviously does not take effect
immediately. More recently, we have tried to tweak individual
projects' retention policies and scheduling strategies (details in
tpo/tpa/team#40615).

Please be aware of the [known upstream issues](#issues) that affect those

To obtain a list of projects sorted by space usage, log on to GitLab using an
account with administrative privileges and open the [Projects page](https://gitlab.torproject.org/admin/projects?sort=storage_size_desc)
sorted by `Largest repository`. The total space consumed by each project is
displayed and clicking on a specific project shows a breakdown of how this space
is consumed by different components of the project (repository, LFS, CI
artifacts, etc.).

If a project is consuming an unexpected amount of space for artifacts, the
scripts from the [tpo/tpa/gitlab-tools](https://gitlab.torproject.org/tpo/tpa/gitlab-tools)
project can be used to obtain a breakdown of the space used by job logs and
artifacts, per job or per pipeline. These scripts can also be used to manually
remove such data, see the [gitlab-tools README](https://gitlab.torproject.org/tpo/tpa/gitlab-tools/README.md).

It's also possible to compile some CI artifact usage statistics directly on the
GitLab server. To see if expiration policies work (or if "kept" artifacts or
old `job.log` are a problem), use this command (which takes a while to
run):

    find -mtime +14 -print0 | du --files0-from=- -c -h | tee find-mtime+14-du.log

To limit this to `job.log`, of course, you can do:

    find -name "job.log" -mtime +14 -print0 | du --files0-from=- -c -h | tee find-mtime+14-joblog-du.log
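
The same kind of figure can be approximated in Python, which may be
handy for further processing. This is a sketch with a hypothetical
function name: it sums apparent file sizes rather than disk blocks and
uses a strict "older than N days" cutoff, so the totals will not match
`find | du` exactly.

```python
import os
import stat
import time


def old_files_size(root, days=14, name=None):
    """Sum the size in bytes of regular files under root older than `days` days."""
    cutoff = time.time() - days * 86400
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if name is not None and filename != name:
                continue
            try:
                info = os.lstat(os.path.join(dirpath, filename))
            except OSError:
                continue  # file vanished while walking
            if stat.S_ISREG(info.st_mode) and info.st_mtime < cutoff:
                total += info.st_size
    return total
```

Pass `name="job.log"` to restrict the total to job logs, as in the
second command above.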

### Incoming email routing

Incoming email gets routed through either eugeni or the submission
service, then end up on the Postfix server on `gitlab-02`, and from
there, to a dovecot mailbox. You can use `postfix-trace` to confirm
the message correctly ended up there.

Normally, GitLab should be picking mails from the mailbox
(`/srv/mail/git@gitlab.torproject.org/Maildir/`) regularly, and
deleting them when done. If that is not happening, look at the
mailroom logs:

    tail -f /var/log/gitlab/mailroom/mail_room_json.log | jq -c

A working run will look something like this:

```
{"severity":"INFO","time":"2022-08-29T20:15:57.734+00:00","context":{"email":"git@gitlab.torproject.org","name":"inbox"},"action":"Processing started"}
{"severity":"INFO","time":"2022-08-29T20:15:57.734+00:00","context":{"email":"git@gitlab.torproject.org","name":"inbox"},"uid":7788,"action":"asking arbiter to deliver","arbitrator":"MailRoom::Arbitration::Redis"}
[...].734+00:00","context":{"email":"git@gitlab.torproject.org","name":"inbox"},"action":"Getting new messages","unread":{"count":1,"ids":[7788]},"to_be_delivered":{"count":1,"ids":[7788]}}
[...]ext":{"email":"git@gitlab.torproject.org","name":"inbox"},"uid":7788,"action":"sending to deliverer","deliverer":"MailRoom::Delivery::Sidekiq","byte_size":4162}
[...]","delivery_method":"Sidekiq","action":"message pushed"}
{"severity":"INFO","time":"2022-08-29T20:15:57.744+00:00","context":{"email":"git@gitlab.torproject.org","name":"inbox"},"action":"Processing started"}
{"severity":"INFO","time":"2022-08-29T20:15:57.744+00:00","context":{"email":"git@gitlab.torproject.org","name":"inbox"},"action":"Getting new messages","unread":{"count":0,"ids":[]},"to_be_delivered":{"count":0,"ids":[]}}
[...]0","context":{"email":"git@gitlab.torproject.org","name":"inbox"},"action":"Idling"}
```

Emails should be processed every minute or so.

### Outgoing email

Follow the [email not sent](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/doc/settings/smtp.md#email-not-sent) procedure. TL;DR:

    sudo gitlab-rails console

(Yes it takes forever.) Then check if the settings are sane:

```
--------------------------------------------------------------------------------
 Ruby:         ruby 3.0.5p211 (2022-11-24 revision ba5cf0f7c5) [x86_64-linux]
 GitLab:       15.10.0 (496a1d765be) FOSS
 GitLab Shell: 14.18.0
 PostgreSQL:   12.12
------------------------------------------------------------[ booted in 28.31s ]
Loading production environment (Rails 6.1.7.2)
irb(main):003:0> ActionMailer::Base.delivery_method
=> :smtp
irb(main):004:0> ActionMailer::Base.smtp_settings
=> 
{:user_name=>nil,
 :password=>nil,
 :address=>"localhost",
 :port=>25,
 :domain=>"localhost",
 :enable_starttls_auto=>false,
 :tls=>false,
 :ssl=>false,
 :openssl_verify_mode=>"none",
 :ca_file=>"/opt/gitlab/embedded/ssl/certs/cacert.pem"}
```
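
Since the settings above point at `localhost` port 25, it can also be
worth confirming that something is actually listening there. This is a
sketch only: it relies on bash's `/dev/tcp` pseudo-device and the
coreutils `timeout` command, and the function name `tcp_port_open` is
invented for this example.

```shell
# Return 0 if a TCP connection to HOST:PORT succeeds within 3 seconds.
# Requires bash (for /dev/tcp) and coreutils timeout.
tcp_port_open() {
    local host="$1" port="$2"
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

For example, `tcp_port_open localhost 25 && echo "SMTP listener is up"`.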

Then test an email delivery:

    Notify.test_email('noreply@torproject.org', 'Hello World', 'This is a test message').deliver_now

A working delivery will look something like this, with the last line
in *green*:

```
irb(main):001:0> Notify.test_email('noreply@torproject.org', 'Hello World', 'This is a test message').deliver_now
Delivered mail 64219bdb6e919_10e66548d042948@gitlab-02.mail (20.1ms)
=> #<Mail::Message:296420, Multipart: false, Headers: <Date: Mon, 27 Mar 2023 13:36:27 +0000>, <From: GitLab <git@gitlab.torproject.org>>, <Reply-To: GitLab <noreply@torproject.org>>, <To: noreply@torproject.org>, <Message-ID: <64219bdb6e919_10e66548d042948@gitlab-02.mail>>, <Subject: Hello World>, <Mime-Version: 1.0>, <Content-Type: text/html; charset=UTF-8>, <Content-Transfer-Encoding: 7bit>, <Auto-Submitted: auto-generated>, <X-Auto-Response-Suppress: All>>
```

A *failed* delivery will *also* say `Delivered mail` *but* will
include an error message as well. For example, in [issue 139][] we had
this error:

```
irb(main):006:0> Notify.test_email('noreply@torproject.org', 'Hello World', 'This is a test message').deliver_now
Delivered mail 641c797273ba1_86be948d03829@gitlab-02.mail (7.2ms)
/opt/gitlab/embedded/lib/ruby/gems/3.0.0/gems/net-protocol-0.1.3/lib/net/protocol.rb:46:in `connect_nonblock': SSL_connect returned=1 errno=0 state=error: certificate verify failed (self signed certificate in certificate chain) (OpenSSL::SSL::SSLError)
```

[issue 139]: https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/139

In case the entire GitLab machine is destroyed, a new server should be
provisioned in the [howto/ganeti](howto/ganeti) cluster (or elsewhere) and backups
should be restored using the below procedure.

### Running an emergency backup

A full backup can be run as root with:

    /usr/bin/gitlab-rake gitlab:backup:create

Backups are stored as a tar file in `/srv/gitlab-backup` and do *not*
include secrets, which are backed up separately, for example with:

    umask 0077 && tar -C /var/opt/gitlab -czf /srv/gitlab-backup/config_backup$(date +"\%Y\%m\%dT\%H\%M").tar.gz
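
Outside of cron, the `\%` escapes above are not needed (they only
protect `%` from crontab's own parsing), and `tar` also needs to be told
which paths to archive. As a rough sketch, assuming the secrets live in
a directory such as `/etc/gitlab` (the location the restore procedure
reads them from), a manual run could look like the following. The
function name `backup_config_dir` and its arguments are illustrative and
are not taken from the actual Puppet-managed job.

```shell
# Archive a configuration directory into a timestamped, owner-only tarball.
# Usage: backup_config_dir SOURCE_DIR DEST_DIR  (hypothetical helper)
# DEST_DIR should be an absolute path.
backup_config_dir() {
    local src="$1" dest="$2" stamp archive
    stamp=$(date +"%Y%m%dT%H%M")
    archive="$dest/config_backup$stamp.tar.gz"
    # The subshell keeps the restrictive umask from leaking into the caller.
    (
        umask 0077
        tar -C "$(dirname "$src")" -czf "$archive" "$(basename "$src")"
    ) || return 1
    # Print the archive path so the caller can verify or copy it.
    printf '%s\n' "$archive"
}
```

For example, `backup_config_dir /etc/gitlab /srv/gitlab-backup`.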

See `/etc/cron.d/gitlab-config-backup`, and the `gitlab::backup` and
`profile::gitlab::app` classes for the actual jobs that run nightly.

### Recovering this wiki from backups

If you need to immediately restore the wiki from backups, you can head
to the backup server and restore the directory:

    /var/opt/gitlab/git-data/repositories/@hashed/11/f8/11f8e31ccbdbb7d91589ecf40713d3a8a5d17a7ec0cebf641f975af50a1eba8d.git

The hash above is the SHA256 checksum of the [wiki-replica](https://gitlab.torproject.org/tpo/tpa/wiki-replica/)
project id (695):

    $ printf 695 | sha256sum
    11f8e31ccbdbb7d91589ecf40713d3a8a5d17a7ec0cebf641f975af50a1eba8d  -
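
The same computation can be wrapped up for any project id. This is a
minimal sketch of the path layout shown above (the first two and next
two hex characters of the hash become directory names); the function
name `hashed_repo_path` is made up for this example.

```shell
# Print the hashed-storage path, relative to the repositories directory,
# for a numeric GitLab project id.
hashed_repo_path() {
    local id="$1" hash first second
    hash=$(printf '%s' "$id" | sha256sum | cut -d' ' -f1)
    first=$(printf '%s' "$hash" | cut -c1-2)
    second=$(printf '%s' "$hash" | cut -c3-4)
    printf '@hashed/%s/%s/%s.git\n' "$first" "$second" "$hash"
}
```

Running `hashed_repo_path 695` should print the `@hashed/11/f8/...`
path given above.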

On the backup server, that would be something like:

    bconsole
    restore
    5
    46
    cd /var/opt/gitlab/git-data/repositories/@hashed/11/f8
    mark 11f8e31ccbdbb7d91589ecf40713d3a8a5d17a7ec0cebf641f975af50a1eba8d.git
    done
    yes

The files will end up in `/var/tmp/bacula-restores` on
`gitlab-02`. Note that the number `46`, above, will vary according to
other servers backed up on the backup server, of course.

This should give you a copy of the git repository, which you can then
use, presumably, to read this procedure and restore the rest of
GitLab. 

(Although then, how did you read *this* part of the procedure?
Anyways, I thought this could save your future self one day. You'll
thank me later.)

### Restoring from backups

The [upstream documentation](https://docs.gitlab.com/ee/raketasks/backup_restore.html#restore-for-omnibus-gitlab-installations) has a fairly good restore procedure,
but because our backup procedure is non-standard -- we exclude
repositories and artifacts, for example -- you should follow this
procedure instead.

Note that the procedure assumes some familiarity with the general
[backup and restore procedures](howto/backup), particularly how to restore a
bunch of files from the backup server (see the [restore files
section](howto/backup#restore-files)).

This entire procedure will take many hours to complete. In our tests,
it took:

 1. an hour or two to set up a VM
 2. less than an hour to do a basic GitLab install
 3. 20 minutes to restore the basic system (database, tickets are
    visible at this point)
 4. an hour to restore repositories
 5. another hour to restore artifacts

This gives a time to recovery of about 5 to 6 hours. Most of that time
is spent waiting for files to be copied, interspersed with a few
manual commands.

So here's the procedure that was followed to deploy a development
server, from backups, in [tpo/tpa/team#40820](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40820) (run everything as
root):

 1. [install GitLab using Puppet](#installation): basically create a server large
    enough for everything, apply the Puppet `role::gitlab`
    
    That includes creating new certificates and DNS records, if not
    already present (those may be different if you are creating a dev
    server from backups, for example, which was the case for the
    above ticket).
    
    Also note that you need to install the *same* GitLab version as
    the one from the backup. If you are unsure of the GitLab version
    that's in the backup (bad day uh?), try to restore the
    `/var/opt/gitlab/gitlab-rails/VERSION` file from the backup server
    first.

 2. at this point, a blank GitLab installation should be
    running. Verify that you can reach the login page, possibly trying
    to log in with the root account, because a working GitLab
    installation is a prerequisite for the rest of the restore
    procedure.
    
    (it might be technically possible to restore the entire server
    from scratch using only the backup server, but that procedure has
    not been established or tested.)

 3. on the backup server (currently `bacula-director-01`), restore the
    latest GitLab backup job from `/srv/gitlab-backup` and the
    secrets from `/etc/gitlab`:
    
        # bconsole
        *restore
        To select the JobIds, you have the following choices:
        [...]
         5: Select the most recent backup for a client
        [...]
        Select item:  (1-13): 5
        Defined Clients:
        [...]
            47: gitlab-02.torproject.org-fd
        [...]
        Select the Client (1-98): 47
        Automatically selected FileSet: Standard Set
        [...]
        Building directory tree for JobId(s) 199535,199637,199738,199847,199951 ...  ++++++++++++++++++++++++++++++++
        596,949 files inserted into the tree.
        [...]
        cwd is: /
        $ cd /etc
        cwd is: /etc/
        $ mark gitlab
        84 files marked.
        $ cd /srv
        cwd is: /srv/
        $ mark gitlab-backup
        12 files marked.
        $ done

    This took about 20 minutes in a simulation done in June 2022,
    including 5 minutes to load the file list.

 4. move the files in place and fix ownership, possibly moving
    pre-existing backups out of place, if the new server has been
    running for more than 24 hours:
    
        mkdir /srv/gitlab-backup.blank
        mv /srv/gitlab-backup/* /srv/gitlab-backup.blank
        cd /var/tmp/bacula-restores/srv/gitlab-backup
        mv *.tar.gz backup_information.yml  db /srv/gitlab-backup/
        cd /srv/gitlab-backup/
        chown git:git *.tar.gz backup_information.yml db/ db/database.sql.gz 

 5. stop GitLab services that talk with the database (those might have
    changed since the time of writing, [review upstream documentation
    just in case](https://docs.gitlab.com/ee/raketasks/backup_restore.html#restore-for-omnibus-gitlab-installations)):
    
        gitlab-ctl stop puma
        gitlab-ctl stop sidekiq

 6. restore the secrets files (note: this wasn't actually tested, but
    should work):
 
        chown root:root /var/tmp/bacula-restores/etc/gitlab/*
        mv /var/tmp/bacula-restores/etc/gitlab/{gitlab-secrets.json,gitlab.rb} /etc/gitlab/

    Note that if you're setting up a development environment, you do
    *not* want to perform that step, which means that CI/CD variables
    and 2FA tokens will be lost, which means people will need to reset
    those and login with their recovery codes. This is what you want
    for a dev server, because you do not want a possible dev server
    compromise to escalate to the production server, or the dev server
    to have access to the prod deployments.
    
    Also note that this step was *not* performed on the dev server
    test and this led to problems during login: while it was possible
    to use a recovery code to bypass 2FA, it wasn't possible to reset
    the 2FA configuration afterwards.

    Then run the restore itself:

        gitlab-backup restore

    This last step will ask you to confirm the restore, because it
    actually destroys the existing install. It will also ask you to
    confirm the rewrite of the `authorized_keys` file, which you want
    to accept (unless you specifically want to restore that from
    backup as well).

 7. restart the services and check everything:
 
        gitlab-ctl reconfigure
        gitlab-ctl restart
        gitlab-rake gitlab:check SANITIZE=true
        gitlab-rake gitlab:doctor:secrets
        gitlab-rake gitlab:lfs:check
        gitlab-rake gitlab:uploads:check
        gitlab-rake gitlab:artifacts:check

    Note: in the simulation, GitLab was started like this instead,
    which just worked as well:

        gitlab-ctl start puma
        gitlab-ctl start sidekiq

    We did try the "verification" tasks above, but many of them
    failed, especially in the `gitlab:doctor:secrets` job, possibly
    because we didn't restore the secrets (deliberately).
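
Step 1 above insists on installing the same GitLab version as the one
in the backup. A small sketch of that comparison follows, assuming both
versions are available as plain text files. The function name
`same_version` is made up for this example.

```shell
# Compare two VERSION files, ignoring whitespace and trailing newlines.
# Prints both versions; returns 0 when they match, 1 when they differ,
# and 2 when either file cannot be read.
same_version() {
    local restored installed
    restored=$(tr -d '[:space:]' < "$1") || return 2
    installed=$(tr -d '[:space:]' < "$2") || return 2
    printf 'backup: %s, installed: %s\n' "$restored" "$installed"
    [ "$restored" = "$installed" ]
}
```

For example, with the `VERSION` file restored under
`/var/tmp/bacula-restores`, this could be called as `same_version
/var/tmp/bacula-restores/var/opt/gitlab/gitlab-rails/VERSION
/var/opt/gitlab/gitlab-rails/VERSION`.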

At this point, basic functionality like logging in and issues should
be working again, but not wikis (because they are not restored
yet). Note that it's **normal** to see a 502 error message ("Whoops,
GitLab is taking too much time to respond.") when GitLab restarts: it
takes a *long* time to start (think minutes)... You can follow its
progress in `/var/log/gitlab/gitlab-rails/*.log`.
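
Rather than reloading the page by hand while GitLab starts, a small
polling loop can wait for it. This is a generic sketch, not something
shipped with GitLab: `wait_until` is a made-up helper that retries any
command until it succeeds or a deadline passes.

```shell
# Retry a command until it succeeds or a deadline passes.
# Usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
wait_until() {
    local timeout="$1" interval="$2" waited=0
    shift 2
    until "$@"; do
        # Give up once we have waited at least TIMEOUT_SECONDS.
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
}
```

For example, `wait_until 600 10 curl -fsS -o /dev/null
https://gitlab.example.com/users/sign_in` polls every 10 seconds for up
to 10 minutes. The hostname is a placeholder to replace with the
restored server's name. With `-f`, `curl` exits non-zero on a 502, so
the loop keeps waiting.
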

Be warned that the new server *will* start sending email
notifications, for example for issues with a due date, which might be
confusing for users if this is a development server. If this is a
production server, that's a good thing. If it's a development server,
you may want to disable email altogether in the GitLab server, with
this line in Hiera data (eg. `hiera/roles/gitlab_dev.yml`) in the
`tor-puppet.git` repository:

    profile::gitlab::app::email_enabled: false

So the above procedure only restores a *part* of the system, namely
what is covered by the nightly backup job. To restore the rest (at the
time of writing: artifacts and repositories, which includes wikis!),
you also need to specifically restore those files from the backup
server.

For example, this procedure will restore the repositories from the
backup server:

    $ cd /var/opt/gitlab/git-data
    cwd is: /var/opt/gitlab/git-data/
    $ mark repositories
    113,766 files marked.
    $ done

The files will then end up in
`/var/tmp/bacula-restores/var/opt/gitlab/git-data`. They will need to
be given to the right users and moved into place:

    chown -R git:root /var/tmp/bacula-restores/var/opt/gitlab/git-data/repositories
    mv /var/opt/gitlab/git-data/repositories /var/opt/gitlab/git-data/repositories.orig
    mv /var/tmp/bacula-restores/var/opt/gitlab/git-data/repositories /var/opt/gitlab/git-data/repositories/

During the last simulation, restoring repositories took an hour.

Restoring artifacts is similar:

    $ cd /srv/gitlab-shared
    cwd is: /srv/gitlab-shared/
    $ mark artifacts
    434,788 files marked.
    $ done

Then the files need to be given to the right user and moved into place
as well; notice the `git:git` instead of `git:root`:

    chown -R git:git /var/tmp/bacula-restores/srv/gitlab-shared/artifacts
    mv /var/opt/gitlab/gitlab-rails/shared/artifacts/ /var/opt/gitlab/gitlab-rails/shared/artifacts.orig
    mv /var/tmp/bacula-restores/srv/gitlab-shared/artifacts /var/opt/gitlab/gitlab-rails/shared/artifacts/

Restoring the artifacts took another hour of copying.
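
Both moves above follow the same pattern: set the live directory aside
as `.orig`, then move the restored copy into place. A cautious sketch of
that pattern follows. The helper name `swap_into_place` is hypothetical,
it does not touch ownership (the `chown` commands above are still
needed), and it refuses to run if a `.orig` copy already exists.

```shell
# Move LIVE aside as LIVE.orig, then move RESTORED to LIVE.
# Usage: swap_into_place RESTORED_DIR LIVE_DIR  (no trailing slashes)
swap_into_place() {
    local restored="$1" live="$2"
    [ -d "$restored" ] || { echo "missing: $restored" >&2; return 1; }
    # Never clobber a copy set aside by an earlier attempt.
    [ ! -e "$live.orig" ] || { echo "exists: $live.orig" >&2; return 1; }
    if [ -e "$live" ]; then
        mv "$live" "$live.orig" || return 1
    fi
    mv "$restored" "$live"
}
```

For example, `swap_into_place
/var/tmp/bacula-restores/var/opt/gitlab/git-data/repositories
/var/opt/gitlab/git-data/repositories`.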

And that's it! Note that this procedure may vary if the subset of
files backed up by the GitLab backup job changes.
### Main GitLab installation

The current GitLab server was set up in the [howto/ganeti](howto/ganeti) cluster in a
regular virtual machine. It was configured with [howto/puppet](howto/puppet) with the
`roles::gitlab`. That, in turn, includes a series of `profile`
classes which configure:
 * `profile::gitlab::web`: nginx vhost and TLS cert, which depends on
   `profile::nginx` built for the [howto/cache](howto/cache) service and relying on the
   [puppet/nginx](https://forge.puppet.com/puppet/nginx) module from the Forge
 * `profile::gitlab::app`: the core of the configuration of gitlab
   itself, uses the [puppet/gitlab](https://forge.puppet.com/puppet/gitlab) module from the Forge, with
   Prometheus, Grafana, and Nginx support disabled, but Redis,
   PostgreSQL, and other exporters enabled
 * `profile::dovecot::private`: a simple IMAP server to receive mails
   destined to GitLab