[GitLab](https://gitlab.com/) is *a web-based DevOps lifecycle tool that provides a
Git-repository manager providing wiki, issue-tracking and continuous
integration/continuous deployment pipeline features, using an
open-source license, developed by GitLab Inc* ([Wikipedia](https://en.wikipedia.org/wiki/GitLab)). Tor
uses GitLab mainly for issue tracking, wiki hosting and code review
for now, at <https://gitlab.torproject.org>, after migrating from
[howto/trac](howto/trac).
Note that continuous integration is documented separately, in [the CI page](service/ci).
[[_TOC_]]

# Tutorial

<!-- simple, brainless step-by-step instructions requiring little or -->
<!-- no technical background -->

## How to get an account?
You might already *have* an account! If you were active on Trac, your
account was migrated with the same username and email address as Trac,
unless you have an LDAP account, in which case that was used. So head
over to the [password reset page](https://gitlab.torproject.org/users/password/new) to get access to your account.

If your account was *not* migrated, send a mail to
<gitlab-admin@torproject.org> to request a new one.

If you did not have an account in Trac and want a new account, you
should request a new one at <https://gitlab.onionize.space/>.
## How to report an issue in Tor software?
You first need to figure out which project the issue resides in. The
[project list][] is a good place to get started. Here are a few quick
links for popular projects:

 * [core tor](https://gitlab.torproject.org/tpo/core/tor): [issues](https://gitlab.torproject.org/tpo/core/tor/-/issues), [new issue](https://gitlab.torproject.org/tpo/core/tor/-/issues/new)
 * [Tor Browser](https://gitlab.torproject.org/tpo/applications/tor-browser): [issues](https://gitlab.torproject.org/tpo/applications/tor-browser/-/issues), [new issue](https://gitlab.torproject.org/tpo/applications/tor-browser/-/issues/new)
 * [gitlab](https://gitlab.torproject.org/tpo/tpa/gitlab): [issues](https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues), [new issue](https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/new)

[project list]: https://gitlab.torproject.org/tpo

If you do not have a GitLab account or can't figure it out for any
reason, you can also use the mailing lists. The
<tor-dev@lists.torproject.org> mailing list is the best for now.
## How to report an issue in the bugtracker itself?

If you have access to GitLab, you can [file a new issue][File] after
you have [searched the GitLab project for similar bugs][search]. 

If you do *not* have access to GitLab, you can email
<gitlab-admin@torproject.org>.

### Note about confidential issues

Note that you can mark issues as "confidential", which makes them
private to the members of the project the issue is reported on (the
"developers" group and above, specifically).

Keep in mind, however, that issue information can still leak in
cleartext. For example, GitLab [sends email
notifications in cleartext for private issues](https://gitlab.com/gitlab-org/gitlab/-/issues/5816), a known upstream
issue. (We have [decided we cannot fix this ourselves in GitLab for
now](https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/23).) Some repositories might also have "web hooks" that notify
IRC bots in cleartext as well, although at the time of writing all
projects are correctly configured.

## How to contribute code?

As with reporting an issue, you first need to figure out which project you
are working on in the GitLab [project list][]. Then, if you are not
familiar with merge requests, you should read the [merge requests
introduction](https://gitlab.torproject.org/help/user/project/merge_requests/getting_started.md) in the GitLab documentation. If you are unfamiliar
with merge requests but familiar with GitHub's pull requests, those
are similar.

Note that we do not necessarily use merge requests in all teams yet,
and Gitolite still has the canonical version of the code. See [issue
36][] for a followup on this.

[issue 36]: https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/36

Also note that different teams might have different workflows. If a
team has a special workflow that diverges from the one here, it should
be documented here. Those are the workflows we know about:

 * [Network Team](https://gitlab.torproject.org/tpo/core/tor/-/wikis/NetworkTeam/GitlabReviews)
 * [Web Team](https://gitlab.torproject.org/tpo/web/community/-/wikis/Git-flow-and-merge-requests)
 * Bridge DB: merge requests

If you do not have access to GitLab, please use one of the mailing
lists: <tor-dev@lists.torproject.org> would be best.

## How to quote a comment in a reply?

The "Reply" button only creates a new comment without any quoted text
by default. The current workaround is to highlight the text to quote
and then press the `r` key. See also the [other
keyboard shortcuts](https://docs.gitlab.com/ee/user/shortcuts.html).

Alternatively, you can copy-paste the text in question in the comment
form, select the pasted text, and hit the `Insert a quote` button,
which looks like a styled, curly, closing quotation mark `”`.

## Continuous Integration (CI)

All CI documentation resides in a different document, see
[service/ci](service/ci).

## Email interactions

You can interact with GitLab by email too.

Clicking on the project issues gives a link at the bottom of the page,
which says "Email a new issue to this project".

That link should go into the "To" field of your email. The email
subject becomes the title of the issue and the body the
description. You can use shortcuts in the body, like `/assign @foo`,
`/estimate 1d`, etc.

See [the upstream docs for more details](https://docs.gitlab.com/ee/user/project/issues/managing_issues.html#new-issue-via-url-with-prefilled-fields).

### Commenting on an issue

If you just reply to the particular comment notification you received
by email, as you would reply to an email in a thread, that comment
will show up in the issue.

You need to have email notifications enabled for this to work,
naturally.

You can also add a new comment to any issue by copy-pasting the
issue-specific email address in the right sidebar (labeled "Issue
email", [introduced in GitLab 13.8](https://gitlab.com/gitlab-org/gitlab/-/issues/18816)).
This also works with shortcuts like `/estimate 1d` or `/spend
-1h`. Note: for those you won't get notification emails back, though,
while for others like `/assign @foo` you would.
See [the upstream docs for more details](https://docs.gitlab.com/ee/administration/reply_by_email.html).

### Quick status updates by email
There are a bunch of [quick actions](https://gitlab.torproject.org/help/user/project/quick_actions.md) available which are handy to
update an issue. As mentioned above they can be sent by email as well,
both within a comment (be it as a reply to a previous one or in a new
one) or just instead of it. So, for example, if you want to update the
amount of time spent on ticket $foo by one hour, find any notification
email for that issue and reply to it by replacing any quoted text with
`/spend 1h`.
## How to migrate a Git repository from legacy to GitLab?

Important: this policy is still being debated. It is not clear if any
or all repositories should be migrated to GitLab, see [issue 36][]
for the discussion on this topic.

As an example of a repository migration, I have moved the wiki from
gitolite to gitlab just now. I have followed the following procedure:

 1. create a project on gitlab (in `tpo/tpa/wiki-archive` in my case)

 2. push (manually) the latest git references present on `git-rw` to
    gitlab (`git push --mirror`...)

 3. if the repository is to be archived on GitLab, make it so in
    `Settings` -> `General` -> `Advanced` -> `Archive project`

 4. make an (executable) `pre-receive` hook in `git-rw` with an exit
    status of `1` warning about the new code location, example:

        $ cat /srv/git.torproject.org/repositories/project/help/wiki.git/hooks/pre-receive 
        #!/bin/sh

        cat <<EOF
        This repository has been migrated to GitLab:

        https://gitlab.torproject.org/tpo/tpa/services/-/wikis/home

        Update your remotes to:

            git@gitlab.torproject.org:tpo/tpa/services.wiki.git

        or:

            https://gitlab.torproject.org/tpo/tpa/services.wiki.git

        See this issue for details:

        https://gitlab.torproject.org/tpo/tpa/services/-/issues/34437
        EOF

        exit 1

    or in the case of a fully archived repository (non-writable):
    
        $ cat /srv/git.torproject.org/repositories/project/help/infra.git/hooks/pre-receive 
        #!/bin/sh

        cat <<EOF
        This repository has been migrated to GitLab:

        https://gitlab.torproject.org/tpo/tpa/wiki-infra-archive

        We have migrated away from ikiwiki so it is not necessary anymore.

        See this issue for details:

        https://gitlab.torproject.org/tpo/tpa/services/-/issues/34437
        EOF

        exit 1

 5. in Gitolite, make the project part of the "Attic", for example

        @@ -328,13 +328,13 @@ admin/trac/TracAccountManager "The Tor Project" = "Tor specific changes to Matth
         
         repo project/help/infra
             RW+                                      = @torproject-admin
        -    config gitweb.category                   = Infrastructure and Administration
        -project/help/infra "The Tor Project" = "help.torproject.org infrastructure"
        +    config gitweb.category                   = Attic
        +project/help/infra "The Tor Project" = "help.torproject.org infrastructure (archived to GitLab: https://gitlab.torproject.org/tpo/tpa/wiki-infra-archive')"
         
         repo project/help/wiki
             RW                                       = anarcat
        -    config gitweb.category                   = Infrastructure and Administration
        -project/help/wiki "The Tor Project" = "help.torproject.org content"
        +    config gitweb.category                   = Attic
        +project/help/wiki "The Tor Project" = "help.torproject.org content (archived to GitLab: https://gitlab.torproject.org/tpo/tpa/wiki-archive')"
         
         repo project/jenkins/jobs
             RW                                       = @jenkins-admins

The only downside with that approach is that a git clone will not warn
about the project redirection, but I am not sure there's a way to fix
that.

See [issue 36][] for further discussion.

## How to find the right emoji?

It's possible to add "reaction emojis" to comments and issues and
merge requests in GitLab. Just hit the little smiley face and a dialog
will pop up. You can then browse through the list and pick the right
emoji for how you feel about the comment, but remember to be nice!

It's possible you get lost in the list. You can type the name of the
emoji to restrict your search, but be warned that some emojis have
[particular, non-standard names](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/29057) that might not be immediately
obvious. For example, `🎉`, `U+1F389 PARTY POPPER`, is found as
`tada` in the list! See [this upstream issue for more details](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/29057).

anarcat's avatar
anarcat committed
## Hooking up a project with the bots

By default, new projects do *not* have notifications set up in
`#tor-bots` like all the others. To enable them, you need to configure a
"Webhook", in the `Settings -> Webhooks` section of the project. The
URL should be:

    https://kgb-bot.torproject.org/webhook/

... and you should select the notifications you wish to see in
`#tor-bots`. You can also enable notifications to other channels by
adding more parameters to the URL, like (say)
`?channel=tor-foo`. Important note: do not try to put the `#` in
the channel name, or if you do, URL-encode it (e.g. like `%23tor-foo`),
otherwise this will silently fail to change the target channel. Other
parameters are documented in the [KGB documentation](https://salsa.debian.org/kgb-team/kgb/-/wikis/usage).
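
As a minimal sketch of the channel parameter described above, this Python snippet builds such a webhook URL and percent-encodes the channel name, so that a leading `#` cannot silently break the query string. The `webhook_url` helper is hypothetical, and only the `channel` parameter mentioned above is assumed.

```python
from urllib.parse import urlencode

KGB_WEBHOOK = "https://kgb-bot.torproject.org/webhook/"


def webhook_url(channel=None, base=KGB_WEBHOOK):
    """Return the webhook URL, optionally targeting another IRC channel."""
    if channel is None:
        return base
    # urlencode() turns "#" into "%23"; a raw "#" would start a URL
    # fragment, so the channel name would never reach the server.
    return base + "?" + urlencode({"channel": channel})


print(webhook_url("tor-foo"))
print(webhook_url("#tor-foo"))
```

Either form of the channel name then yields a URL that can be pasted into the `Settings -> Webhooks` form.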

Note that GitLab admins might be able to configure [system-wide
hooks](https://gitlab.torproject.org/help/system_hooks/system_hooks) in [the admin section](https://gitlab.torproject.org/admin/hooks), although it's not entirely clear
how that relates to the per-project hooks, so those have not been
enabled. Furthermore, it is possible for GitLab admins with root
access to enable webhooks on *all* projects, with the [webhook rake
task](https://docs.gitlab.com/ee/raketasks/web_hooks.html#webhooks). For example, running this on the GitLab server (currently
`gitlab-02`) will enable the above hook on all repositories:

    sudo gitlab-rake gitlab:web_hook:add URL='https://kgb-bot.torproject.org/webhook/'

Note that by default, the rake task only enables `Push` events. You
need the following patch to enable others:

    modified   lib/tasks/gitlab/web_hook.rake
    @@ -10,7 +10,19 @@ namespace :gitlab do
           puts "Adding webhook '#{web_hook_url}' to:"
           projects.find_each(batch_size: 1000) do |project|
             print "- #{project.name} ... "
    -        web_hook = project.hooks.new(url: web_hook_url)
    +        web_hook = project.hooks.new(
    +          url: web_hook_url,
    +          push_events: true,
    +          issues_events: true,
    +          confidential_issues_events: false,
    +          merge_requests_events: true,
    +          tag_push_events: true,
    +          note_events: true,
    +          confidential_note_events: false,
    +          job_events: true,
    +          pipeline_events: true,
    +          wiki_page_events: true,
    +        )
             if web_hook.save
               puts "added".color(:green)
             else

See also the [upstream issue](https://gitlab.com/gitlab-org/gitlab/-/issues/17966) and [our GitLab issue 7](https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/7) for
details.
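
For a single project, the same event selection can probably be requested without patching the rake task, through the GitLab REST API. The sketch below only *builds* the request and sends nothing. It assumes the `POST /projects/:id/hooks` endpoint of the v4 API and a personal access token; the `build_hook_request` helper, the example project path and the token value are placeholders, not something this setup currently uses.

```python
import json
from urllib.parse import quote
from urllib.request import Request

# Same event selection as in the patch above.
HOOK_EVENTS = {
    "push_events": True,
    "issues_events": True,
    "confidential_issues_events": False,
    "merge_requests_events": True,
    "tag_push_events": True,
    "note_events": True,
    "confidential_note_events": False,
    "job_events": True,
    "pipeline_events": True,
    "wiki_page_events": True,
}


def build_hook_request(project_path, token, hook_url,
                       gitlab="https://gitlab.torproject.org"):
    """Build (but do not send) the API request adding a project hook."""
    # The API accepts a numeric ID or the URL-encoded project path.
    project = quote(project_path, safe="")
    payload = dict(HOOK_EVENTS, url=hook_url)
    return Request(
        f"{gitlab}/api/v4/projects/{project}/hooks",
        data=json.dumps(payload).encode(),
        headers={"PRIVATE-TOKEN": token, "Content-Type": "application/json"},
        method="POST",
    )


req = build_hook_request("tpo/tpa/gitlab", "REDACTED",
                         "https://kgb-bot.torproject.org/webhook/")
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would actually create the hook.
```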

You can also remove a given hook from all repos with:

    sudo gitlab-rake gitlab:web_hook:rm URL='https://kgb-bot.torproject.org/webhook/'

And, finally, list all hooks with:

    sudo gitlab-rake gitlab:web_hook:list

## Setting up two-factor authentication (2FA)

We strongly recommend you enable two-factor authentication on
GitLab. This is [well documented in the GitLab manual](https://gitlab.torproject.org/help/user/profile/account/two_factor_authentication.md#two-factor-authentication), but basically:

 1. first, pick a 2FA "app" (and optionally a hardware token) if you
    don't have one already

 2. head to your [account settings](https://gitlab.torproject.org/profile/account)

 3. register your 2FA app and save the recovery codes somewhere. If
    you need to enter a URL by hand, you can scan the QR code with your
    phone or create one by following this format:

        otpauth://totp/$ACCOUNT?secret=$KEY&issuer=gitlab.torproject.org

    where...

      * `$ACCOUNT` is the `Account` field in the 2FA form
      * `$KEY` is the `Key` field in the 2FA form, without spaces

 4. register the 2FA hardware token if available
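
The URL format from step 3 can be assembled as in the following sketch. The `build_otpauth_url` helper and the sample values are made up for illustration; the real `Account` and `Key` values come from the 2FA form.

```python
from urllib.parse import quote, urlencode


def build_otpauth_url(account, key, issuer="gitlab.torproject.org"):
    """Assemble an otpauth:// URL from the fields shown in the 2FA form."""
    # The form displays the key in groups; the URL needs it without spaces.
    secret = "".join(key.split())
    query = urlencode({"secret": secret, "issuer": issuer})
    # Percent-encode the account label so ":" or "@" cannot break the URL.
    return f"otpauth://totp/{quote(account)}?{query}"


print(build_otpauth_url("alice", "abcd efgh ijkl"))
```

The resulting string can be pasted into a TOTP app that accepts URLs, or turned into a QR code.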

GitLab requires a 2FA "app" even if you intend to use a hardware
token. The 2FA "app" must implement the TOTP protocol, for example the
[Google Authenticator](https://play.google.com/store/apps/details?id=com.google.android.apps.authenticator2) or a free alternative (for example [free OTP
plus](https://github.com/helloworld1/FreeOTPPlus/), see also this [list from the Nextcloud project](https://github.com/nextcloud/twofactor_totp#readme)). The
hardware token must implement the U2F protocol, which is supported by
security tokens like the [YubiKey](https://en.wikipedia.org/wiki/YubiKey), [Nitrokey](https://www.nitrokey.com/), or similar.
## Deleting sensitive attachments

If a user uploaded a secret attachment by mistake, just deleting the
issue is not sufficient: it turns out that doesn't remove the
attachments from disk!

To fix this, ask a sysadmin to find the file in the
`/var/opt/gitlab/gitlab-rails/uploads/` directory. Assuming the
attachment URL is:

<https://gitlab.torproject.org/anarcat/test/uploads/7dca7746b5576f6c6ec34bb62200ba3a/openvpn_5.png>

There should be a "hashed" directory and a hashed filename in there,
which looks something like:

    ./@hashed/08/5b/085b2a38876eeddc33e3fbf612912d3d52a45c37cee95cf42cd3099d0a3fd8cb/7dca7746b5576f6c6ec34bb62200ba3a/openvpn_5.png

The second directory (`7dca7746b5576f6c6ec34bb62200ba3a` above) is the
one visible in the attachment URL. The last part is the actual
attachment filename, but since those can overlap between issues, it's
safer to look for the hash. So to find the above attachment, you
should use:

    find /var/opt/gitlab/gitlab-rails/uploads/ -name 7dca7746b5576f6c6ec34bb62200ba3a

And delete the directory it finds, which holds the attachment. The
following should do the trick:

    find /var/opt/gitlab/gitlab-rails/uploads/ -name 7dca7746b5576f6c6ec34bb62200ba3a | sed 's/^/rm -r /' > delete.sh

Verify `delete.sh` and run it if happy.
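
The hash used in the `find` commands above can be extracted from the attachment URL rather than copied by hand. This is a small sketch, assuming the hash is always the 32 hexadecimal characters following `/uploads/`, as in the example URL; `upload_hash` is a hypothetical helper.

```python
import re
from urllib.parse import urlparse


def upload_hash(attachment_url):
    """Return the upload directory name embedded in an attachment URL."""
    path = urlparse(attachment_url).path
    match = re.search(r"/uploads/([0-9a-f]{32})/", path)
    if match is None:
        raise ValueError(f"no upload hash found in {attachment_url!r}")
    return match.group(1)


url = ("https://gitlab.torproject.org/anarcat/test/uploads/"
       "7dca7746b5576f6c6ec34bb62200ba3a/openvpn_5.png")
print(upload_hash(url))
```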

Note that GitLab is working on an [attachment manager](https://gitlab.com/gitlab-org/gitlab/-/issues/16229) that should
allow web operators to delete old files, but it's unclear how or when
this will be implemented, if ever.

## Publishing GitLab pages

GitLab features a way to publish websites directly from the continuous
integration pipelines, called [GitLab pages](https://docs.gitlab.com/ee/user/project/pages/). Complete
documentation on how to publish such pages is better served by the
official documentation, but creating a `.gitlab-ci.yml` should get you
rolling. For example, this will publish a `hugo` site:

    image: registry.gitlab.com/pages/hugo/hugo_extended:0.65.3
    pages:
      script:
        - hugo
      artifacts:
        paths:
          - public
      only:
        - main

GitLab pages are published under the `*.pages.torproject.net` wildcard
domain. There are two types of projects hosted at the TPO GitLab:
sub-group projects, usually under the `tpo/` super-group, and user
projects, for example `anarcat/myproject`. You can also publish a page
specifically for a user. The URLs will look something like this:

| Type of GitLab page | Name of the project created in GitLab | Website URL                                          |
|---------------------|---------------------------------------|------------------------------------------------------|
| User pages          | `username.pages.torproject.net`       | `https://username.pages.torproject.net`              |
| User projects       | `user/projectname`                    | `https://username.pages.torproject.net/projectname`  |
| Group projects      | `tpo/group/projectname`               | `https://tpo.pages.torproject.net/group/projectname` |
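
The mapping in the table can be sketched as a small function. It assumes that a user page lives in a project named `username.pages.torproject.net` inside the user's own namespace, which the table suggests but does not state outright; `pages_url` is a hypothetical helper.

```python
PAGES_DOMAIN = "pages.torproject.net"


def pages_url(project_path):
    """Map a GitLab project path to the URL its pages are served from."""
    namespace, _, rest = project_path.partition("/")
    if not namespace or not rest:
        raise ValueError(f"expected 'namespace/project', got {project_path!r}")
    host = f"{namespace}.{PAGES_DOMAIN}"
    # A project named after the host itself is served at the site root.
    if rest == host:
        return f"https://{host}"
    return f"https://{host}/{rest}"


print(pages_url("tpo/group/projectname"))
print(pages_url("anarcat/myproject"))
```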

## Accepting merge requests on wikis

Wiki permissions are not great, but there's a workaround: accept merge
requests for a git replica of the wiki.

This documentation was [moved to the documentation section](service/documentation#accepting-merge-requests-on-wikis).
## Renaming a branch globally

While `git` supports renaming branches locally with the `git branch
--move $to_name` command, this doesn't actually rename the remote
branch. That process is more involved.

Changing the name of a default branch both locally and on remotes can
be partially automated with the use of [anarcat's branch rename
script](https://gitlab.com/anarcat/scripts/-/blob/main/git-branch-rename-remote). The script basically renames the branch locally, pushes
the new branch and deletes the old one, with special handling of
GitLab remotes, where it "un-protects" and "re-protects" the branch.
You should run the script with an account that has "Maintainer" or
"Owner" access to GitLab, so that it can do the above GitLab API
changes. You will then need to provide an [access token](https://gitlab.torproject.org/-/profile/personal_access_tokens) through
the `GITLAB_PRIVATE_TOKEN` environment variable.

So, for example, this will rename the `master` branch to `main` on the
local and remote repos:

    GITLAB_PRIVATE_TOKEN=REDACTED git-branch-rename-remote

If you want to rename another branch or remote, you can specify those
on the commandline as well. For example, this will rename the
`develop` branch to `dev` on the `gitlab` remote:

    GITLAB_PRIVATE_TOKEN=REDACTED git-branch-rename-remote --remote gitlab --from-branch develop --to-branch dev
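
To make the sequence concrete, here is a rough sketch of the plain `git` steps described above (rename locally, push the new branch, delete the old one). It is not the script's actual implementation: it leaves out the GitLab API calls that un-protect and re-protect the branch, and the `rename_commands` helper and its `origin` default are assumptions.

```python
def rename_commands(from_branch="master", to_branch="main", remote="origin"):
    """Return the git invocations for a local and remote branch rename."""
    return [
        # Rename the local branch.
        ["git", "branch", "--move", from_branch, to_branch],
        # Publish the new name and track it.
        ["git", "push", "--set-upstream", remote, to_branch],
        # Remove the old name from the remote.
        ["git", "push", remote, "--delete", from_branch],
    ]


for argv in rename_commands():
    print(" ".join(argv))
```

On a GitLab remote, the last step is typically refused while the old branch is still protected, which is the part the script handles through the API.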

The command can also be used to fix *other* repositories so that they
correctly rename their local branch too. In that case, the GitLab
repository is already up to date, so there is no need for an access
token.

Other users can then just run this command, which will rename `master` to
`main` on the local repo, including remote tracking branches:

    git-branch-rename-remote

Keep in mind that there may be a few extra steps and considerations to
make when changing the name of a heavily used branch, detailed below.

A merge request that is open against the modified branch may be
bricked as a result of deleting the old branch name from the GitLab
remote. To avoid this, after creating and pushing the new branch name,
edit each merge request to target the new branch name **before**
deleting the old branch.

Many GitLab repositories are mirrored or maintained manually on
gitolite (`git-rw.torproject.org`) and [Gitweb](https://gitweb.torproject.org). The `ssh` step for
the above automation script will fail for gitolite and these steps
need to be done manually by a sysadmin. [Open a TPA ticket](https://gitlab.torproject.org/tpo/tpa/team/-/issues/new) with a
list of the gitolite repositories you would like to update and a
sysadmin will perform the following magic:

    cd /srv/git.torproject.org/repositories/
    for repo in $list; do
        git -C "$repo" symbolic-ref HEAD "refs/heads/$to_branch"
    done

This will update Gitolite, but it won't update Gitweb until the
repositories have been pushed to. To update Gitweb immediately, ask
your friendly sysadmin to run the above command on the Gitweb server
as well.

If your repository relies on Transifex for translations, make sure to
update the Transifex config to pull from the new branch. To do so,
[open a l10n ticket](https://gitlab.torproject.org/tpo/community/l10n/-/issues/new?issue%5Bassignee_id%5D=&issue%5Bmilestone_id%5D=) with the new branch name changes.
## Connect to the PostgreSQL server

The GitLab Omnibus setup is special: it ships its own embedded
PostgreSQL server (!), which means the regular `sudo -u postgres psql`
command doesn't work.

To get access to the PostgreSQL server, you need to [follow the
upstream instructions](https://docs.gitlab.com/omnibus/maintenance/#starting-a-postgresql-superuser-psql-session) which are, at the time of writing:

    sudo gitlab-psql -d gitlabhq_production

This actually expands to the following command:

    sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql -p 5432 -h /var/opt/gitlab/postgresql -d gitlabhq_production -d gitlabhq_production

An emergency dump, therefore, could be taken with:

    cd /tmp ; sudo -u gitlab-psql /opt/gitlab/embedded/bin/pg_dump -p 5432 -h /var/opt/gitlab/postgresql -d gitlabhq_production -d gitlabhq_production | pv -s 2G > /srv/gitlab-backup/db/db-$(date '+%Y-%m-%d').sql

Yes, that is silly. See also [issue 20][].

## Pager playbook

<!-- information about common errors from the monitoring system and -->
<!-- how to deal with them. this should be easy to follow: think of -->
<!-- your future self, in a stressful situation, tired and hungry. -->

 * Grafana Dashboards:
   * [GitLab overview](https://grafana.torproject.org/d/QrDJktiMz/gitlab-omnibus)
   * [Gitaly](https://grafana.torproject.org/d/x6Z50y-iz/gitlab-gitaly)

TODO: document how to handle common problems in GitLab

### Troubleshooting

Upstream recommends running this command to self-test a GitLab
instance:

    sudo gitlab-rake gitlab:check SANITIZE=true

This command also shows general info about the GitLab instance:

    sudo gitlab-rake gitlab:env:info

It is especially useful to find on-disk files and package versions.

### GitLab pages not found

If you're looking for a way to track GitLab pages errors, know that the
webserver logs are in `/var/log/nginx/gitlab_pages_access`, but that
only proxies requests for the GitLab Pages engine, whose (JSON!) logs
live in `/var/log/gitlab/gitlab-pages/current`.

If you get a `"error":"domain does not exist"` problem, make sure the
entire *pipeline* actually succeeds. Typically, the "pages:deploy" job
can fail with:

    Artifacts for pages are too large

In that case, you need to go into the Admin Area -> Settings ->
Preferences -> Pages and bump the size limit. It defaults to 100MB and
we bumped it to 1024MB at the time of writing. Note that GitLab CI/CD
also has a similar setting which might (or might not?) affect such
problems.
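
Since the GitLab Pages log is JSON, errors such as the one above can be filtered programmatically. The following sketch assumes one JSON object per line with an `error` field, as in the message quoted above, and tolerates a prefix before the object in case the log collector prepends one. The `pages_errors` helper and the `host` field in the sample are made up for illustration.

```python
import json


def pages_errors(lines, needle="domain does not exist"):
    """Return the parsed log entries whose "error" field contains needle."""
    found = []
    for line in lines:
        start = line.find("{")
        if start < 0:
            continue  # no JSON object on this line
        try:
            entry = json.loads(line[start:])
        except json.JSONDecodeError:
            continue  # truncated or non-JSON line
        if isinstance(entry, dict) and needle in str(entry.get("error", "")):
            found.append(entry)
    return found


sample = [
    '{"error":"domain does not exist","host":"example.pages.torproject.net"}',
    "not a JSON line",
    '{"msg":"request served"}',
]
print(pages_errors(sample))
```

On the server, the same function can be fed the open log file, for example `pages_errors(open("/var/log/gitlab/gitlab-pages/current"))`.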

### PostgreSQL debugging

The PostgreSQL configuration in GitLab is [particular][issue 20]. See the
[connect to the PostgreSQL server](#connect-to-the-postgresql-server) section above on how to connect
to it.

In case the entire GitLab machine is destroyed, a new server should be
provisioned in the [howto/ganeti](howto/ganeti) cluster (or elsewhere) and backups
should be restored using the below procedure.

### Running an emergency backup

A full backup can be run as root with:

    /usr/bin/gitlab-rake gitlab:backup:create

Backups are stored as a tar file in `/srv/gitlab-backup` and do *not*
include secrets, which are backed up separately, for example with:

    umask 0077 && tar -C /var/opt/gitlab -czf /srv/gitlab-backup/config_backup$(date +"\%Y\%m\%dT\%H\%M").tar.gz

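The `\%` escapes in that command are crontab syntax; drop the backslashes
when running it from an interactive shell. See `/etc/cron.d/gitlab-config-backup`, and the `gitlab::backup` and
`profile::gitlab::app` classes for the actual jobs that run nightly.

The following restore procedure is untested and was extracted from the [upstream docs](https://docs.gitlab.com/ee/raketasks/backup_restore.html#restore-for-omnibus-gitlab-installations):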

 1. reinstall the same version you are restoring from

 2. restore the secrets backup:
 
        tar -C /opt/gitlab/etc/ -x -v -z -f config_backup20200627T0200.tar.gz

 3. configure and start gitlab:
 
        sudo gitlab-ctl reconfigure
        sudo gitlab-ctl start

 4. restore the file:

        sudo cp 11493107454_2018_04_25_10.6.4-ce_gitlab_backup.tar /var/opt/gitlab/backups
        sudo chown git.git /var/opt/gitlab/backups/11493107454_2018_04_25_10.6.4-ce_gitlab_backup.tar
        sudo gitlab-ctl stop unicorn
        sudo gitlab-ctl stop puma
        sudo gitlab-ctl stop sidekiq
        # Verify
        sudo gitlab-ctl status
        sudo gitlab-backup restore BACKUP=11493107454_2018_04_25_10.6.4-ce
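
In the commands above, the `BACKUP=` value is the backup file name without its `_gitlab_backup.tar` suffix. A small sketch of that derivation, with `backup_id` as a hypothetical helper:

```python
SUFFIX = "_gitlab_backup.tar"


def backup_id(filename):
    """Derive the BACKUP= value for gitlab-backup from a backup file name."""
    name = filename.rsplit("/", 1)[-1]  # drop any directory part
    if not name.endswith(SUFFIX):
        raise ValueError(f"{filename!r} does not end with {SUFFIX}")
    return name[: -len(SUFFIX)]


print(backup_id("/var/opt/gitlab/backups/11493107454_2018_04_25_10.6.4-ce_gitlab_backup.tar"))
```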
### Main GitLab installation

The current GitLab server was set up in the [howto/ganeti](howto/ganeti) cluster in a
regular virtual machine. It was configured with [howto/puppet](howto/puppet) with the
`roles::gitlab` role. That, in turn, relies on a series of `profile`
elements which configure:

 * `profile::gitlab::web`: nginx vhost and TLS cert, depends on
   `profile::nginx` built for the [howto/cache](howto/cache) service and relying on the
   [puppet/nginx](https://forge.puppet.com/puppet/nginx) module from the Forge
 * `profile::gitlab::mail`: dovecot and postfix configuration, for
   email replies
 * `profile::gitlab::database`: postgresql configuration, possibly not
   used by the Omnibus package, see [issue 20][]
 * `profile::gitlab::app`: the core of the configuration of gitlab
   itself, uses the [puppet/gitlab](https://forge.puppet.com/puppet/gitlab) module from the Forge, with
   Prometheus, Grafana, and Nginx support disabled, but Redis,
   PostgreSQL, and Prometheus exporters enabled

[issue 20]: https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/20

This installs the [GitLab Omnibus](https://docs.gitlab.com/omnibus/) distribution which duplicates a
lot of resources we would otherwise manage elsewhere in Puppet,
including (but possibly not limited to):

 * [howto/prometheus](howto/prometheus) exporters (see [issue 40077](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40077) for example)
 * [howto/postgresql](howto/postgresql)

This therefore leads to a "particular" situation regarding monitoring
and PostgreSQL backups. See [issue 20](https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/20) for details.

Note that the first gitlab server (gitlab-01) was set up using the
Ansible recipes used by the Debian.org project. That install was not
working so well (e.g. [503 errors on merge requests](https://gitlab.torproject.org/tpo/tpa/team/-/issues/32197)) so we
[migrated to the omnibus package](https://gitlab.torproject.org/tpo/tpa/team/-/issues/32949) in March 2020, which seems to
work better.

### GitLab CI installation

See [the CI documentation](service/ci) for documentation specific to GitLab CI.

### GitLab pages installation

To set up GitLab pages, we [followed the GitLab Pages administration
manual](https://docs.gitlab.com/ee/administration/pages). The steps taken were as follows:

 1. add `pages.torproject.net` to the [public suffix list](https://publicsuffix.org/) ([issue
    40121](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40121) and [upstream PR](https://github.com/publicsuffix/list/pull/1196)) (although that takes months or
    *years* to propagate everywhere)
 2. add `*.pages.torproject.net` and `pages.torproject.net` to DNS
    (`dns/domains.git` repository), as A records so that LE DNS-01
    challenges still work, along with a CAA record to allow the
    wildcard on `pages.torproject.net`
 3. get the wildcard cert from Let's Encrypt (in
    `letsencrypt-domains.git`)
 4. deploy the TLS certificate, some GitLab config and a nginx vhost to gitlab-02
    with Puppet
 5. run the [status-site pipeline](https://gitlab.torproject.org/tpo/tpa/status-site/-/pipelines) to regenerate the pages

The GitLab pages configuration lives in the `profile::gitlab::app`
Puppet class. The following GitLab settings were added:

    gitlab_pages             => {
      ssl_certificate     => '/etc/ssl/torproject/certs/pages.torproject.net.crt-chained',
      ssl_certificate_key => '/etc/ssl/private/pages.torproject.net.key',
    },
    pages_external_url       => 'https://pages.torproject.net',

The virtual host for the `pages.torproject.net` domain was configured
through the `profile::gitlab::web` class.

<!-- this describes an acceptable level of service for this service -->

## Migration from Trac
GitLab was put online as part of a migration from [Trac](howto/trac),
see the [Trac documentation for details on the migration](howto/trac#gitlab-migration).

<!-- how this is built -->
<!-- should reuse and expand on the "proposed solution", it's a -->
<!-- "as-built" documented, whereas the "Proposed solution" is an -->
<!-- "architectural" document, which the final result might differ -->
<!-- from, sometimes significantly -->

<!-- a good guide to "audit" an existing project's design: -->
<!-- https://bluesock.org/~willkg/blog/dev/auditing_projects.html -->

GitLab is a fairly large program with multiple components. The
[upstream documentation](https://docs.gitlab.com/ee/development/architecture.html) describes the architecture in detail, but
this section aims at providing a shorter summary. Here's an overview
diagram, first:

![GitLab's architecture diagram](https://docs.gitlab.com/ee/development/img/architecture_simplified.png)

The web frontend is Nginx (which we incidentally also use in our
[howto/cache](howto/cache) system) but GitLab wrote their own reverse proxy called
[GitLab Workhorse](https://gitlab.com/gitlab-org/gitlab-workhorse/) which in turn talks to the underlying GitLab
Rails application, served by the [Unicorn](https://yhbt.net/unicorn/) application
server. The Rails app stores its data in a [howto/postgresql](howto/postgresql) database
(although not our own deployment for now, which [should be fixed](https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/20)). GitLab also offloads
long-running background tasks to a tool called [sidekiq](https://github.com/mperham/sidekiq).

Those all serve HTTP(S) requests but GitLab is of course also
accessible over SSH to push/pull git repositories. This is handled by
a separate component called [gitlab-shell](https://gitlab.com/gitlab-org/gitlab-shell) which acts as a shell
for the `git` user.

Workhorse, Rails, sidekiq and gitlab-shell all talk with Redis to
store temporary information, caches and session information. They can
also communicate with the [Gitaly](https://gitlab.com/gitlab-org/gitaly) server which handles all
communication with the git repositories themselves.
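
As a rough illustration, the relationships described above can be written
down as a small graph. This is only a sketch of what this section states, not
a complete map of GitLab internals. The `TALKS_TO` and `ENTRY_POINTS` tables
and the `reachable` helper are hypothetical names:

```python
# Each component maps to the components it talks to, as described above.
TALKS_TO = {
    "nginx": ["workhorse"],
    "workhorse": ["rails", "redis", "gitaly"],
    "rails": ["postgresql", "redis", "gitaly"],
    "sidekiq": ["redis", "gitaly"],
    "gitlab-shell": ["redis", "gitaly"],
    "redis": [],
    "gitaly": [],
    "postgresql": [],
}

# First component to receive a request, per protocol.
ENTRY_POINTS = {"https": "nginx", "ssh": "gitlab-shell"}


def reachable(protocol):
    """Return the sorted list of components a request may involve."""
    seen = set()
    stack = [ENTRY_POINTS[protocol]]
    while stack:
        component = stack.pop()
        if component not in seen:
            seen.add(component)
            stack.extend(TALKS_TO[component])
    return sorted(seen)
```

For example, `reachable("ssh")` never goes through Nginx or Workhorse, which
matches the description of gitlab-shell as a separate entry point.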

### Continuous integration

GitLab also features Continuous Integration (CI). CI is handled by
[GitLab runners](https://gitlab.com/gitlab-org/gitlab-runner/) which can be deployed by anyone and registered in
the Rails app to pull CI jobs. This is documented in the [service/ci
page](service/ci).
### Spam control

TODO: document lobby.

Discuss alternatives, e.g. [this hackernews discussion about mediawiki
moving to gitlab](https://news.ycombinator.com/item?id=24919569). Their [gitlab migration](https://www.mediawiki.org/wiki/GitLab_consultation) documentation might
give us hints on how to improve the spam situation on our end.

A few ideas on tools:

 * [Tornevall blocklist](https://www.tornevall.net/about/)
 * [Mediawiki spam control tricks](https://m.mediawiki.org/wiki/Manual:Combating_spam)
 * [Friendly CAPTCHA](https://friendlycaptcha.com/), [considered for inclusion in GitLab](https://gitlab.com/gitlab-org/gitlab/-/issues/273480)

### Scalability

We have not looked much into GitLab scalability. Upstream has
[reference architectures](https://docs.gitlab.com/ee/administration/reference_architectures/) which explain how to scale for various
user counts, but so far we have just thrown hardware at GitLab when
performance issues come up.

### GitLab pages

[GitLab pages](https://gitlab.com/gitlab-org/gitlab-pages) is "a simple HTTP server written in Go, made to
serve GitLab Pages with CNAMEs and SNI using HTTP/HTTP2". In practice,
the way this works is that artifacts from GitLab CI jobs get sent back
to the central server.

GitLab pages is designed to scale horizontally: multiple pages servers
can be deployed and fetch their content and configuration through NFS.
Upstream is [rearchitecting this with Object storage](https://docs.gitlab.com/ee/architecture/blueprints/cloud_native_gitlab_pages/) (i.e. S3
through minio by default, or external existing providers) which might
simplify running this, but it actually adds complexity to a
previously fairly simple design. Note that they have tried using
CephFS instead of NFS but that did not work for some reason.

The [new pages architecture](https://docs.gitlab.com/ee/architecture/blueprints/cloud_native_gitlab_pages/) also relies on the GitLab rails API
for configuration (it was a set of JSON files before), which makes it
dependent on the Rails API for availability, although [that part of
the design](https://gitlab.com/groups/gitlab-org/-/epics/4242) has [exponential back-off time](https://gitlab.com/groups/gitlab-org/-/epics/4242) for unavailability
of the rails API, so maybe it would survive a downtime of the rails
API.
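
To make the back-off idea concrete, here is a generic sketch of exponential
back-off. It is not the GitLab pages implementation. The `backoff_delays` and
`fetch_with_backoff` names, the default delays and the use of
`ConnectionError` are all assumptions made for illustration:

```python
import time


def backoff_delays(base=1.0, factor=2.0, cap=60.0):
    """Yield delays that grow by `factor` each time, capped at `cap`."""
    delay = base
    while True:
        yield min(delay, cap)
        delay *= factor


def fetch_with_backoff(fetch, attempts=5, sleep=time.sleep):
    """Call `fetch()` until it succeeds or `attempts` calls have failed.

    `fetch` stands in for a call to the API and is expected to raise
    ConnectionError while the API is unavailable.
    """
    delays = backoff_delays()
    for attempt in range(attempts):
        try:
            return fetch()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            sleep(next(delays))
```

A client written this way rides out a short outage, since each retry waits
longer than the previous one, but it still fails once the attempts are
exhausted.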

GitLab pages is not currently in use in our setup, but could be used
as an alternative to the [static mirroring system](howto/static-component). See the
[discussion there](howto/static-component#alternatives-considered) for more information about how that compares
with the static mirror system.

Update: [some tests of GitLab pages](https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/91) were performed in January
2021, with moderate success. There are still concerns about the
reliability and scalability of the service, but the service could be
used for small sites at this stage. See the [GitLab pages installation
instructions](#gitlab-pages-installation) for details on how this was set up.

Note that the pages are actually on disk, in
`/var/opt/gitlab/gitlab-rails/shared/pages/GROUP/.../PROJECT`, for
example the status site pipeline publishes to:

    /var/opt/gitlab/gitlab-rails/shared/pages/tpo/tpa/status-site/

Maybe this could be abused to act as a static source in the static
mirror system?
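
As a small sketch of the layout above, the following maps a project path to
its on-disk directory and to the URL it would presumably be served from. The
directory root comes from the path shown above. The URL scheme (top-level
group as a subdomain, the rest of the path below it) is an assumption based on
GitLab's default Pages naming, and the helper names are hypothetical:

```python
from pathlib import PurePosixPath

# Directory root taken from the example path above.
PAGES_ROOT = PurePosixPath("/var/opt/gitlab/gitlab-rails/shared/pages")
# Pages domain configured in `pages_external_url`.
PAGES_DOMAIN = "pages.torproject.net"


def pages_directory(project_path):
    """Return the directory a project's pages are published to."""
    return PAGES_ROOT / project_path.strip("/")


def pages_url(project_path):
    """Return the assumed public URL for a project's pages.

    Assumption: the top-level group becomes a subdomain of the pages
    domain and the remaining path components become the URL path.
    """
    namespace, _, rest = project_path.strip("/").partition("/")
    return f"https://{namespace}.{PAGES_DOMAIN}/{rest}/"
```

For instance, `pages_directory("tpo/tpa/status-site")` yields the status site
directory shown above.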

[File][] or [search][] for issues in the [gitlab project][search].

 [File]: https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/new
 [search]: https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues

Known issues:

 * [Issues warn about LFS](https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/59)
 * [Confidential issues leak cleartext by email](https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/23) (see the [Note
   about confidential issues](#note-about-confidential-issues) above)
 * [Wikis are not publicly editable](https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/76)
 * [Cannot move issues to projects I do not maintain](https://gitlab.com/gitlab-org/gitlab/-/issues/331610)

Monitoring right now is minimal: normal host-level metrics like disk
space, CPU usage, web port and TLS certificates are monitored by
Nagios with our normal infrastructure, as a black box.

Prometheus monitoring is built into the GitLab Omnibus package, so it
is *not* configured through our Puppet like other Prometheus
servers. It has still been (manually) integrated in our Prometheus
setup and Grafana dashboards (see [pager playbook](#pager-playbook)) have been deployed.
More work is underway to improve monitoring in [this issue](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40077) (not
hardcoding exporters). We could also use the following tools:

 * [moosh3/gitlab-alerts](https://github.com/moosh3/gitlab-alerts): autogenerate issues from Prometheus
   Alert Manager (with the webhook)
 * [FUSAKLA/prometheus-gitlab-notifier](https://github.com/FUSAKLA/prometheus-gitlab-notifier): similar
 * 11.5 shipped [a bunch of alerts](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/45740) which we might want to use
   directly
 * the "Incident management" support has various [integrations](https://docs.gitlab.com/ee/operations/incident_management/integrations.html)
   including Prometheus (starting from 13.1) and Pagerduty (which is
   supported by Prometheus)
## Logs and metrics

<!-- TODO: where are the logs? how long are they kept? any PII? -->
<!-- what about performance metrics? same questions -->

There is a backup job (in the `git` user crontab) that makes sure the
content of `/var/opt/gitlab/gitlab-rails/etc/` is backed up. We use
this instead of the backup system provided by the GitLab Puppet
module, because that directory is not covered by the `gitlab-backup`
command. This is implemented with `tpo-gitlab-backup`, a simple
wrapper script which calls `gitlab-backup` and performs the
configuration backup and rotation.
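
The actual `tpo-gitlab-backup` script is not reproduced here. As a rough
sketch of what such a wrapper has to do, the following builds the commands to
run and decides which old configuration archives to rotate out. The
`BACKUP_DIR` location, the archive naming scheme, the `keep` count and both
function names are hypothetical:

```python
from pathlib import PurePosixPath

# Configuration directory not covered by `gitlab-backup` (see above).
CONFIG_DIR = "/var/opt/gitlab/gitlab-rails/etc/"
# Hypothetical destination for the configuration archives.
BACKUP_DIR = PurePosixPath("/srv/gitlab-backup")


def backup_commands(stamp):
    """Return the commands a wrapper would run, as argument lists.

    `stamp` is a sortable date string such as "2021-01-31".
    """
    archive = BACKUP_DIR / f"config-{stamp}.tar.gz"
    return [
        ["gitlab-backup", "create"],
        ["tar", "-czf", str(archive), CONFIG_DIR],
    ]


def expired(archives, keep=2):
    """Return the archive names to delete, keeping the `keep` newest.

    Relies on the date in the file name sorting chronologically.
    """
    ordered = sorted(archives)
    return ordered[:-keep] if keep > 0 else ordered
```

A real wrapper would pass each command to `subprocess.run(..., check=True)`
and remove the files returned by `expired()`. This sketch only computes them,
so it can be inspected without touching a live server.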

It is assumed that the existing [howto/backup](howto/backup) system
will pick up those copies and store them for our normal rotation
periods.

Ideally, this rather exotic backup system would be harmonized with our
existing backup system, but this would require (for example) using our
existing PostgreSQL infrastructure ([issue 20](https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/20)).
## Other documentation

 * GitLab has a [built-in help system](https://gitlab.torproject.org/help)
 * [Support forum](https://forum.gitlab.com/)

## Meetings

Some meetings about tools discussed GitLab explicitly. Here are the minutes:

- [2020 September 15th](http://meetbot.debian.net/tor-meeting/2020/tor-meeting.2020-09-15-15.00.html)
- [2020 July 7th](http://meetbot.debian.net/tor-meeting/2020/tor-meeting.2020-07-07-15.08.log.html)

## Overview

<!-- describe the overall project. should include a link to a ticket -->
<!-- that has a launch checklist -->

The GitLab project at Tor has been a long time coming. If you look at
the [history](#history) section above, you'll see it has been worked on since
at least 2016, at which point an external server was set up for the
"network team" to do code review. This server was ultimately retired.

The current server has been worked on since 2019, with the master
ticket, [issue 29400](https://gitlab.torproject.org/tpo/tpa/services/-/issues/29400), created in the footsteps of the [2019
Brussels meeting](https://trac.torproject.org/projects/tor/wiki/org/meetings/2019BrusselsAdminTeamMinutes). The service launched some time in June 2020,
with a full migration of Trac tickets.

 * replacement of the Trac issue tracking server
 * rough equivalent of Trac features in GitLab

 * identical representation of Trac issues in GitLab, including proper
   issue numbering

 * replacement of Gitolite (git hosting)
 * replacement of Gitweb (git hosting)
 * replacement of Jenkins (CI)
 * replacement of the static site hosting system

Those are not part of the first phase of the project, but it is
understood that if one of those features gets used more heavily in
GitLab, the original service MUST be eventually migrated into GitLab
and turned off. We do *not* want to run multiple similar services at
the same time (for example run both gitolite and gitaly on all git
repositories, or run Jenkins and GitLab runners).


The GitLab migration was approved at the 2019 Brussels dev meeting.
The solution to the "code review" and "project management" problems
is to deploy a GitLab instance which does *not* aim at managing all
source code, in the first stage.

Staff not evaluated.

In terms of hardware, we start with a single virtual machine and agree
that, in the worst case, we can throw a full Hetzner PX62-NVMe node at
the problem (~70EUR/mth).

GitLab is such a broad project that multiple alternatives exist for
different components:

 * GitHub
   * Pros:
     * widely used in the open source community
     * Good integration between ticketing system and code
   * Cons:
     * It is hosted by a third party (Microsoft!)
     * Closed source
 * GitLab
   * Pros:
     * Mostly free software
     * Feature-rich
   * Cons:
     * Complex software, high maintenance
     * "Opencore" - some interesting features are closed-source

### GitLab command line clients

If you want to do batch operations or integrations with GitLab, you
might want to use one of these tools, depending on your environment or
preferred programming language:

 * [bugwarrior](https://github.com/ralphbean/bugwarrior) ([Debian](https://tracker.debian.org/pkg/bugwarrior)) - support for GitLab, GitHub and
   other bugtrackers for the [taskwarrior](http://taskwarrior.org/) database
 * [git-lab](https://invent.kde.org/sdk/git-lab) - python commandline client, lists, pulls MR; creates
   snippets
 * [GitLab-API-v4](https://metacpan.org/release/GitLab-API-v4) ([Debian](https://tracker.debian.org/pkg/libgitlab-api-v4-perl)) - perl library and [commandline
   client](https://manpages.debian.org/buster/libgitlab-api-v4-perl/gitlab-api-v4.1p.en.html)
 * [GitLabracadabra](https://gitlab.com/gitlabracadabra/gitlabracadabra) ([Debian](https://tracker.debian.org/pkg/gitlabracadabra)) - *configure a GitLab instance
   from a YAML configuration, using the API*: project settings like
   labels, admins, etc
 * [glab](https://github.com/profclems/glab) ([not in Debian](https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=969482)) - inspired by GitHub's official `gh`
   client
 * [python-gitlab](https://github.com/python-gitlab/python-gitlab) (also known as `gitlab-cli` in [Debian](https://tracker.debian.org/pkg/python-gitlab))
 * [ruby-gitlab](https://github.com/narkoz/gitlab) ([Debian](https://tracker.debian.org/pkg/ruby-gitlab)), also includes a [commandline
   client](https://narkoz.github.io/gitlab/)
 * [salsa](https://manpages.debian.org/buster/devscripts/salsa.1.en.html) (in [Debian devscripts](https://tracker.debian.org/pkg/devscripts)) is specifically built for
   salsa but might be coerced into talking to other GitLab servers

GitLab upstream also has a [list of third-party commandline tools](https://about.gitlab.com/partners/#cli-clients)
that is worth a look.
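
For simple batch operations, the GitLab REST API (v4) can also be called
directly, without any of the clients above. Here is a minimal sketch using
only the Python standard library. The `issues_url` and `list_issues` helpers
are hypothetical names, and the project path used below is just an example:

```python
import json
import urllib.parse
import urllib.request

GITLAB = "https://gitlab.torproject.org"


def issues_url(project, state="opened", per_page=20):
    """Build the API URL listing the issues of a project.

    The project path is URL-encoded, slashes included, so it can
    stand in for the numeric project ID.
    """
    project_id = urllib.parse.quote(project, safe="")
    query = urllib.parse.urlencode({"state": state, "per_page": per_page})
    return f"{GITLAB}/api/v4/projects/{project_id}/issues?{query}"


def list_issues(project, token=None):
    """Fetch one page of issues, optionally authenticating with a token."""
    request = urllib.request.Request(issues_url(project))
    if token:
        request.add_header("PRIVATE-TOKEN", token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `list_issues("tpo/tpa/team")` returns a list of dictionaries, one per
issue. Public projects can be read without a token.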

### Migration tools

ahf implemented the GitLab migration using his own home-made tools that
talk to the GitLab and Trac APIs, but there's also [tracboat](https://github.com/tracboat/tracboat) which is
designed to migrate from Trac to GitLab.

We did not use Tracboat because it uses GitLab's DB directly and thus
only works with some very specific versions. Each time the database
schema changes at GitLab, Tracboat needs to be ported to it. We preferred to
use something that talked with the GitLab API.

We also didn't like the output entirely, so we modified it but still
used some of its regular expressions and parser.
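
To illustrate the API-based approach, here is a sketch of how a Trac ticket
could be turned into the payload for the GitLab "new issue" API call while
keeping its number. This is not ahf's actual tooling. The shape of the
`ticket` dictionary and the `issue_payload` name are hypothetical. The `iid`
and `created_at` fields are parameters of the GitLab issues API that require
administrator or owner rights:

```python
def issue_payload(ticket):
    """Map a Trac ticket to the fields of a GitLab "new issue" request.

    `ticket` is a hypothetical dictionary with the keys `id`,
    `summary`, `description` and `created` (an ISO 8601 timestamp).
    """
    return {
        # Reuse the Trac ticket number as the per-project issue number.
        "iid": ticket["id"],
        "title": ticket["summary"],
        "description": ticket["description"],
        # Preserve the original creation date.
        "created_at": ticket["created"],
    }
```

The resulting dictionary would then be sent as a `POST` request to the
`/api/v4/projects/:id/issues` endpoint.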