Verified Commit 51d7e1ac authored by anarcat

review GitLab backup documentation

We do not back up repositories in the job anymore; also clarify what is where
parent 46a6e127
+35 −16
@@ -943,30 +943,49 @@ hardcoding exporters). We could also use the following tools:

There is a backup job (`tpo-gitlab-backup`, in the `root` user
crontab) that is a simple wrapper script which calls `gitlab-backup`
to dump all components of the GitLab installation (except artifacts!)
in the backup directory (`/srv/gitlab-backup`).

GitLab also creates a backup on upgrade. Those are purged after two
weeks by the wrapper script.
to dump some components of the GitLab installation in the backup
directory (`/srv/gitlab-backup`).

The backup system is deployed by Puppet and (*at the time of
writing*!) **skips** **repositories** and **artifacts**. It contains:

 * GitLab CI build logs (`builds.tar.gz`)
 * a compressed database dump (`db/database.sql.gz`)
 * Git Large File Storage objects (Git LFS, `lfs.tar.gz`)
 * packages (`packages.tar.gz`)
 * GitLab pages (`pages.tar.gz`)
 * Terraform state files (`terraform_state.tar.gz`)
 * uploaded files (`uploads.tar.gz`)
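
As a quick sanity check, the component list above can be turned into a
small script that reports which expected files are missing. This is
only a sketch: the `missing_components` helper is hypothetical, and it
assumes the components sit directly under the directory passed in,
which may not match how the wrapper script actually lays them out.

```python
from pathlib import Path

# Component files named in the list above.
EXPECTED_COMPONENTS = [
    "builds.tar.gz",
    "db/database.sql.gz",
    "lfs.tar.gz",
    "packages.tar.gz",
    "pages.tar.gz",
    "terraform_state.tar.gz",
    "uploads.tar.gz",
]


def missing_components(backup_dir):
    """Return the expected component paths that are absent from backup_dir."""
    root = Path(backup_dir)
    # Keep the order of EXPECTED_COMPONENTS so reports are stable.
    return [name for name in EXPECTED_COMPONENTS if not (root / name).is_file()]
```

Calling `missing_components()` on a directory holding an unpacked
backup returns an empty list when every component is present.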

The backup job is run nightly. GitLab also creates a backup on
upgrade. Those are purged after two weeks by the wrapper script.
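
The wrapper script itself is not reproduced here, so the following is
only an illustration of what a two-week purge could look like. It
assumes backups are regular files directly inside the backup directory
and that their age is judged by modification time; the
`purge_old_backups` name and its parameters are hypothetical.

```python
import time
from pathlib import Path

TWO_WEEKS = 14 * 24 * 60 * 60  # retention period, in seconds


def purge_old_backups(backup_dir, max_age=TWO_WEEKS, now=None):
    """Delete regular files in backup_dir older than max_age seconds.

    Returns the names of the files that were removed, sorted by name.
    Subdirectories are left untouched.
    """
    now = time.time() if now is None else now
    removed = []
    for path in sorted(Path(backup_dir).iterdir()):
        # Age is computed from the file's last modification time.
        if path.is_file() and now - path.stat().st_mtime > max_age:
            path.unlink()
            removed.append(path.name)
    return removed
```

Passing `now` explicitly makes the cutoff reproducible, which helps
when testing the retention logic without waiting two weeks.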

The backup job does **NOT** contain the following components, because
they take up a tremendous amount of disk space and are already backed
up by Bacula. They need to be restored separately, from the regular
backup server:

 * Git repositories (found in
   `/var/opt/gitlab/git-data/repositories/`)
 * GitLab CI artifacts (normally found in
   `/var/opt/gitlab/gitlab-rails/shared/artifacts/`, in our case
   bind-mounted over `/srv/gitlab-shared/artifacts`)

It is assumed that the existing [howto/backup](howto/backup) system
will pick up those copies and store them for our normal rotation
periods.

Artifacts are not backed up because they are already backed up by
Bacula in the normal job and take a tremendous amount of disk
space. It should also be noted that most of the files provided by
`gitlab-backup` are *also* already backed up by Bacula and are
therefore duplicated on the backup storage server. See [issue 40518][]
to follow up on that.
will pick up those files, as well as the actual backup files in
`/srv/gitlab-backup`, and store them for our normal rotation periods.

[issue 40518]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40518
This implies that the files covered by the `gitlab-backup` job are
*also* already backed up by Bacula and are therefore duplicated on the
backup storage server. See [issue 40518][] to review that strategy.

Ideally, this rather exotic backup system would be harmonized with our
existing backup system, but this would require (for example) using our
existing PostgreSQL infrastructure ([issue 20](https://gitlab.torproject.org/tpo/tpa/gitlab/-/issues/20)). Other ideas
existing PostgreSQL infrastructure ([issue 20][]). Other ideas
(including filesystem snapshots) are also in [issue 40518][].

[issue 40518]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40518

## Other documentation

 * GitLab has a [built-in help system](https://gitlab.torproject.org/help)