title: TPA-RFC-14: GitLab artifacts expiry change
affected users: GitLab users
deadline: 2021-11-25
status: standard
discussion: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40516
Summary: GitLab artifacts used to be deleted after 30 days. Now they
will be deleted after 14 days. Latest artifacts are always kept. That
expiry period can be changed with the artifacts:expire_in field in
.gitlab-ci.yml.
What
We will soon change the retention period for artifacts produced by GitLab CI jobs. By default, GitLab keeps artifacts for 30 days (about four weeks), but we will lower this to 14 days (two weeks).
Latest artifacts for all pipelines are kept indefinitely regardless of
this change. Artifacts marked Keep on a job page will also still be
kept.
For individual projects, GitLab doesn't display how much space is
consumed only by CI artifacts, but the Storage value on the landing
page can be used as an indicator since their size is included in this
total.
Why
Artifacts are using a lot of disk space. At last count we had 300GB of artifacts and were gaining 3GB per day.
We have already grown the GitLab server's disk space to accommodate that growth, but it has already filled up.
It is our hope that this change will allow us to avoid growing the disk indefinitely and will make it easier for TPA to manage the growing GitLab infrastructure in the short term.
How
The default artifacts expiration timeout will be changed from 30
days to 14 days in the GitLab administration panel. If you wish to
override that setting, you can add an artifacts:expire_in setting
in your .gitlab-ci.yml file.
This will only affect new jobs. Artifacts of jobs created before the change will expire after 30 days, as before.
Note that you are also encouraged to set a shorter expiry for artifacts that do not need to be kept. For example, if you only keep artifacts for a deployment job, it's perfectly fine to use:
expire_in: 1 hour
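For context, here is a minimal sketch of where the expire_in setting sits inside a .gitlab-ci.yml file. The job names, scripts, and paths are hypothetical and only serve as illustration; adapt them to your own pipeline.

```yaml
# Hypothetical pipeline: job names, scripts and paths are placeholders.
stages:
  - build
  - deploy

build:
  stage: build
  script:
    - make
  artifacts:
    paths:
      - build/
    # Only needed long enough for the deploy job to pick it up.
    expire_in: 1 hour

deploy:
  stage: deploy
  script:
    - ./deploy.sh build/
  artifacts:
    paths:
      - deploy.log
    # Overrides the instance-wide default (14 days after this change).
    expire_in: 30 days
```

A job with no expire_in setting falls back to the instance default.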
It is speculated that the Jenkins migration is at least partly responsible for the growth in disk usage. It is our hope that the disk usage growth will slow down as that migration completes, but we are conscious that GitLab is being used more and more by all teams and that it's entirely reasonable that the artifacts storage will keep growing indefinitely.
We are also looking at long-term storage problems and GitLab scalability issues in parallel to this problem. We have disk space available in the mid-term, but we are considering using that disk space to change filesystems, which would simplify our backup policies and give us more disk space. The artifacts policy change is mostly to give us some time to breathe before we throw all the hardware we have left at the problem.
If your project is unexpectedly using large amounts of storage and CI artifacts are suspected as the cause, please get in touch with TPA so we can work together to fix this. We should be able to manually delete these extraneous artifacts via the GitLab administrator console.
References
- ticket 40516: bug report about artifacts filling up disks
- GitLab scalability issues
- long-term storage problems
- artifacts:expire_in setting
- default artifacts expiration setting