gitlab is failing to deliver mail notifications and other issues (queuing problems?)
it looks like queues are not processing.
this problem has other symptoms. possibly non-exhaustive list of broken things:
- mail notifications
- IRC notifications
- scheduled ci jobs (e.g. https://gitlab.torproject.org/tpo/tpa/triage-ops/-/pipelines)
- merging merge requests
Steps to reproduce
comment on an issue
What is the current bug behavior?
mail notifications are not sent.
What is the expected correct behavior?
mail notifications should be sent.
When did this start?
unclear. first reported at 10 UTC on #tor-admin by @gk
@ahf mentioned the scheduled CI jobs:
09:00:54 <ahf> died somewhere within 5-15 min. after Jun 25, 2025, 09:27 GMT+2 (07:27 UTC) today
Relevant logs and/or screenshots
nothing in the sidekiq or mailroom logs. followed the "outgoing mail" pager playbook, and gitlab-console is able to send mail fine, so this is a queuing issue.
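the reasoning above (mail can be sent directly, yet nothing is delivered and nothing is logged) amounts to "jobs are waiting but none are being processed". a minimal sketch of that check, assuming we can periodically sample a queue's backlog and a cumulative processed counter. the `QueueSample` shape, the `is_stalled` helper and the 5 minute window are all hypothetical, for illustration only; this is not a Sidekiq or GitLab API.

```python
from dataclasses import dataclass


@dataclass
class QueueSample:
    """One observation of a job queue (hypothetical shape, not a Sidekiq API)."""
    timestamp: float  # seconds since some reference point
    enqueued: int     # jobs waiting in the queue at that time
    processed: int    # cumulative count of jobs processed so far


def is_stalled(samples: list[QueueSample], min_window: float = 300.0) -> bool:
    """Return True if jobs were waiting the whole time but none got processed.

    `min_window` (assumed default: 5 minutes) is the minimum amount of
    history required before the queue is declared stalled.
    """
    if len(samples) < 2:
        return False
    first, last = samples[0], samples[-1]
    if last.timestamp - first.timestamp < min_window:
        return False  # not enough history to judge
    no_progress = last.processed == first.processed
    always_backlogged = all(s.enqueued > 0 for s in samples)
    return no_progress and always_backlogged
```

a growing backlog with a frozen processed counter is flagged, while an idle queue (nothing enqueued) or a slow but moving one is not, which should avoid paging on quiet periods.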
Possible fixes
there's a DiskWillFillSoon alert in karma that could be the root cause here, @lelutin is looking into that.
Next steps
- updating status.tpo
- remove old volume group on gitlab-02
- remove extra disk in ganeti
- rename new volume group to remove hdd suffix (it's on SSD like everything else)
- regroup all LVs into one / stop using VGs? (possibly better to delegate to a later "rebuild gitlab-02", see #42218 (comment 3217801)) not worth it
- monitoring needed for sidekiq (and dashboards?)
- timeline
- monitoring for disk space? chore-level#42218 (comment 3217901)
- root cause analysis, see discussion in #42218 (comment 3217883), likely limitation in sidekiq fixed by restarting it
Edited by anarcat