
Roll call: who's there and emergencies

Present: anarcat, hiro, weasel.

Small emergency with GitLab.

GitLab

We realized that the GitLab backups were not functioning properly because GitLab omnibus runs its own database server, separate from the one run by TPA. In the long term, we want to fix this, but in the short term, the following should be done:

  1. that it works without filling up the disk ;) (probably just a matter of rotating the backups)
  2. that it backs up everything (including secrets)
  3. that it stores the backup files offsite (maybe using bacula)
  4. that it is documented

The following actions were undertaken:

  • make a new (rotating disk) volume to store backups and mount it somewhere (weasel; done)
  • tell Bacula to ignore the rest of GitLab with a /var/opt/.nobackup marker, managed in Puppet (hiro; done)
  • make the (rotating) cron job in Puppet, including the secrets in ./gitlab-rails/etc (hiro, anarcat; done)
  • document ALL THE THINGS (anarcat), specifically in a new page somewhere under howto/backup, along with more generic GitLab documentation (34425)
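The rotating cron job above can be sketched roughly as follows. The mount point, retention period, and exact commands here are assumptions for illustration, not TPA's actual Puppet-managed job:

```shell
#!/bin/sh
# Sketch of a rotating GitLab backup job. Assumed: the new backup volume
# is mounted at /srv/gitlab-backup and we keep 7 days of dumps; neither
# the path nor the retention comes from the real Puppet manifest.
set -eu

# Delete dump archives in directory $1 that are older than $2 days.
rotate_backups() {
    find "$1" -maxdepth 1 -name '*.tar' -mtime +"$2" -delete
}

# The real job would do, roughly (commands hedged, not verbatim from TPA):
#   gitlab-backup create CRON=1    # omnibus wrapper; writes a .tar dump
#   cp -a /var/opt/gitlab/gitlab-rails/etc /srv/gitlab-backup/secrets
#   rotate_backups /srv/gitlab-backup 7
```

The `cp` step matters because the GitLab dump deliberately omits the secrets under gitlab-rails/etc, which is why item 2 in the list above calls them out explicitly.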

Roadmap review

We proceeded with a review of the May and June roadmap.

We note that this roadmap system will go away after the gitlab migration, after which point we will experiment with various gitlab tools (most notably the "Boards" feature) to organize work.

alex will ask hiro or weasel to take Trac offline; until then, we keep filing tickets in Trac.

weasel has taken on the KVM/Ganeti migration.

hiro will try creating the next Ganeti node to get experience with the process (34304).

anarcat should work on documentation.

Availability planning

We are thinking of setting up an alternating schedule where hiro would be available Monday to Wednesday and anarcat from Wednesday to Friday, but we're unsure this will be possible. We might just do it on a week-by-week basis instead.

We also note that anarcat will become fully unavailable for two months starting anywhere between now and mid-July, which deeply affects the roadmap above. Mainly, anarcat will focus on documentation and avoid large projects.

Other discussions

We discussed TPA-RFC-2, "support policy" (policy/tpa-rfc-2-support), during the meeting, because someone asked if they could contact us over Signal (the answer is "no").

The policy seemed consistent with what people in the meeting expected, and it will be sent to tor-internal for approval shortly.

Next meeting

TBD. The first Wednesday in July is a bank holiday in Canada, so it's not a good match.

Metrics of the month

  • hosts in Puppet: 74, LDAP: 77, Prometheus exporters: 128
  • number of apache servers monitored: 29, hits per second: 163
  • number of nginx servers: 2, hits per second: 2, hit ratio: 0.88
  • number of self-hosted nameservers: 6, mail servers: 12
  • pending upgrades: 35, reboots: 48
  • average load: 0.55, memory available: 346.14 GiB/952.95 GiB, running processes: 428
  • bytes sent: 207.17 MB/s, received: 111.78 MB/s
  • planned buster upgrades completion date: 2020-08-18

The upgrade prediction graph still lives at https://gitlab.torproject.org/anarcat/wikitest/-/wikis/howto/upgrades/

It is now also available as the main Grafana dashboard: head to https://grafana.torproject.org/, change the time period to 30 days, and wait a while for the results to render.