Roll call: who's there and emergencies

anarcat, gaba, kez, lavamind

OKRs and 2022 roadmap

Each team has been establishing their own Objectives and Key Results (OKRs), and it's our turn. Anarcat has made a draft of five OKRs that will be presented at the October 20th all hands meeting.

We discussed switching to this process for 2022 and ditching the previous roadmap process we had been using. The OKRs would then become a set of objectives for the first half of 2022 and be reviewed mid-year.

The concerns raised were that the OKRs lack implementation details (e.g. linked tickets) and priorities (i.e. "Must have", "Need to have", "Non-objectives"). Anarcat argued that implementation details will be tracked in GitLab Milestones linked from the OKRs. Priorities can be expressed by ordering the Objectives in the list.

We observed that the OKRs didn't have explicit objectives for the web part of TPA, and we have not found a solution to that problem yet. We tried adding an objective like this:

Integrate web projects into TPA

  1. TPA is triaging the projects lego, ...?
  2. increase the number of projects that deploy from GitLab
  3. create and use gitlab-ci templates for all web projects

... but then realised that this should actually happen in 2021-Q4.

At this point we ran out of time. anarcat submitted TPA-RFC-13 to follow up.

Can we add those projects under TPA's umbrella?

Make sure that those projects have maintainers and are triaged:

  • lego project (? need to find a new maintainer, kez/lavamind?)
  • research (Roger, mike, gus, chelsea, tariq, can be delegated)
  • civicrm (OpenFlows, and anarcat)
  • donate (OpenFlows, duncan, and kez)
  • blog (lavamind and communications)
  • newsletter (anarcat with communications)
  • documentation

Not for tpa:

  • community stays managed by gus
  • tpo stays managed by gus
  • support stays managed by gus
  • manual stays managed by gus
  • styleguides stays managed by duncan
  • dev still being developed
  • tor-check : arlo is the maintainer

The above list was reviewed by gaba and anarcat before the meeting, but it was not explicitly reviewed during the meeting.

Dashboard triage

Delegated to the star of the week.

Other discussions

Those discussion points were added during the meeting.

post-mortem of the week

We had a busy two weeks: go over how the emergencies went and how we're doing.

We unfortunately didn't have time to do a voice check-in on that, but we will do one at next week's check-in.

Q4 roadmap review

We discussed re-reviewing the priorities for Q4 2021, because there was some confusion about whether the OKRs would apply there. They do not: the previous work we did on prioritizing Q4 still stands, and this point doesn't need to be discussed.

Next meeting

We originally discussed bringing those points back on Tuesday October 19th, 19:00 UTC, but after clarification this is not required and we can meet next month as usual. According to the Nextcloud calendar, that would be Monday November 1st, 17:00 UTC, which is equivalent to: 10:00 US/Pacific, 13:00 US/Eastern, 14:00 America/Montevideo, 18:00 Europe/Paris.
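The local times above can be double-checked with a short script. This is only a sketch: it assumes the year is 2021 (the minutes only say "Monday November 1st") and that the system ships the tz database; the `local_times` helper is a name made up for this example.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Assumption: the meeting is in 2021, when November 1st falls on a Monday.
meeting_utc = datetime(2021, 11, 1, 17, 0, tzinfo=timezone.utc)
zones = ["US/Pacific", "US/Eastern", "America/Montevideo", "Europe/Paris"]


def local_times(when, zone_names):
    """Return a {zone name: "HH:MM"} mapping for an aware datetime."""
    return {
        name: when.astimezone(ZoneInfo(name)).strftime("%H:%M")
        for name in zone_names
    }


for name, hhmm in local_times(meeting_utc, zones).items():
    print(f"{hhmm} {name}")
```

Europe had already left daylight saving time by that date while the US had not, which is why the usual offsets between the zones shift by an hour.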

Metrics of the month

Numbers and tickets

  • hosts in Puppet: 91, LDAP: 94, Prometheus exporters: 145
  • number of Apache servers monitored: 28, hits per second: 147
  • number of Nginx servers: 2, hits per second: 2, hit ratio: 0.82
  • number of self-hosted nameservers: 6, mail servers: 7
  • pending upgrades: 2, reboots: 0
  • average load: 0.82, memory available: 3.63 TiB/4.54 TiB, running processes: 592
  • bytes sent: 283.86 MB/s, received: 169.12 MB/s
  • planned bullseye upgrades completion date: ???
  • GitLab tickets: 156 tickets including...
    • open: 0
    • icebox: 127
    • backlog: 13
    • next: 7
    • doing: 4
    • needs information: 5
    • needs review: 0
    • (closed: 2438)
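
As a sanity check on the numbers above, the per-state counts should add up to the total of 156 tickets. A minimal sketch, with the counts copied by hand from the list (the `ticket_states` name is made up for this example and is not part of any TPA tooling):

```python
# Ticket counts per state, copied from the list above (closed tickets excluded).
ticket_states = {
    "open": 0,
    "icebox": 127,
    "backlog": 13,
    "next": 7,
    "doing": 4,
    "needs information": 5,
    "needs review": 0,
}

total = sum(ticket_states.values())
print(f"total: {total}")

# Share of each non-empty state, to see where the tickets actually sit.
for state, count in ticket_states.items():
    if count:
        print(f"{state}: {count} ({count / total:.0%})")
```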

Compared to last month, we have reduced our backlog and kept "next" and "doing" quite tidy. Our "needs information" is growing a bit too much for my taste. I am not sure how to handle that growth other than to say: if TPA puts your ticket in the "needs information" state, it typically means that you need to do something before it gets resolved.

Bullseye upgrades

We started tracking bullseye upgrades! The upgrade prediction graph now lives at:

https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/upgrades/bullseye#per-host-progress

I concede it looks utterly ridiculous right now, and the linear predictor gives ... "suspicious" results:

anarcat@angela:bullseye(master)$ make
predict-os refresh
predict-os predict graph -o predict.png --path data.csv --source buster
/home/anarcat/bin/predict-os:123: RankWarning: Polyfit may be poorly conditioned
  date = guess_completion_time(records, args.source, now)
suspicious completion time in the past, data may be incomplete: 1995-11-09
completion time of buster major upgrades: 1995-11-09

In effect, we have not upgraded a single box to bullseye, but we have created 4 new machines, and those are all running bullseye.
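
To illustrate why a linear predictor can land in the past, here is a minimal sketch of the idea. It is not the actual `predict-os` code: it swaps the polynomial fit for a plain least-squares line written by hand, the `predict_completion` function is a name made up for this example, and the sample records are invented, not real fleet data.

```python
from datetime import date, timedelta


def predict_completion(records):
    """Fit count = slope * day + intercept and return the date where the
    line crosses zero, or None when the fitted line is flat.

    records is a list of (date, hosts still on the old release) tuples.
    """
    origin = records[0][0]
    xs = [(day - origin).days for day, _ in records]
    ys = [count for _, count in records]
    n = len(records)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or sxy == 0:
        return None  # flat line or a single sample day: no zero crossing
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return origin + timedelta(days=round(-intercept / slope))


start = date(2021, 10, 1)
# Invented data: the count drops by one host per day.
shrinking = [(start + timedelta(days=d), c) for d, c in [(0, 80), (10, 70), (20, 60)]]
# Invented data: the count creeps up, so the zero crossing is behind us.
growing = [(start + timedelta(days=d), c) for d, c in [(0, 80), (10, 81), (20, 82)]]
# Invented data: nothing changes, so there is no prediction at all.
flat = [(start + timedelta(days=d), 80) for d in (0, 10, 20)]

print(predict_completion(shrinking))
print(predict_completion(growing))
print(predict_completion(flat))
```

When the count is not going down, the fitted line only reaches zero by extrapolating backwards, which is the kind of "completion time in the past" reported above.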

An interesting data point: about two years ago, we had 79 machines (compared to 91 today): 1 running jessie (remember the old check.tpo?), 38 running stretch, and 40 running buster. We never quite completed the stretch upgrade (we still have one left!), but we reached that point around a year ago. So, in two years, we added 12 new machines to the fleet, for an average of a new machine every other month.
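
The growth figure is simple arithmetic, sketched here with the numbers from the paragraph above. The variable names are made up for this example, and "about two years" is rounded to 24 months.

```python
# Fleet composition about two years ago, per release.
hosts_then_by_release = {"jessie": 1, "stretch": 38, "buster": 40}
hosts_then = sum(hosts_then_by_release.values())
hosts_now = 91  # "hosts in Puppet" from this month's metrics
months = 24  # assumption: "about two years" taken as exactly 24 months

added = hosts_now - hosts_then
months_per_new_host = months / added

print(f"{hosts_then} hosts then, {hosts_now} now: {added} added")
print(f"one new host every {months_per_new_host:g} months on average")
```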

If we go by how the buster upgrade process went, we will completely miss the summer milestone, when Debian buster itself reaches EOL. But do not worry, we do have a plan, stay tuned!