This page documents the mid-term plan for TPA in the year 2022.
Previous roadmaps were done on a quarterly and yearly basis, but starting this year we are using the OKR system to establish, well, Objectives and Key Results. Those objectives are set for a six-month period, so they cover two quarters and are therefore established and reviewed twice a year.
Objectives and Key Results
Each heading below is an objective, and the items under it are key results that will allow us to measure whether the objective was met by mid-year 2022. As a reminder, those are supposed to be ambitious: we do not expect to do everything here and instead aim for the 60-70% mark.
Note that TPA also manages another set of OKRs, the web team OKRs, which are also relevant here in the sense that the same team is split between the two sets of OKRs.
Improve mail services
- David doesn't complain about "mail getting into spam" anymore
- RT is not full of spam
- we can deliver and receive mail from state.gov
Retire old services
- SVN is retired and people are happy with the replacement
- establish a plan for gitolite/gitweb retirement
- retire schleuder in favor of ... official Signal groups? ... mailman-pgp? RFC2549 with one-time pads?
Cleanup and publish the sysadmin code base
- sanitize and publish the Puppet git repository
- implement basic CI for the Puppet repository and use a MR workflow
- deploy dynamic environments on the Puppet server to test new features
Upgrade to Debian 11 "bullseye"
- all machines are upgraded to bullseye
- migrate to Prometheus for monitoring (or upgrade to Icinga 2)
- upgrade to Mailman 3 or retire it in favor of Discourse (!)
Provision a new, trusted high performance cluster
- establish a new PoP on the US west coast with trusted partners and hardware ($$)
- retire moly and move the DNS server to the new cluster
- reduce VM deployment time to one hour or less (currently 2 hours)
Non-objectives
Those things will not be done during the specified time frame:
- LDAP retirement
- static mirror system retirement
- new offsite backup server
- complete email services (e.g. mailboxes)
- search.tpo/Solr
- web metrics
- user survey
- stop global warming
Quarterly reviews
Q1
We didn't do much in the TPA roadmap, unfortunately. Hopefully this week will get us started with the bullseye upgrades, and some initiatives have been started but it looks like we will probably not fulfill most (let alone all) of our objectives for the roadmap inside TPA.
(From the notes of the 2022-04-04 meeting.)
Q3-Q4
This update was performed by anarcat over email on 2022-10-11, and covers work done over Q1 to Q3 and part of Q4. It also tries to venture a guess as to how much of the work could actually be completed by the end of the year.
Improve mail services: 30%
We're basically stalled on this. The hope is that TPA-RFC-31 comes through and we can start migrating to an external email service provider at some point in 2023.
We did do a lot of work on improving spam filtering in RT, however. And a lot of effort was poured into implementing a design that would fix those issues by self-hosting our email (TPA-RFC-15), but that design was ultimately rejected.
Let's call this at 30% done.
Retire old services: 50%, 66% possible
SVN hasn't been retired, and we couldn't meet in Ireland to discuss how it could be. It's likely to get stalled until the end of the year; maybe a proposal could come through, but SVN will likely not get retired in 2022.
For gitolite/gitweb, I started TPA-RFC-36 and started establishing requirements. The next step is to propose a draft, and just move it forward.
For schleuder, the only blocker is the community team; there is hope we can retire this service altogether as well.
Calling this one 50% done, with hope of getting to 2/3 (66%).
Cleanup and publish the sysadmin code base: 0%
This is pretty much completely stalled, still.
Upgrade to Debian 11 "bullseye": 87.5% done, 100% possible
- all machines are upgraded to bullseye
- migrate to Prometheus for monitoring (or upgrade to Icinga 2)
- upgrade to Mailman 3 or retire it in favor of Discourse (!)
Update: we're down to 12 buster machines, out of about 96 boxes total, which is 87.5% done. The problem is we're left with those 12 hard machines to upgrade:
- sunet cluster rebuild (4)
- moly machines retirement / rebuild (4)
- "hard" machines: alberti, eugeni, nagios, puppet (4)
These can be split into buckets:
- just do it (7):
  - sunet
  - alberti
  - eugeni (modulo schleuder retirement, probably a new VM for mailman? or maybe all moved to external, based on TPA-RFC-31 results)
  - puppet (yes, keeping Puppet 5 for now)
- policy changes (2):
  - nagios -> prometheus?
  - schleuder/mailman retirements or rebuilds
- retirements (3):
  - build-x86-XX (2)
  - moly
So there's still hope to realize at least the first key result here, and have 100% of the upgrades done by the end of year, assuming we can get the policy changes through.
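The arithmetic behind the 87.5% figure and the bucket counts can be double-checked with a short sketch. The numbers come from the update above; the variable names are illustrative, and counting each bucket entry as one machine unless a count is given (with sunet taken as 4, per the cluster rebuild entry) is an assumption.

```python
# Figures from the Q3-Q4 update; names are illustrative only.
total_machines = 96  # "about 96 boxes total"
remaining = 12       # machines still running buster

done_percent = 100 * (total_machines - remaining) / total_machines

# Remaining machines per bucket. Assumption: one machine per entry
# unless the update gives an explicit count (sunet: 4, build-x86-XX: 2).
buckets = {
    "just do it": {"sunet": 4, "alberti": 1, "eugeni": 1, "puppet": 1},
    "policy changes": {"nagios": 1, "schleuder/mailman": 1},
    "retirements": {"build-x86-XX": 2, "moly": 1},
}
bucket_totals = {name: sum(hosts.values()) for name, hosts in buckets.items()}

print(f"{done_percent:.1f}% upgraded")
print(bucket_totals, "total:", sum(bucket_totals.values()))
```

Under those assumptions the buckets add up to the 12 remaining machines, matching the (7), (2) and (3) counts in the list.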
Provision a new, trusted high performance cluster: 0%, 60% possible
This actually unblocked recently, "thanks" to the mess at Cymru. If we do manage to complete this migration in 2022, it would get us up to 60% of this OKR.
Non-objectives
None of those unplanned things were done, except that "complete email services" is probably going to be part of the TPA-RFC-31 spec.
Editorial note
Another thing to note is that some key results were actually split between multiple objectives.
For example, the "retire moly and move the DNS server to the new cluster" key result is also something that's part of the bullseye upgrade objective.
Not that bad, but something to keep in mind when we draft the next ones.
How those were established
The goals were set based on a brainstorm by anarcat, which also drew on items from the 2021 roadmap that were not completed. We have not run a survey this time around, because we still haven't responded to everything we were told last time. It was also felt that the survey takes a long time to process (for us) and respond to (for everyone else).
The OKRs were actually approved in TPA-RFC-13 after a discussion in a meeting as well. See also issue 40439 and the "establish the 2022 roadmap" milestone.