diff --git a/meeting.md b/meeting.md
index d2734e5a4b014455c807c50cf769112910babaa2..314c21f67953d8b6570e2957dc6cc3d4648f62dc 100644
--- a/meeting.md
+++ b/meeting.md
@@ -27,5 +27,6 @@ We keep minutes of our meetings here.
  * [2021-02-02](meeting/2021-02-02)
  * [2021-03-02](meeting/2021-03-02)
  * [2021-04-07](meeting/2021-04-07) (report only)
+ * [2021-05-03](meeting/2021-05-03) (report only)
  * [monthly-report](meeting/monthly-report)
  * [template](meeting/template)
diff --git a/meeting/2021-05-03.md b/meeting/2021-05-03.md
new file mode 100644
index 0000000000000000000000000000000000000000..6164129873f56f55b196b9242494149c41a56a8a
--- /dev/null
+++ b/meeting/2021-05-03.md
@@ -0,0 +1,107 @@
+As with the previous month, I figured I would show a sign of life here
+and try to keep you up to date with what's happening in sysadmin-land,
+even though we're not having regular meetings. I'm still experimenting
+with structure here, and this is totally un-edited, so please bear
+with me.
+
+# Important announcements
+
+You might have missed this:
+
+ * Jenkins will be retired in December 2021, and it's time to move
+   your jobs away
+ * if you want old Trac wiki redirects to go to the right place, do
+   let us know, see [ticket 40233][]
+ * we do not have ARM 32 builders anymore, the last one was shut down
+   recently ([ticket 32920][]) and they had been removed from CI
+   (Jenkins) anyways before that. The core team is looking at
+   alternatives for building Tor on armhf in the future, see [ticket
+   40347][]
+ * we have set up a Prometheus Alertmanager during the hack week, which
+   means we can do alerting based on Prometheus metrics, see the
+   [alerting documentation][] for more information
+
+As usual, if you have any questions, comments, or issues, please do
+contact us following this "how to get help" procedure:
+
+<https://gitlab.torproject.org/tpo/tpa/team/-/wikis/policy/tpa-rfc-2-support#how-to-get-help>
+
+Yes, that's a terrible URL. Blame GitLab. :)
+
+[alerting documentation]: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/prometheus#alerting
+[ticket 40347]: https://gitlab.torproject.org/tpo/core/tor/-/issues/40347
+[ticket 32920]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/32920
+[ticket 40233]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40233
+
+# Crash of the month
+
+Your sysadmin crashed a Ganeti node, creating a split-brain scenario
+([ticket 40229][]). He would love to say that was planned and a
+routine exercise to test the documentation but (a) it wasn't and (b)
+the document had to be made up as he went, so that was actually a
+stressful experience.
+
+Remember kids: never start a migration before the weekend or going to
+bed unless you're willing and ready to stay up all night (or
+weekend).
+
+[ticket 40229]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40229
+
+# Metrics of the month
+
+ * hosts in Puppet: 86, LDAP: 89, Prometheus exporters: 140
+ * number of Apache servers monitored: 28, hits per second: 147
+ * number of Nginx servers: 2, hits per second: 2, hit ratio: 0.86
+ * number of self-hosted nameservers: 6, mail servers: 7
+ * pending upgrades: 1, reboots: 0
+ * average load: 0.68, memory available: 2.00 TiB/2.77 TiB, running processes: 552
+ * bytes sent: 276.43 MB/s, received: 162.75 MB/s
+ * [GitLab tickets][]: ? tickets including...
+    * open: 0
+    * icebox: 109
+    * backlog: 15
+    * next: 2
+    * doing: 2
+    * (closed: 2266)
+
+ [Gitlab tickets]: https://gitlab.torproject.org/tpo/tpa/team/-/boards
+
+# Ticket analysis
+
+Here's an update of the ticket table, which we last saw in February:
+
+| date       | open | icebox | backlog | next | doing | closed | delta |
+|------------|------|--------|---------|------|-------|--------|-------|
+| 2020-07-01 | 125  | 0      | 26      | 13   | 7     | 2075   |       |
+| 2020-11-18 | 1    | 84     | 32      | 5    | 4     | 2119   | 49    |
+| 2020-12-02 | 0    | 92     | 20      | 9    | 8     | 2130   | 11    |
+| 2021-01-19 | 0    | 91     | 20      | 12   | 10    | 2165   | 35    |
+| 2021-02-02 | 0    | 96     | 18      | 10   | 7     | 2182   | 17    |
+| 2021-03-02 | 0    | 107    | 15      | 9    | 7     | 2213   | 31    |
+| 2021-04-07 | 0    | 106    | 22      | 7    | 4     | 2225   | 12    |
+| 2021-05-03 | 0    | 109    | 15      | 2    | 2     | 2266   | 41    |
+
+I added a "delta" column which shows how many additional tickets were
+closed since the previous period. April is our record so far, with
+41 tickets closed in less than 30 days, more than one ticket
+per day!
+
+In other news, the Icebox keeps growing, which should keep us cool and
+breezy during the northern hemisphere summer that's coming up, but at
+least the Backlog is not growing too wildly, and the actual current
+queue (Next/Doing) is pretty reasonable. So things seem to be under
+control, but the new hiring process is taking significant time so this
+might upset our roadmap a little.
+
+# Ticket of the month
+
+Ticket [40218][] tracks the progress of the CI migration from Jenkins
+to GitLab CI. Jenkins is scheduled for retirement in December 2021,
+and progress has been excellent, with the network team actually
+*asking* for the Jenkins jobs to be disabled ([ticket 40225](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40225))
+which, if it gets completed, will mean the retirement of 4 virtual
+machines already.
+
+Exciting cleanup!
+
+[40218]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40218
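
The "delta" column in the ticket table of the report is the difference between consecutive "closed" counts. Below is a minimal Python sketch of that arithmetic, assuming the dates and closed counts are copied by hand from the table; the names `CLOSED` and `closed_deltas` are hypothetical and not part of any TPA tooling. It also divides each delta by the number of days in the period, which is one way to check the "more than one ticket per day" remark. With the closed counts as listed, the 2020-11-18 row comes out as 44 rather than the 49 shown in the table; the report does not say which figure is off.

```python
from datetime import date

# (report date, cumulative closed tickets), copied from the ticket table.
CLOSED = [
    (date(2020, 7, 1), 2075),
    (date(2020, 11, 18), 2119),
    (date(2020, 12, 2), 2130),
    (date(2021, 1, 19), 2165),
    (date(2021, 2, 2), 2182),
    (date(2021, 3, 2), 2213),
    (date(2021, 4, 7), 2225),
    (date(2021, 5, 3), 2266),
]


def closed_deltas(rows):
    """Return (date, delta, days, tickets_per_day) for each period.

    `delta` is the number of tickets closed since the previous row and
    `days` is the length of that period in calendar days.
    """
    out = []
    for (prev_day, prev_closed), (day, closed) in zip(rows, rows[1:]):
        delta = closed - prev_closed
        days = (day - prev_day).days
        out.append((day, delta, days, delta / days))
    return out


if __name__ == "__main__":
    for day, delta, days, rate in closed_deltas(CLOSED):
        print(f"{day}  +{delta:3d} closed in {days:3d} days ({rate:.2f}/day)")
```

For the last row, the period from 2021-04-07 to 2021-05-03 is 26 days, so 41 closed tickets works out to roughly 1.6 tickets per day.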