TPA team issues
https://gitlab.torproject.org/tpo/tpa/team/-/issues
Updated: 2024-03-28T13:14:07Z

Puppet fighting over /etc/apt/sources.list
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41565
Updated: 2024-03-28T13:14:07Z
Author: Jérôme Charaoui <lavamind@torproject.org>

Our current Puppet is set up to purge (delete) `/etc/apt/sources.list`.
The problem is, APT::Periodic (`apt-daily.service`) is recreating it every day: the file's creation time coincides with the schedule of `apt-daily.timer` (as seen with `systemctl list-timers apt-daily.timer`). The creation of the file can also be reproduced manually:
```
# rm /var/lib/apt/periodic/*-stamp
# /usr/lib/apt/apt.systemd.daily update
# ls -l /etc/apt/sources.list
```
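The timing correlation mentioned above can also be checked without deleting the stamp files. The following is a minimal sketch, not part of the original report: the helper names `file_mtime` and `coincides`, the 10-minute tolerance, and the example timestamps are all hypothetical. It uses the modification time as a stand-in for the creation time, which is only an approximation.

```python
import os
from datetime import datetime, timedelta, timezone


def file_mtime(path: str) -> datetime:
    """Return the file's modification time as an aware UTC datetime.

    Assumption: mtime approximates the creation time for a file that is
    recreated and then left untouched.
    """
    return datetime.fromtimestamp(os.stat(path).st_mtime, tz=timezone.utc)


def coincides(file_time: datetime, trigger_time: datetime,
              tolerance: timedelta = timedelta(minutes=10)) -> bool:
    """True if the file timestamp falls within `tolerance` of the trigger time."""
    return abs(file_time - trigger_time) <= tolerance


# Hypothetical values: the trigger time would be read by hand from
# `systemctl list-timers apt-daily.timer`, the file time from file_mtime().
trigger = datetime(2024, 3, 27, 6, 30, tzinfo=timezone.utc)
seen = datetime(2024, 3, 27, 6, 34, tzinfo=timezone.utc)
print(coincides(seen, trigger))
```

On an affected host, `file_mtime("/etc/apt/sources.list")` would supply the first argument instead of the hard-coded value.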
The file being purged every day leads to frequent and unnecessary triggers of `apt update`, and this sometimes even causes Puppet run failures which show up in monitoring:
```
puppet-agent[3791032]: (/Stage[main]/Apt/File[sources.list]/ensure) removed (corrective)
puppet-agent[3791032]: (/Stage[main]/Apt::Update/Exec[apt_update]/returns) Reading package lists...
puppet-agent[3791032]: (/Stage[main]/Apt::Update/Exec[apt_update]/returns) E: Could not get lock /var/lib/apt/lists/lock. It is held by process 3791201 (apt-get)
puppet-agent[3791032]: (/Stage[main]/Apt::Update/Exec[apt_update]/returns) E: Unable to lock directory /var/lib/apt/lists/
puppet-agent[3791032]: (/Stage[main]/Apt::Update/Exec[apt_update]) Failed to call refresh: '/usr/bin/apt-get update' returned 100 instead of one of [0]
puppet-agent[3791032]: (/Stage[main]/Apt::Update/Exec[apt_update]) '/usr/bin/apt-get update' returned 100 instead of one of [0]
puppet-agent[3791032]: Applied catalog in 14.75 seconds
systemd[1]: puppet-run.service: Main process exited, code=exited, status=6/NOTCONFIGURED
systemd[1]: puppet-run.service: Failed with result 'exit-code'.
```
Assignee: Jérôme Charaoui <lavamind@torproject.org>

Create an operations email list
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41539
Updated: 2024-03-28T01:14:48Z
Author: al smith

The operations team needs an email list to coordinate its work. (This will help with our grants@torproject.org email issues, as we'll be able to reduce the number of people using that alias once the operations list is established.)
**Requirements**
1. Does **not** require a moderation queue
2. Allows people who are not subscribed to the list to send email to the list **without friction**
3. Is not archived (for anyone, including members of the list)
4. Is not displayed on lists.torproject.org
Is that something a list can do?
If so, we request `tor-operations@` to be created. :smile:
Note: It's possible that an operations list exists already, per this ticket from 8 years ago, but I don't think so based on my quick test. Just adding for due diligence since I noticed it: https://gitlab.torproject.org/tpo/tpa/team/-/issues/15992
Assignee: Jérôme Charaoui <lavamind@torproject.org>
Due: 2024-03-31

retire majus
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41517
Updated: 2024-03-27T15:22:59Z
Author: anarcat

in #40692, we upgraded majus to Debian bookworm, yay!
but unfortunately, the transifex-client package has been removed from Debian, so it won't be covered by security support and has not been upgraded to the latest release.
so we need to find a solution for this. i favor completely retiring the host and @emmapeel seemed open to the idea in https://gitlab.torproject.org/tpo/tpa/team/-/issues/41252#note_2993038
so what's the next step here, @emmapeel you were saying there were some leftovers to garbage-collect?
checklist:
1. [x] announcement
2. [x] nagios
3. [x] retire the host in fabric
4. [x] remove from LDAP with `ldapvi`
5. [x] power-grep
6. [x] remove from tor-passwords
7. [x] remove from DNSwl
8. [x] remove from docs
9. [x] remove from racks
10. [x] remove from reverse DNS
Milestone: old service retirement 2023
Assignee: anarcat

Migrate {anonticket, gitlab}.onionize.space to TPO infrastructure
https://gitlab.torproject.org/tpo/tpa/team/-/issues/40577
Updated: 2024-03-27T14:10:44Z
Author: Alexander Færøy <ahf@torproject.org>

Before I went AFK during most of last year, we had some discussions about moving the anonticket and GitLab sign-up portal to TPO infrastructure *eventually*. We didn't deem it urgent as it was just to test things, but now the system has been running "fine" for something around 9 months, and I can happily say the administration burden of running this has been so minimal that I can't remember having done anything since I set it up.
For simplicity, the two applications currently use SQLite as their database, since the moderators usually keep the number of records trimmed, but I think when we move them over we should switch to PostgreSQL for backups and general maintainability.
The applications are written in Python 3 using the Django framework. I have not tested them with the Debian packages of Django; they are instead using "what Pip brought me of the day".
This is by no means urgent, but let's keep it on our radar.
What would be the first steps to get this going?
(I don't know which label this belongs to, so leaving it entirely untriaged here)
Assignee: Jérôme Charaoui <lavamind@torproject.org>

lock down legacy git infrastructure
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41213
Updated: 2024-03-26T20:49:37Z
Author: anarcat

As part of the Gitolite retirement procedure (TPA-RFC-36, #41180), lock Gitolite repositories without any changes in the last
two years, preventing any further change.
Milestone: legacy Git infrastructure retirement (TPA-RFC-36)
Assignee: anarcat
Due: 2024-01-31

Install make and newer golang in rdsys-test-01
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41564
Updated: 2024-03-26T18:18:36Z
Author: meskio <meskio@torproject.org>

I need make and golang>=1.21 (available in bookworm-backports) to build some binaries for testing on the server. Can you install them?
Thank you.
Assignee: Jérôme Charaoui <lavamind@torproject.org>

Update sarthikg's PGP key
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41561
Updated: 2024-03-25T13:53:37Z
Author: Georg Koppen

```
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512
Sarthik's laptop with the old PGP key on it died the other day, thus he needed
to create a new one:
C63B F870 B219 08AD 2A58 2C85 A347 8949 750F 08BE
which can be found at:
https://sarthikg.com/gpg.txt
I verified the fingerprint via a call which is probably better than nothing... :)
-----BEGIN PGP SIGNATURE-----
iQIzBAEBCgAdFiEEvoH49M1O0+io5Dsi74N5QXAIgZAFAmYBSsIACgkQ74N5QXAI
gZBpxA//etoEfrRnWRy6bnzcmuHKhlyENn4u07UjGNM0KTfmHPanh3Lh3ATfMcYk
eflJ62YveJWliRjTywNLqdlFT25f72AfpVvDF7K67w8CqhncpyZrdQ6h4ZCTpEln
eyiDhI1dcD/+2bpG6EX+SSdTvuMroS9haPKzBozL6WpjJXVC48wOz20ixKHvu8u6
G/S6IJ2dGrJarISfKuY+/ltfGjKS9yjU+ahXosLfAYoVmxWB3/bXKLo8MoKO3aUM
kgBviqiGr/nn9b9n4nxDmuuFjSIEKEvDYrzEyZUSa6Zb3l/7ywWuG6DDeBi7kG6Y
+Fub+bCVV18BBkTIQLMiMvq3VmLNrMNdLwlIzuBvH5Ipvo0K6zIsQ/8rJlc71jr2
ep7+muHkEtB5UQ9MeZ2VEVkQZsfltnZ88+L8QaUkHHM2Twrg1fj9AIHeQ1pgYRSv
KrckrcXbbk7ebLdFZ6NMj1uFQb/1WmqgKnaMwtGt9Qwy7IkBD+fBDs2JgvVsqPnS
59R3VOfXc2NW+qB8wGN0IHOlVtZkK21w9w7Jdz/SxAxEz8/7t7fZAxPf6gU3U+Zs
1FNBgnsZDnsfU9cvt2vzYS6+ViRn6AZUbDkVt+iVYjoxAZ4Uj4g98wXV7JP4JNsT
FwBsIo2rrBZd3LsVOAQk3SAqELcueagb9dUb0sQIsTVkphyccvA=
=WdtS
-----END PGP SIGNATURE-----
```
/cc @sarthikg
Assignee: Jérôme Charaoui <lavamind@torproject.org>

Deploy Tor Weather 2.0
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41556
Updated: 2024-03-25T13:18:01Z
Author: Georg Koppen

@sarthikg has re-written Tor Weather (yay!) and we want to deploy the 2.0 version now. There are architectural changes as well that e.g. do not need any onionoo-service and onionoo-timer jobs anymore. Moreover, IIRC we need to do some database migrations as well. So, this will be exciting I guess. :)
Assignee: Jérôme Charaoui <lavamind@torproject.org>

review gitolite retirement progress and send a reminder
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41214
Updated: 2024-03-20T18:56:12Z
Author: anarcat

As part of the Gitolite retirement procedure (TPA-RFC-36, #41180), review the progress of the migration and send a reminder:
- [x] how many repositories are left to migrate, populating #41215 with the result
- [x] did any repository get changes since the deprecation notice on 2023-06-08
- [x] send a reminder, similar to #41212
Milestone: legacy Git infrastructure retirement (TPA-RFC-36)
Assignee: anarcat
Due: 2024-01-24

investigate restore issues following gitlab incident
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41474
Updated: 2024-03-19T17:58:59Z
Author: anarcat

In #41470, we investigated the impact of an [authentication bypass in GitLab](https://about.gitlab.com/releases/2024/01/11/critical-security-release-gitlab-16-7-2-released/#account-takeover-via-password-reset-without-user-interactions) ([CVE-2023-7028](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-7028)), during which we had to restore old log files to perform an audit.
Two things became painfully obvious:
1. some logs are missing from old backups
2. some logs create huge files with garbage on restore
Now, the first bit in there might be deliberate: maybe we're excluding old log files to avoid constantly re-indexing the same content.
But the *second* bit is deeply concerning. In https://gitlab.torproject.org/tpo/tpa/team/-/issues/41470#note_2983750, I found that the log file is actually there, but bacula appends a seemingly endless stream of files after it. Truly bizarre, and concerning.
Assignee: anarcat

Abnormally slow requests on static mirror hosts
https://gitlab.torproject.org/tpo/tpa/team/-/issues/40566
Updated: 2024-03-19T00:40:09Z
Author: Jérôme Charaoui <lavamind@torproject.org>

This morning Nagios was unhappy with some of the static mirror hosts, with several errors like this:
```
tor-nagios: [web-chi-03] network service - https is CRITICAL: CRITICAL - Socket timeout after 10 seconds
tor-nagios: [global] mirror sync - www is CRITICAL: CRITICAL: 38.229.82.25 broken: 500 Cant connect to www.torproject.org:443
```
Looking at Grafana, since about one week ago we have been seeing increased load on our web mirrors, with Apache connection slots getting abnormally filled up:
![Capture_d_écran_de_2021-12-20_12-29-57](/uploads/3ae395cb32e2874722367ab34e28c5c7/Capture_d_écran_de_2021-12-20_12-29-57.png)
Currently Nagios only barks if the web hosts don't respond to HTTPS connections within 10 seconds, which is fine for the purposes of determining whether the service is *alive* at all, but for static sites, even on a busy webserver, response times of 1 second or more shouldn't be considered acceptable.
Assignee: Jérôme Charaoui <lavamind@torproject.org>

BTCPayServer is Down
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41549
Updated: 2024-03-18T23:29:44Z
Author: Susan

I am unable to connect to btcpay.torproject.org. It says the site cannot be reached. I believe this means that donors cannot use it to donate either.
Assignee: anarcat

Draft specs and estimates for new backup storage server
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41536
Updated: 2024-03-13T21:04:06Z
Author: anarcat
Milestone: (next) cluster scaling
Assignee: Jérôme Charaoui <lavamind@torproject.org>

Help operations team use the submission server
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41396
Updated: 2024-03-13T19:43:23Z
Author: micah <micah@torproject.org>

Many folks on the ops/moneymachine teams are having email problems, and while it may not be the only problem, one of the issues is that many of them are not using the tor submission server. They aren't using it often because they don't have an LDAP account in order to set up a submission password.
I've talked with @anarcat and @lavamind about this and @smith has shared with @lavamind a spreadsheet that has each person's individual state.
Assignee: Jérôme Charaoui <lavamind@torproject.org>

TPA-RFC-63: consider next steps for the backup server (bungei)
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41364
Updated: 2024-03-13T13:18:40Z
Author: anarcat

bungei filled up this week (#41361) and while we mitigated this by allocating more space to the logical volume, there is now very little space in the volume group to dodge similar bullets in the future ("only" 2.6TB):
```
root@bungei:~# vgs
VG #PV #LV #SN Attr VSize VFree
vg_bulk 1 2 0 wz--n- 72.60t 2.60t
```
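To put "very little space" into a number, the free share of the volume group can be computed from the `vgs` output above. This is a rough sketch: the helper names `parse_size` and `free_fraction` are made up for illustration, and the parser assumes both sizes are printed with the same `t` suffix, as they are here.

```python
def parse_size(field: str) -> float:
    """Convert a vgs size field such as '72.60t' to a float.

    Assumption: only the 't' suffix is handled, which is all that
    appears in the output quoted in the issue.
    """
    if not field.endswith("t"):
        raise ValueError(f"unsupported unit in {field!r}")
    return float(field[:-1])


def free_fraction(vgs_output: str, vg_name: str) -> float:
    """Return VFree / VSize for the named volume group."""
    lines = [line.split() for line in vgs_output.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    size_col = header.index("VSize")
    free_col = header.index("VFree")
    for row in rows:
        if row[0] == vg_name:
            return parse_size(row[free_col]) / parse_size(row[size_col])
    raise KeyError(vg_name)


# Output copied from the issue text.
VGS_OUTPUT = """
VG #PV #LV #SN Attr VSize VFree
vg_bulk 1 2 0 wz--n- 72.60t 2.60t
"""

print(f"{free_fraction(VGS_OUTPUT, 'vg_bulk'):.1%} free")  # prints "3.6% free"
```

Because both values carry the same unit, the ratio does not depend on how the suffix is interpreted.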
tasks:
- [x] review past tickets about bungei filling up
- [x] review last year's disk stats to see if there's another anomaly
- [x] evaluate costs of a server replacement (see #41536)
- [x] adopt budget for new storage server: https://gitlab.torproject.org/tpo/tpa/team/-/wikis/policy/tpa-rfc-63-storage-server-budget
- [ ] build new storage server (spin out in a different issue?) (#41557)
- [ ] evaluate possible software replacements (see #40950 for PostgreSQL)
Milestone: (next) cluster scaling
Assignee: anarcat

failed disk on fsn-node-02
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41555
Updated: 2024-03-12T19:23:34Z
Author: Jérôme Charaoui <lavamind@torproject.org>

One of the 10GB HDDs on fsn-node-02 has failed over the weekend. The raid-1 volume below `vg_ganeti_hdd` is thus degraded but otherwise healthy.
Assignee: Jérôme Charaoui <lavamind@torproject.org>

install python3-sqlparse in polyanthum
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41527
Updated: 2024-03-11T12:25:55Z
Author: meskio <meskio@torproject.org>

After the upgrade to bookworm two weeks ago, onbasca stopped working in polyanthum. It looks like it is missing a dependency: python3-sqlparse
Can you install it?
I wonder how it was removed by the upgrade.
Assignee: Jérôme Charaoui <lavamind@torproject.org>

Deploy onionperf files parser on metricsdb-01
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41526
Updated: 2024-03-07T14:23:37Z
Author: Hiro

We need to deploy https://gitlab.torproject.org/tpo/network-health/metrics/tor_fusion/ on metricsdb-01.
Basically this thing will run, download onionperf files from collector and parse them. This will just happen once a day around 1am UTC, since midnight is when collector fetches the archives from the various onionperf clients.
It's a little Rust app and I was thinking of creating a group and user like for the metrics-api. But maybe that's a bit overkill and I should just put it in the parser space?
Assignee: Hiro

Grant cohosh developer access to the blog project
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41552
Updated: 2024-03-07T02:44:43Z
Author: anarcat

Following the [instructions on the blog wiki page](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/blog#1-navigate-to-the-gitlab-blog-project-at-httpsgitlabtorprojectorgtpowebblog) led me here :) Do you need me to sign this request?
Assignee: anarcat

Grant cohosh developer access to the blog project
https://gitlab.torproject.org/tpo/tpa/team/-/issues/41551
Updated: 2024-03-07T02:44:24Z
Author: Cecylia Bocovich

Following the [instructions on the blog wiki page](https://gitlab.torproject.org/tpo/tpa/team/-/wikis/service/blog#1-navigate-to-the-gitlab-blog-project-at-httpsgitlabtorprojectorgtpowebblog) led me here :) Do you need me to sign this request?
Assignee: anarcat