No specific announcement is needed: this is a follow-up to an already completed issue (#41512 (closed)), and we checked with @hiro on IRC that it's fine to proceed.
zen marked the checklist item announcement as completed
starting tasks at 2024-11-07 19:23:58.058028+00:00
checking on fsn-node-01.torproject.org if instance onionoo-frontend-01.torproject.org is running
stopping instance onionoo-frontend-01.torproject.org on fsn-node-01.torproject.org
Waiting for job 1015576 for onionoo-frontend-01.torproject.org ...
scheduling onionoo-frontend-01.torproject.org instance removal on host fsn-node-01.torproject.org
scheduling gnt-instance remove --force onionoo-frontend-01.torproject.org to run on fsn-node-01.torproject.org in 7
syntax error. Last token seen: 7
Garbled time
scheduling onionoo-frontend-01.torproject.org backup disks removal on host bungei.torproject.org and director bacula-director-01.torproject.org
checking for path "/srv/backups/bacula/onionoo-frontend-01.torproject.org/" on bungei.torproject.org
scheduling rm -rf "/srv/backups/bacula/onionoo-frontend-01.torproject.org/" to run on bungei.torproject.org in 30 days
warning: commands will be executed using /bin/sh
job 130 at Sat Dec 7 19:24:00 2024
checking for path "/srv/backups/pg/onionoo-frontend-01/" on bungei.torproject.org
path /srv/backups/pg/onionoo-frontend-01/ not found: [Errno 2] No such file
scheduling echo delete client=onionoo-frontend-01.torproject.org-fd yes | bconsole to run on bacula-director-01.torproject.org in 30 days
warning: commands will be executed using /bin/sh
job 71 at Sat Dec 7 19:24:00 2024
Notice: Revoked certificate with serial 90
Notice: Removing file Puppet::SSL::Certificate onionoo-frontend-01.torproject.org at '/var/lib/puppet/ssl/ca/signed/onionoo-frontend-01.torproject.org.pem'
onionoo-frontend-01.torproject.org
Submitted 'deactivate node' for onionoo-frontend-01.torproject.org with UUID 153d551f-df7b-4904-99ee-70f58cc6a761
completed tasks, elasped: 0:00:44.453662 (user 4.06 system 0.16 chlduser 0.0 chldsystem 0.0 RSS 67.8 MB)
onionoo-frontend-02.torproject.org
starting tasks at 2024-11-07 20:03:26.995846+00:00
checking on fsn-node-01.torproject.org if instance onionoo-frontend-02.torproject.org is running
stopping instance onionoo-frontend-02.torproject.org on fsn-node-01.torproject.org
Waiting for job 1015585 for onionoo-frontend-02.torproject.org ...
scheduling onionoo-frontend-02.torproject.org instance removal on host fsn-node-01.torproject.org
scheduling gnt-instance remove --force onionoo-frontend-02.torproject.org to run on fsn-node-01.torproject.org in 7 days
warning: commands will be executed using /bin/sh
job 36 at Thu Nov 14 20:03:00 2024
scheduling onionoo-frontend-02.torproject.org backup disks removal on host bungei.torproject.org and director bacula-director-01.torproject.org
checking for path "/srv/backups/bacula/onionoo-frontend-02.torproject.org/" on bungei.torproject.org
scheduling rm -rf "/srv/backups/bacula/onionoo-frontend-02.torproject.org/" to run on bungei.torproject.org in 30 days days
syntax error. Last token seen: days
Garbled time
checking for path "/srv/backups/pg/onionoo-frontend-02/" on bungei.torproject.org
path /srv/backups/pg/onionoo-frontend-02/ not found: [Errno 2] No such file
scheduling echo delete client=onionoo-frontend-02.torproject.org-fd yes | bconsole to run on bacula-director-01.torproject.org in 30 days days
syntax error. Last token seen: days
Garbled time
Notice: Revoked certificate with serial 110
Notice: Removing file Puppet::SSL::Certificate onionoo-frontend-02.torproject.org at '/var/lib/puppet/ssl/ca/signed/onionoo-frontend-02.torproject.org.pem'
onionoo-frontend-02.torproject.org
Submitted 'deactivate node' for onionoo-frontend-02.torproject.org with UUID fe4ab4a7-66da-4686-bcc1-0debd13a8d18
completed tasks, elasped: 0:00:41.268267 (user 3.94 system 0.18 chlduser 0.0 chldsystem 0.0 RSS 67.9 MB)
onionoo-backend-01.torproject.org
starting tasks at 2024-11-07 20:10:30.916818+00:00
checking on fsn-node-01.torproject.org if instance onionoo-backend-01.torproject.org is running
stopping instance onionoo-backend-01.torproject.org on fsn-node-01.torproject.org
Waiting for job 1015588 for onionoo-backend-01.torproject.org ...
scheduling onionoo-backend-01.torproject.org instance removal on host fsn-node-01.torproject.org
scheduling gnt-instance remove --force onionoo-backend-01.torproject.org to run on fsn-node-01.torproject.org in 7 days
warning: commands will be executed using /bin/sh
job 39 at Thu Nov 14 20:10:00 2024
scheduling onionoo-backend-01.torproject.org backup disks removal on host bungei.torproject.org and director bacula-director-01.torproject.org
checking for path "/srv/backups/bacula/onionoo-backend-01.torproject.org/" on bungei.torproject.org
scheduling rm -rf "/srv/backups/bacula/onionoo-backend-01.torproject.org/" to run on bungei.torproject.org in 30 days
warning: commands will be executed using /bin/sh
job 131 at Sat Dec 7 20:11:00 2024
checking for path "/srv/backups/pg/onionoo-backend-01/" on bungei.torproject.org
path /srv/backups/pg/onionoo-backend-01/ not found: [Errno 2] No such file
scheduling echo delete client=onionoo-backend-01.torproject.org-fd yes | bconsole to run on bacula-director-01.torproject.org in 30 days
warning: commands will be executed using /bin/sh
job 72 at Sat Dec 7 20:11:00 2024
Notice: Revoked certificate with serial 83
Notice: Removing file Puppet::SSL::Certificate onionoo-backend-01.torproject.org at '/var/lib/puppet/ssl/ca/signed/onionoo-backend-01.torproject.org.pem'
onionoo-backend-01.torproject.org
Submitted 'deactivate node' for onionoo-backend-01.torproject.org with UUID 73995993-6acf-4948-bf73-2404936195a1
completed tasks, elasped: 0:00:50.830213 (user 4.19 system 0.21 chlduser 0.0 chldsystem 0.0 RSS 70.1 MB)
onionoo-backend-02.torproject.org
starting tasks at 2024-11-07 20:48:51.520137+00:00
checking on fsn-node-01.torproject.org if instance onionoo-backend-02.torproject.org is running
stopping instance onionoo-backend-02.torproject.org on fsn-node-01.torproject.org
Waiting for job 1015597 for onionoo-backend-02.torproject.org ...
scheduling onionoo-backend-02.torproject.org instance removal on host fsn-node-01.torproject.org
scheduling gnt-instance remove --force onionoo-backend-02.torproject.org to run on fsn-node-01.torproject.org in 7 days
warning: commands will be executed using /bin/sh
job 41 at Thu Nov 14 20:49:00 2024
scheduling onionoo-backend-02.torproject.org backup disks removal on host bungei.torproject.org and director bacula-director-01.torproject.org
checking for path "/srv/backups/bacula/onionoo-backend-02.torproject.org/" on bungei.torproject.org
scheduling rm -rf "/srv/backups/bacula/onionoo-backend-02.torproject.org/" to run on bungei.torproject.org in 30 days
warning: commands will be executed using /bin/sh
job 134 at Sat Dec 7 20:49:00 2024
checking for path "/srv/backups/pg/onionoo-backend-02/" on bungei.torproject.org
path /srv/backups/pg/onionoo-backend-02/ not found: [Errno 2] No such file
scheduling echo delete client=onionoo-backend-02.torproject.org-fd yes | bconsole to run on bacula-director-01.torproject.org in 30 days
warning: commands will be executed using /bin/sh
job 74 at Sat Dec 7 20:49:00 2024
Notice: Revoked certificate with serial 97
Notice: Removing file Puppet::SSL::Certificate onionoo-backend-02.torproject.org at '/var/lib/puppet/ssl/ca/signed/onionoo-backend-02.torproject.org.pem'
onionoo-backend-02.torproject.org
Submitted 'deactivate node' for onionoo-backend-02.torproject.org with UUID 700f666c-8c33-443f-ab84-660dafa22b7f
completed tasks, elasped: 0:00:50.267270 (user 4.02 system 0.18 chlduser 0.0 chldsystem 0.0 RSS 70.0 MB)
Notes:
During the run for onionoo-frontend-01, the task scheduling gnt-instance remove failed because of fabric-tasks@bc1021f4. I then tried to run it manually on the Ganeti node, but made a mistake on the command line and prematurely removed the instance.
During the run for onionoo-frontend-02, while I was trying to fix the scheduling issue above, scheduling the removal of the backup files and of the backup client from Bacula failed, so I ran those manually (with oversight).
The runs for the other two nodes worked well.
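For reference, the two `Garbled time` errors in the logs above look like the delay reached at(1) once without a unit (`7`) and once with the unit doubled (`30 days days`). This is only a reading of the log output, not of the actual fabric-tasks code. A minimal sketch of a helper that normalizes the delay before it is handed to `at`, assuming a hypothetical function name `at_timespec` and a default unit of days:

```python
import re

# Units accepted by this sketch; at(1) understands all of them.
_UNITS = ("minute", "hour", "day", "week")


def at_timespec(delay, default_unit="days"):
    """Return an at(1) timespec such as 'now + 7 days'.

    `delay` may be a bare count (7, "7") or a count with one unit
    ("30 days"). Anything else, e.g. "30 days days", raises ValueError
    locally instead of failing later inside at(1).
    """
    text = str(delay).strip().lower()
    match = re.fullmatch(r"(\d+)(?:\s+([a-z]+))?", text)
    if match is None:
        raise ValueError(f"cannot parse delay: {delay!r}")
    count = int(match.group(1))
    unit = match.group(2) or default_unit
    # Accept both singular and plural spellings of the unit.
    if unit.endswith("s"):
        unit = unit[:-1]
    if unit not in _UNITS:
        raise ValueError(f"unsupported unit in delay: {delay!r}")
    if count != 1:
        unit += "s"
    return f"now + {count} {unit}"
```

With this, `at_timespec(7)` and `at_timespec("7 days")` both give `now + 7 days`, while a doubled unit is rejected before any remote command is scheduled.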
zen marked the checklist item retire the host in fabric as completed
zen marked the checklist item remove from LDAP with ldapvi as completed
we had an alert about those retirements in prometheus. it's not your fault: i think we overlooked the new monitoring system in our retirement procedures, as the puppet node retirement doesn't clean up the reports generated by the current exporter. i've documented this in the pager playbook and, for now, patched the retirement procedure to add a silence instead.
@lelutin i wonder if the retirement procedure should instead clean up the old reports, so that we don't get an alert at all... do you know how we could do that? i would also welcome your review of the docs i wrote on this in wiki-replica@2a9d68ce
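as an illustration of the silence workaround mentioned above, here is a rough sketch of a payload for the Alertmanager v2 silences endpoint (`POST /api/v2/silences`). this is not the patch that went into the retirement procedure: the function name, the `instance` label, the 30-day duration and the author string are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def retirement_silence(host, days=30, label="instance",
                       author="retirement", now=None):
    """Build an Alertmanager v2 silence payload for a retired host.

    Assumptions (not taken from the actual procedure): alerts carry
    the host name in `label`, and 30 days is long enough for the old
    reports to stop firing.
    """
    start = now or datetime.now(timezone.utc)
    end = start + timedelta(days=days)
    return {
        "matchers": [
            # Exact (non-regex) match on the host label.
            {"name": label, "value": host,
             "isRegex": False, "isEqual": True},
        ],
        "startsAt": start.isoformat(),
        "endsAt": end.isoformat(),
        "createdBy": author,
        "comment": f"{host} retired",
    }
```

the resulting dictionary would be serialized to JSON and posted to the Alertmanager; the sketch only builds the payload and sends nothing.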
I see the following mentions of onionoo in our repos, which I don't think need changing (rationale: changing them would mean either removing historic info or updating command examples, which would be a difficult practice to maintain):
status-site: historic info
tor-nagios: deprecated repo
wiki: historic info and command output examples
fabric-tasks: command example
tor-puppet: command and output examples
So at this point I don't see anything that needs cleanup. Please let me know if you find something that I overlooked.