remove duplicate KVM and Ganeti reboot documentation
authored Jan 13, 2021 by anarcat
howto/upgrades.md
View page @ 6f49bd94
@@ -147,49 +147,28 @@ defined to `justdoit` or `rotation`:
echo "rebooting 'rotation' hosts with a 30-minute delay...."
./reboot -H $(ssh alberti.torproject.org 'ldapsearch -h db.torproject.org -x -ZZ -b ou=hosts,dc=torproject,dc=org -LLL "(rebootPolicy=rotation)" hostname | awk "\$1 == \"hostname:\" {print \$2}" | sort -R') --delay-hosts=1800 --delay-shutdown=10 -v
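The `awk` filter in the pipeline above keeps only the value of each `hostname:` line from the LDIF output of `ldapsearch`. A minimal sketch of that step in isolation follows; the function name `ldif_hostnames` and the sample records are made up for illustration and are not part of the wiki or of `tsa-misc`:

```shell
# Hypothetical helper: print the value of every "hostname:" attribute
# found in LDIF text read from standard input.
ldif_hostnames() {
    awk '$1 == "hostname:" {print $2}'
}

# Made-up sample records, not real directory entries.
printf '%s\n' \
    'dn: host=alpha,ou=hosts,dc=example,dc=org' \
    'hostname: alpha.example.org' \
    '' \
    'dn: host=beta,ou=hosts,dc=example,dc=org' \
    'hostname: beta.example.org' \
    | ldif_hostnames
```

In the real command the `$` signs and inner double quotes are backslash-escaped because the `awk` program sits inside double quotes in the command line that `ssh` hands to the remote shell.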
### Rebooting KVM hosts
What remains is the "manual" procedure for the KVM hosts:
./reboot-host moly.torproject.org
### Rebooting Ganeti nodes
The ganeti hosts, using Fabric:
./reboot -v --delay-shutdown 1 --delay-hosts 30 -H fsn-node-0{1,2,3,4,5}.torproject.org
The scaleway box needs special handholding, see [ticket 32920](https://bugs.torproject.org/32920). The
windows boxes should normally not need a reboot.
All hosts should be rebooted now, see [Nagios unhandled problems](https://nagios.torproject.org/cgi-bin/icinga/status.cgi?allunhandledproblems) to confirm.
#### Rebooting KVM hosts
This is also documented in the [howto/ganeti](howto/ganeti) section. Do not
forget to rebalance the cluster after the reboot.
Generally, KVM hosts are the latter case and need special attention,
as the guests need to be individually rebooted. The `tor-libvirt-reboot`
command takes care of the hand-holding necessary
here. When the server returns, the encrypted partitions need to be
unlocked as well, with the `tor-libvirt-luks-start` command. A full
reboot procedure will look something like this:
### Remaining nodes
HOST=unifolium.torproject.org
echo "showing motd to see affected guests" &&
ssh $HOST cat /etc/motd &&
ssh -tt root@$HOST tor-libvirt-reboot ; \
echo "waiting 30 seconds for host to go down..." &&
sleep 30 &&
echo "waiting up to 2 minutes for $HOST to come back" &&
ping -c 10 -w 120 $HOST ; \
ssh -tt root@$HOST tor-libvirt-luks-start
(Update: the above script is now in `tsa-misc/reboot-host`.)
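The procedure above sleeps for a fixed time and then relies on a single `ping` run to wait for the host. As an illustration only, the waiting step could also be written as a bounded retry loop. This is a sketch under assumptions, not the contents of `tsa-misc/reboot-host`; the function name `wait_for` and its arguments are hypothetical:

```shell
# Hypothetical helper: retry a probe command until it succeeds or the
# attempt budget is exhausted.
# Usage: wait_for TRIES DELAY_SECONDS COMMAND [ARGS...]
wait_for() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        # Run the probe; stop as soon as it succeeds.
        if "$@"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done
    # The probe never succeeded within the budget.
    return 1
}
```

For example, `wait_for 24 5 ping -c 1 -W 1 "$HOST"` would probe the host up to 24 times, five seconds apart, and return non-zero if it never answers, which lets the caller skip the unlock step instead of running it against a host that is still down.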
If only the guests on the machine need a reboot, for example Nagios
complains about `libvirt-qemu` processes, use the
`tor-libvirt-stop-start` script.
#### Rebooting Ganeti clusters
The scaleway box needs special handholding, see [ticket 32920](https://bugs.torproject.org/32920). The
windows boxes should normally not need a reboot.
This is documented in the [howto/ganeti](howto/ganeti) section, but it's basically
running the above `reboot` script and `hbal` commands.
When all hosts are rebooted, see [Nagios unhandled problems](https://nagios.torproject.org/cgi-bin/icinga/status.cgi?allunhandledproblems) to confirm.
#### Generic upgrade routines