remove duplicate KVM and Ganeti reboot documentation (authored by anarcat)
@@ -147,49 +147,28 @@ defined to `justdoit` or `rotation`:
echo "rebooting 'rotation' hosts with a 30-minute delay...."
./reboot -H $(ssh alberti.torproject.org 'ldapsearch -h db.torproject.org -x -ZZ -b ou=hosts,dc=torproject,dc=org -LLL "(rebootPolicy=rotation)" hostname | awk "\$1 == \"hostname:\" {print \$2}" | sort -R') --delay-hosts=1800 --delay-shutdown=10 -v
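The `awk` filter in the command above keeps only the `hostname:` attribute values from the LDIF text that `ldapsearch -LLL` prints. As a minimal sketch of that filtering step alone, using made-up sample data instead of a real LDAP query (the `extract_hostnames` and `sample_ldif` names and the `example.org` hosts are hypothetical, not part of the actual tooling):

```shell
#!/bin/sh
# Hypothetical helper: print the value of every "hostname:" attribute
# found in LDIF text read from standard input.
extract_hostnames() {
    awk '$1 == "hostname:" {print $2}'
}

# Made-up sample LDIF, shaped like `ldapsearch -LLL ... hostname` output.
sample_ldif() {
    printf '%s\n' \
        'dn: host=alpha,ou=hosts,dc=example,dc=org' \
        'hostname: alpha.example.org' \
        '' \
        'dn: host=beta,ou=hosts,dc=example,dc=org' \
        'hostname: beta.example.org'
}

# Prints one hostname per line; the dn: lines and blank lines are dropped.
sample_ldif | extract_hostnames
```

In the real command the `awk` program sits inside the single-quoted `ssh` argument, which is why its `$1` and `$2` are escaped there; the trailing `sort -R` then shuffles the host order.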
### Rebooting KVM hosts
What remains is the "manual" procedure for the KVM hosts:
./reboot-host moly.torproject.org
### Rebooting Ganeti nodes
The Ganeti hosts are rebooted using Fabric:
./reboot -v --delay-shutdown 1 --delay-hosts 30 -H fsn-node-0{1,2,3,4,5}.torproject.org
The Scaleway box needs special handholding, see [ticket 32920](https://bugs.torproject.org/32920). The
Windows boxes should normally not need a reboot.
All hosts should be rebooted now, see [Nagios unhandled problems](https://nagios.torproject.org/cgi-bin/icinga/status.cgi?allunhandledproblems)
to confirm.
#### Rebooting KVM hosts
This is also documented in the [howto/ganeti](howto/ganeti) section. Do not
forget to rebalance the cluster after the reboot.
Generally, KVM hosts are the latter case and need special attention,
as the guests need to be individually rebooted. The
`tor-libvirt-reboot` script takes care of the hand-holding necessary
here. When the server returns, the encrypted partitions need to be
unlocked as well, with the `tor-libvirt-luks-start` command. A full
reboot procedure will look something like this:
### Remaining nodes
HOST=unifolium.torproject.org
echo "showing motd to see affected guests" &&
ssh $HOST cat /etc/motd &&
ssh -tt root@$HOST tor-libvirt-reboot ; \
echo "waiting 30 seconds for host to go down..." &&
sleep 30 &&
echo "waiting up to 2 minutes for $HOST to come back" &&
ping -c 10 -w 120 $HOST ; \
ssh -tt root@$HOST tor-libvirt-luks-start
(Update: the above script is now in `tsa-misc/reboot-host`.)
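The procedure above waits a fixed 30 seconds, pings the host, and then runs `tor-libvirt-luks-start` whether or not the ping succeeded. One possible variation, shown here only as a sketch and not as what `tsa-misc/reboot-host` actually does, is to poll until the host answers and continue only then. The `wait_for` helper is a hypothetical name introduced for illustration:

```shell
#!/bin/sh
# Hypothetical helper: run a probe command until it succeeds or the
# allowed number of attempts is used up.
# usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_for() {
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # The probe succeeded: stop waiting and report success.
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    # Every attempt failed.
    return 1
}

# Example use (assumption: roughly 2 minutes of polling, as in the text):
#   wait_for 24 5 ping -c 1 -w 5 "$HOST" &&
#       ssh -tt root@"$HOST" tor-libvirt-luks-start
```

Taking the probe as an argument keeps the loop independent of `ping`, so the same helper could wait on an SSH connection test instead.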
If only the guests on the machine need a reboot, for example because Nagios
complains about `libvirt-qemu` processes, use the
`tor-libvirt-stop-start` script.
#### Rebooting Ganeti clusters
The Scaleway box needs special handholding, see [ticket 32920](https://bugs.torproject.org/32920). The
Windows boxes should normally not need a reboot.
This is documented in the [howto/ganeti](howto/ganeti) section, but it's basically
running the above `reboot` script and `hbal` commands.
When all hosts are rebooted, see [Nagios unhandled problems](https://nagios.torproject.org/cgi-bin/icinga/status.cgi?allunhandledproblems) to
confirm.
#### Generic upgrade routines