remove duplicate KVM and Ganeti reboot documentation, authored by anarcat
@@ -147,49 +147,28 @@ defined to `justdoit` or `rotation`:
 echo "rebooting 'rotation' hosts with a 30-minute delay...."
 ./reboot -H $(ssh alberti.torproject.org 'ldapsearch -h db.torproject.org -x -ZZ -b ou=hosts,dc=torproject,dc=org -LLL "(rebootPolicy=rotation)" hostname | awk "\$1 == \"hostname:\" {print \$2}" | sort -R') --delay-hosts=1800 --delay-shutdown=10 -v
+### Rebooting KVM hosts
 The remaining is the "manual" procedure, the KVM hosts:
 ./reboot-host moly.torproject.org
+### Rebooting Ganeti nodes
 The ganeti hosts, using Fabric:
 ./reboot -v --delay-shutdown 1 --delay-hosts 30 -H fsn-node-0{1,2,3,4,5}.torproject.org
-The scaleway box needs special handholding, see [ticket 32920](https://bugs.torproject.org/32920). The
-windows boxes should normally not need a reboot.
+This is also documented in the [howto/ganeti](howto/ganeti) section. Do not
+forget to rebalance the cluster after the reboot.
-All hosts should be rebooted now, see [Nagios unhandled problems](https://nagios.torproject.org/cgi-bin/icinga/status.cgi?allunhandledproblems)
-to confirm.
-#### Rebooting KVM hosts
-Generally, KVM hosts are the latter case and need special attention,
-as the guests need to be individually rebooted. The
-`tor-libvirt-reboot` takes care of the hand-holding necessary
-here. When the server returns, the encrypted partitions need to be
-unlocked as well, with the `tor-libvirt-luks-start` command. A full
-reboot procedure will look something like this:
+### Remaining nodes
-HOST=unifolium.torproject.org
-echo "showing motd to see affected guests" &&
-ssh $HOST cat /etc/motd &&
-ssh -tt root@$HOST tor-libvirt-reboot ; \
-echo "waiting 30 seconds for host to go down..." &&
-sleep 30 &&
-echo "waiting up to 2 minutes for $HOST to come back" &&
-ping -c 10 -w 120 $HOST ; \
-ssh -tt root@$HOST tor-libvirt-luks-start
+The scaleway box needs special handholding, see [ticket 32920](https://bugs.torproject.org/32920). The
+windows boxes should normally not need a reboot.
-(Update: the above script is now in `tsa-misc/reboot-host`.)
-If only the guests on the machine need a reboot, for example Nagios
-complains about `libvirt-qemu` processes, use the
-`tor-libvirt-stop-start` script.
-#### Rebooting Ganeti clusters
-This is documented in the [howto/ganeti](howto/ganeti) section, but it's basically
-running the above `reboot` sript and `hbal` commands.
+When all hosts are rebooted, see [Nagios unhandled problems](https://nagios.torproject.org/cgi-bin/icinga/status.cgi?allunhandledproblems) to
+confirm.
 #### Generic upgrade routines