Verified Commit 7c8ba0f5 authored by anarcat

move cluster operations to Reference

The rationale is that this part is more rarely used and definitely
more complex than a Howto.
parent e9474980
@@ -11,160 +11,16 @@ another, etc.
<!-- simple, brainless step-by-step instructions requiring little or -->
<!-- no technical background -->
# How-to
## Ganeti Cluster Operations
### Cluster Setup Preliminaries
- To create a new box, follow [[new-machine-hetzner-robot]] but change
the following settings:
* Server: [PX62-NVMe](https://www.hetzner.com/dedicated-rootserver/px62-nvme?country=OTHER)
* Location: `FSN1`
* Operating system: Rescue
* Additional drives: 2x10TB
* Add in the comment form that the server needs to be in the same
datacenter as the other machines (FSN1-DC13, but double-check)
- Make sure all nodes have the same LVM and network setup. The network setup uses openvswitch; see host `fsn-node-01`'s `/etc/network/interfaces` for reference.
- Prepare all the nodes by configuring them in Puppet. They should be in the class `roles::ganeti::fsn` if they
are part of the fsn cluster. If you make a new cluster, make a new role and add the nodes to it.
Note: we considered experimenting with the new AX line
([AX51-NVMe](https://www.hetzner.com/dedicated-rootserver/ax51-nvme?country=OTHER)), but in the past DSA had problems live-migrating on it
(migrations wouldn't immediately fail, but there were "issues"
afterwards). So we might need to
[failover](http://docs.ganeti.org/ganeti/2.15/man/gnt-instance.html#failover) instead of migrating between those parts of the cluster.
There are also doubts that the Linux kernel supports those shiny new
processors at all: similar processors had [trouble booting before
Linux 5.5](https://www.phoronix.com/scan.php?page=news_item&px=Threadripper-3000-MCE-5.5-Fix), for example, so it might be worth waiting a little
before switching to that new platform, even if it's cheaper. See the
cluster configuration section below for a larger discussion of CPU
emulation.
### New cluster
To create the fsn cluster master, we added `fsngnt` to DNS, then ran:
gnt-cluster init \
--master-netdev vlan-gntbe \
--vg-name vg_ganeti \
--secondary-ip 172.30.135.1 \
--enabled-hypervisors kvm \
--nic-parameters link=br0,vlan=4000 \
--mac-prefix 00:66:37 \
--no-ssh-init \
--no-etc-hosts \
fsngnt.torproject.org
### Add a new node
We ran the following on `fsn-node-01` to add the second node:
gnt-node add \
--secondary-ip 172.30.135.2 \
--no-ssh-key-check \
--no-node-setup \
fsn-node-02.torproject.org
### Cluster configuration
These settings could probably be merged into the cluster init, but we document here what has been done:
gnt-cluster modify --reserved-lvs vg_ganeti/root,vg_ganeti/swap
gnt-cluster modify -H kvm:kernel_path=,initrd_path=,
gnt-cluster modify -H kvm:security_model=pool
gnt-cluster modify -H kvm:kvm_extra='-device virtio-rng-pci\,bus=pci.0\,addr=0x1e\,max-bytes=1024\,period=1000'
gnt-cluster modify -H kvm:disk_cache=none
gnt-cluster modify -H kvm:disk_discard=unmap
gnt-cluster modify -H kvm:scsi_controller_type=virtio-scsi-pci
gnt-cluster modify -H kvm:disk_type=scsi-hd
gnt-cluster modify --uid-pool 4000-4019
gnt-cluster modify --nic-parameters mode=openvswitch,link=br0,vlan=4000
gnt-cluster modify -D drbd:c-plan-ahead=0,disk-custom='--c-plan-ahead 0'
gnt-cluster modify -H kvm:migration_bandwidth=950
gnt-cluster modify -H kvm:migration_downtime=500
Note that we might want to tweak the `cpu_type` parameter. By default,
QEMU emulates a generic CPU, which means emulating a lot of features
that could be delegated to the host CPU instead. If we use
`kvm:cpu_type=host`, each node will tailor the emulated CPU to the
node's own processor. But that might make live migration more brittle:
VMs or processes can crash after a live migration because of a
slightly different configuration (microcode, CPU, kernel and QEMU
versions all play a role). So we need to find the lowest common
denominator in CPU families. The list of families supported by QEMU
varies between releases, but is visible with:
# qemu-system-x86_64 -cpu help
Available CPUs:
x86 486
x86 Broadwell Intel Core Processor (Broadwell)
[...]
x86 Skylake-Client Intel Core Processor (Skylake)
x86 Skylake-Client-IBRS Intel Core Processor (Skylake, IBRS)
x86 Skylake-Server Intel Xeon Processor (Skylake)
x86 Skylake-Server-IBRS Intel Xeon Processor (Skylake, IBRS)
[...]
The current PX62 line is based on the [Coffee Lake](https://en.wikipedia.org/wiki/Coffee_Lake) Intel
micro-architecture. The closest matching family would be
`Skylake-Server` or `Skylake-Server-IBRS`, [according to wikichip](https://en.wikichip.org/wiki/intel/microarchitectures/coffee_lake#Compiler_support).
Note that newer QEMU releases (4.2, currently in unstable) have more
supported features.
In that context, of course, supporting different CPU manufacturers
(say AMD vs Intel) is impractical: they will have totally different
families that are not compatible with each other. This will break live
migration, which can trigger crashes and problems in the migrated
virtual machines.
If there are problems live-migrating between machines, it is still
possible to "failover" (`gnt-instance failover` instead of `migrate`),
which shuts down the machine, fails over its disks, and starts it on
the other side. That's not such a big problem: we often need to reboot
the guests when we reboot the hosts anyway. But it does complicate our
work. Of course, it's also possible that live migration works fine if
*no* `cpu_type` at all is specified in the cluster, but that needs to
be verified.
Nodes could also be [grouped](http://docs.ganeti.org/ganeti/2.15/man/gnt-group.html) to limit (automated) live migration to a
subset of nodes.
References:
* <https://dsa.debian.org/howto/install-ganeti/>
* <https://qemu.weilnetz.de/doc/qemu-doc.html#recommendations_005fcpu_005fmodels_005fx86>
### Network configuration
IP allocation is managed by Ganeti through the `gnt-network(8)`
system. Say we have `192.0.2.0/24` reserved for the cluster, with
the host IP `192.0.2.100` and the gateway on `192.0.2.1`. You will
create this network with:
gnt-network add --network 192.0.2.0/24 --gateway 192.0.2.1 --network6 2001:db8::/32 --gateway6 fe80::1 example-network
Then we associate the new network to the default node group:
gnt-network connect --nic-parameters=link=br0,vlan=4000,mode=openvswitch example-network default
The arguments to `--nic-parameters` come from the values configured in
the cluster, above. The current values can be found with `gnt-cluster
info`.
TODO: create a private network.
## Listing instances
### Listing instances and nodes
gnt-instance list
gnt-node list
watch -n5 -d 'gnt-instance list -o pnode,name,be/vcpus,be/memory,status,disk_template | sort; echo; gnt-node list'
## Instance Operations
# How-to
### Adding a new instance
## Adding a new instance
This command creates a new guest, or "instance" in Ganeti's
vocabulary:
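The full command is elided in this diff view. As a rough sketch (the
OS variant, sizes and backend parameters here are hypothetical, not
necessarily the values we actually use), such an invocation could look
like:

    # hypothetical values: adjust OS variant, disk/memory sizes and network
    gnt-instance add -t drbd --no-wait-for-sync \
      -o debootstrap+default \
      -s 10G -B memory=2048,vcpus=2 \
      --net 0:ip=pool,network=example-network \
      test01.torproject.org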
@@ -227,7 +83,7 @@ Also set reverse DNS for both IPv4 and IPv6 in [hetzner's robot](https://robot.y
Then follow [[new-machine]].
### Adding and removing addresses on instances
## Adding and removing addresses on instances
Say you created an instance but forgot to assign a private IP. You can
still do so with:
@@ -236,14 +92,14 @@ still do so with:
TODO: the internal network hasn't been created yet.
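The exact command is also elided in this diff view; a hypothetical
sketch, assuming the private network ends up being named
`internal-network`, would be:

    # assumes a network named internal-network exists and has a pool
    gnt-instance modify --net add:ip=pool,network=internal-network test01.torproject.org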
### Destroying an instance
## Destroying an instance
This totally deletes the instance, including all mirrors and
everything; be very careful with it:
gnt-instance remove test01.torproject.org
### Accessing serial console
## Accessing serial console
Our instances do serial console, starting in grub. To access it, run
@@ -251,14 +107,14 @@ Our instances do serial console, starting in grub. To access it, run
To exit, use `^]` -- that is, Control-<Closing Bracket>.
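The console command itself is elided above, but it is presumably the
standard Ganeti one:

    # opens the instance's serial console; exit with ^]
    gnt-instance console test01.torproject.org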
### Disk operations (DRBD)
## Disk operations (DRBD)
Instances should be set up using the DRBD backend, in which case you
should probably take a look at [[drbd]] if you have problems with
that. Ganeti handles most of the logic there, so that should generally
not be necessary.
### Rebooting
## Rebooting
Those hosts need special care, as we can accomplish zero-downtime
reboots on them. There's a script (`ganeti-reboot-cluster`)
@@ -274,7 +130,7 @@ machine, and the cluster might need to be rebalanced. This is
automatically scheduled by the `ganeti-reboot-cluster` script and will
be done within 30 minutes of the reboot.
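To manually empty a node before rebooting it, its primary instances
can be live-migrated to their secondaries. The `ganeti-reboot-cluster`
script automates this; as a sketch, the underlying operation is:

    # live-migrate all primary instances off fsn-node-01
    gnt-node migrate -f fsn-node-01.torproject.org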
### Rebalancing a cluster
## Rebalancing a cluster
After a reboot or a downtime, all instances might end up on the same
machine. This is normally handled by the reboot script, but it might
@@ -324,7 +180,147 @@ cluster. Here's an example run on a small cluster:
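The example run is truncated in this view. Rebalancing is typically
computed and executed with `hbal` from ganeti-htools (an assumption
here, since the exact invocation is not shown); a run against the
local cluster that also executes the resulting moves would be:

    # -L: talk to the local master daemon; -X: execute the moves
    hbal -L -X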
# Reference
## Installation
<!-- how to setup the service from scratch -->
### Cluster Setup Preliminaries
- To create a new box, follow [[new-machine-hetzner-robot]] but change
the following settings:
* Server: [PX62-NVMe](https://www.hetzner.com/dedicated-rootserver/px62-nvme?country=OTHER)
* Location: `FSN1`
* Operating system: Rescue
* Additional drives: 2x10TB
* Add in the comment form that the server needs to be in the same
datacenter as the other machines (FSN1-DC13, but double-check)
- Make sure all nodes have the same LVM and network setup. The network setup uses openvswitch; see host `fsn-node-01`'s `/etc/network/interfaces` for reference.
- Prepare all the nodes by configuring them in Puppet. They should be in the class `roles::ganeti::fsn` if they
are part of the fsn cluster. If you make a new cluster, make a new role and add the nodes to it.
Note: we considered experimenting with the new AX line
([AX51-NVMe](https://www.hetzner.com/dedicated-rootserver/ax51-nvme?country=OTHER)), but in the past DSA had problems live-migrating on it
(migrations wouldn't immediately fail, but there were "issues"
afterwards). So we might need to
[failover](http://docs.ganeti.org/ganeti/2.15/man/gnt-instance.html#failover) instead of migrating between those parts of the cluster.
There are also doubts that the Linux kernel supports those shiny new
processors at all: similar processors had [trouble booting before
Linux 5.5](https://www.phoronix.com/scan.php?page=news_item&px=Threadripper-3000-MCE-5.5-Fix), for example, so it might be worth waiting a little
before switching to that new platform, even if it's cheaper. See the
cluster configuration section below for a larger discussion of CPU
emulation.
### New cluster
To create the fsn cluster master, we added `fsngnt` to DNS, then ran:
gnt-cluster init \
--master-netdev vlan-gntbe \
--vg-name vg_ganeti \
--secondary-ip 172.30.135.1 \
--enabled-hypervisors kvm \
--nic-parameters link=br0,vlan=4000 \
--mac-prefix 00:66:37 \
--no-ssh-init \
--no-etc-hosts \
fsngnt.torproject.org
### Add a new node
We ran the following on `fsn-node-01` to add the second node:
gnt-node add \
--secondary-ip 172.30.135.2 \
--no-ssh-key-check \
--no-node-setup \
fsn-node-02.torproject.org
### Cluster configuration
These settings could probably be merged into the cluster init, but we document here what has been done:
gnt-cluster modify --reserved-lvs vg_ganeti/root,vg_ganeti/swap
gnt-cluster modify -H kvm:kernel_path=,initrd_path=,
gnt-cluster modify -H kvm:security_model=pool
gnt-cluster modify -H kvm:kvm_extra='-device virtio-rng-pci\,bus=pci.0\,addr=0x1e\,max-bytes=1024\,period=1000'
gnt-cluster modify -H kvm:disk_cache=none
gnt-cluster modify -H kvm:disk_discard=unmap
gnt-cluster modify -H kvm:scsi_controller_type=virtio-scsi-pci
gnt-cluster modify -H kvm:disk_type=scsi-hd
gnt-cluster modify --uid-pool 4000-4019
gnt-cluster modify --nic-parameters mode=openvswitch,link=br0,vlan=4000
gnt-cluster modify -D drbd:c-plan-ahead=0,disk-custom='--c-plan-ahead 0'
gnt-cluster modify -H kvm:migration_bandwidth=950
gnt-cluster modify -H kvm:migration_downtime=500
Note that we might want to tweak the `cpu_type` parameter. By default,
QEMU emulates a generic CPU, which means emulating a lot of features
that could be delegated to the host CPU instead. If we use
`kvm:cpu_type=host`, each node will tailor the emulated CPU to the
node's own processor. But that might make live migration more brittle:
VMs or processes can crash after a live migration because of a
slightly different configuration (microcode, CPU, kernel and QEMU
versions all play a role). So we need to find the lowest common
denominator in CPU families. The list of families supported by QEMU
varies between releases, but is visible with:
# qemu-system-x86_64 -cpu help
Available CPUs:
x86 486
x86 Broadwell Intel Core Processor (Broadwell)
[...]
x86 Skylake-Client Intel Core Processor (Skylake)
x86 Skylake-Client-IBRS Intel Core Processor (Skylake, IBRS)
x86 Skylake-Server Intel Xeon Processor (Skylake)
x86 Skylake-Server-IBRS Intel Xeon Processor (Skylake, IBRS)
[...]
The current PX62 line is based on the [Coffee Lake](https://en.wikipedia.org/wiki/Coffee_Lake) Intel
micro-architecture. The closest matching family would be
`Skylake-Server` or `Skylake-Server-IBRS`, [according to wikichip](https://en.wikichip.org/wiki/intel/microarchitectures/coffee_lake#Compiler_support).
Note that newer QEMU releases (4.2, currently in unstable) have more
supported features.
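If we do settle on a common family, the hypervisor parameter would be
set cluster-wide, following the same pattern as the other `-H kvm:`
settings above; a sketch, with a family name that still needs to be
validated against our hardware:

    # hypothetical family; validate against all nodes before rolling out
    gnt-cluster modify -H kvm:cpu_type=Skylake-Server-IBRS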
In that context, of course, supporting different CPU manufacturers
(say AMD vs Intel) is impractical: they will have totally different
families that are not compatible with each other. This will break live
migration, which can trigger crashes and problems in the migrated
virtual machines.
If there are problems live-migrating between machines, it is still
possible to "failover" (`gnt-instance failover` instead of `migrate`),
which shuts down the machine, fails over its disks, and starts it on
the other side. That's not such a big problem: we often need to reboot
the guests when we reboot the hosts anyway. But it does complicate our
work. Of course, it's also possible that live migration works fine if
*no* `cpu_type` at all is specified in the cluster, but that needs to
be verified.
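For example, to fail over a single instance instead of live-migrating
it:

    # shuts down the instance, fails over disks, restarts on the secondary
    gnt-instance failover test01.torproject.org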
Nodes could also be [grouped](http://docs.ganeti.org/ganeti/2.15/man/gnt-group.html) to limit (automated) live migration to a
subset of nodes.
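A sketch of such a split, with hypothetical group and node names:

    # hypothetical group name and node assignment
    gnt-group add skylake-nodes
    gnt-group assign-nodes skylake-nodes fsn-node-03.torproject.org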
References:
* <https://dsa.debian.org/howto/install-ganeti/>
* <https://qemu.weilnetz.de/doc/qemu-doc.html#recommendations_005fcpu_005fmodels_005fx86>
### Network configuration
IP allocation is managed by Ganeti through the `gnt-network(8)`
system. Say we have `192.0.2.0/24` reserved for the cluster, with
the host IP `192.0.2.100` and the gateway on `192.0.2.1`. You will
create this network with:
gnt-network add --network 192.0.2.0/24 --gateway 192.0.2.1 --network6 2001:db8::/32 --gateway6 fe80::1 example-network
Then we associate the new network to the default node group:
gnt-network connect --nic-parameters=link=br0,vlan=4000,mode=openvswitch example-network default
The arguments to `--nic-parameters` come from the values configured in
the cluster, above. The current values can be found with `gnt-cluster
info`.
TODO: create a private network.
## SLA
<!-- this describes an acceptable level of service for this service -->