Verified Commit 7c8ba0f5 authored by anarcat

move cluster operations to Reference

The rationale is that this part is more rarely used and definitely
more complex than a Howto.
parent e9474980
@@ -11,160 +11,16 @@ another, etc.
<!-- simple, brainless step-by-step instructions requiring little or -->
<!-- no technical background -->
# How-to
## Ganeti Cluster Operations
### Cluster Setup Preliminaries
- To create a new box, follow [[new-machine-hetzner-robot]] but change
the following settings:
* Server: [PX62-NVMe](https://www.hetzner.com/dedicated-rootserver/px62-nvme?country=OTHER)
* Location: `FSN1`
* Operating system: Rescue
* Additional drives: 2x10TB
* Add in the comment form that the server needs to be in the same
datacenter as the other machines (FSN1-DC13, but double-check)
- Make sure all nodes have the same LVM and network setup. The network setup uses openvswitch; see host `fsn-node-01`'s `/etc/network/interfaces` for reference.
- Prepare all the nodes by configuring them in Puppet. They should be in the class `roles::ganeti::fsn` if they
are part of the fsn cluster. If you make a new cluster, make a new role and add the nodes to it.
Note: we considered experimenting with the new AX line
([AX51-NVMe](https://www.hetzner.com/dedicated-rootserver/ax51-nvme?country=OTHER)), but in the past DSA had problems live-migrating on it
(migrations wouldn't immediately fail, but there were "issues"
afterwards). So we might need to
[failover](http://docs.ganeti.org/ganeti/2.15/man/gnt-instance.html#failover) instead of migrating between those parts of the cluster.
There are also doubts that the Linux kernel supports those shiny new
processors at all: similar processors had [trouble booting before
Linux 5.5](https://www.phoronix.com/scan.php?page=news_item&px=Threadripper-3000-MCE-5.5-Fix), for example, so it might be worth waiting a little
before switching to that new platform, even if it's cheaper. See the
cluster configuration section below for a larger discussion of CPU
emulation.
### New cluster
To create the fsn cluster master, we added `fsngnt` to DNS, then ran:
gnt-cluster init \
--master-netdev vlan-gntbe \
--vg-name vg_ganeti \
--secondary-ip 172.30.135.1 \
--enabled-hypervisors kvm \
--nic-parameters link=br0,vlan=4000 \
--mac-prefix 00:66:37 \
--no-ssh-init \
--no-etc-hosts \
fsngnt.torproject.org
### Add a new node
We ran the following on `fsn-node-01` to add the second node:
gnt-node add \
--secondary-ip 172.30.135.2 \
--no-ssh-key-check \
--no-node-setup \
fsn-node-02.torproject.org
### Cluster configuration
These settings could probably be merged into the cluster init, but we document here what has been done:
gnt-cluster modify --reserved-lvs vg_ganeti/root,vg_ganeti/swap
gnt-cluster modify -H kvm:kernel_path=,initrd_path=,
gnt-cluster modify -H kvm:security_model=pool
gnt-cluster modify -H kvm:kvm_extra='-device virtio-rng-pci\,bus=pci.0\,addr=0x1e\,max-bytes=1024\,period=1000'
gnt-cluster modify -H kvm:disk_cache=none
gnt-cluster modify -H kvm:disk_discard=unmap
gnt-cluster modify -H kvm:scsi_controller_type=virtio-scsi-pci
gnt-cluster modify -H kvm:disk_type=scsi-hd
gnt-cluster modify --uid-pool 4000-4019
gnt-cluster modify --nic-parameters mode=openvswitch,link=br0,vlan=4000
gnt-cluster modify -D drbd:c-plan-ahead=0,disk-custom='--c-plan-ahead 0'
gnt-cluster modify -H kvm:migration_bandwidth=950
gnt-cluster modify -H kvm:migration_downtime=500
Note that we might want to tweak the `cpu_type` parameter. By default,
QEMU emulates a generic CPU, which means emulating a lot of features
that could be delegated to the host CPU instead. If we use
`kvm:cpu_type=host`, each node will tailor the emulated CPU to the
node's own processor. But that might make live migration more brittle:
VMs or processes can crash after a live migration because of a
slightly different configuration (microcode, CPU, kernel and QEMU
versions all play a role). So we need to find the lowest common
denominator in CPU families. The list of families supported by QEMU
varies between releases, but is visible with:
# qemu-system-x86_64 -cpu help
Available CPUs:
x86 486
x86 Broadwell Intel Core Processor (Broadwell)
[...]
x86 Skylake-Client Intel Core Processor (Skylake)
x86 Skylake-Client-IBRS Intel Core Processor (Skylake, IBRS)
x86 Skylake-Server Intel Xeon Processor (Skylake)
x86 Skylake-Server-IBRS Intel Xeon Processor (Skylake, IBRS)
[...]
The current PX62 line is based on the [Coffee Lake](https://en.wikipedia.org/wiki/Coffee_Lake) Intel
micro-architecture. The closest matching family would be
`Skylake-Server` or `Skylake-Server-IBRS`, [according to wikichip](https://en.wikichip.org/wiki/intel/microarchitectures/coffee_lake#Compiler_support).
Note that newer QEMU releases (4.2, currently in unstable) have more
supported features.
In that context, of course, supporting different CPU manufacturers
(say AMD vs Intel) is impractical: they will have totally different
families that are not compatible with each other. This will break live
migration, which can trigger crashes and problems in the migrated
virtual machines.
If there are problems live-migrating between machines, it is still
possible to "failover" (`gnt-instance failover` instead of `migrate`),
which shuts down the machine, fails over its disks, and starts it on
the other side. That's not such a big problem: we often need to reboot
the guests when we reboot the hosts anyway. But it does complicate our
work. Of course, it's also possible that live migration works fine if
*no* `cpu_type` at all is specified in the cluster, but that needs to
be verified.
Nodes could also be [grouped](http://docs.ganeti.org/ganeti/2.15/man/gnt-group.html) to limit (automated) live migration to a
subset of nodes.
References:
* <https://dsa.debian.org/howto/install-ganeti/>
* <https://qemu.weilnetz.de/doc/qemu-doc.html#recommendations_005fcpu_005fmodels_005fx86>
### Network configuration
IP allocation is managed by Ganeti through the `gnt-network(8)`
system. Say we have `192.0.2.0/24` reserved for the cluster, with
the host IP `192.0.2.100` and the gateway on `192.0.2.1`. You will
create this network with:
gnt-network add --network 192.0.2.0/24 --gateway 192.0.2.1 --network6 2001:db8::/32 --gateway6 fe80::1 example-network
Then we associate the new network to the default node group:
gnt-network connect --nic-parameters=link=br0,vlan=4000,mode=openvswitch example-network default
The arguments to `--nic-parameters` come from the values configured in
the cluster, above. The current values can be found with `gnt-cluster
info`.
TODO: create a private network.
## Listing instances
### Listing instances and nodes
gnt-instance list
gnt-node list
watch -n5 -d 'gnt-instance list -o pnode,name,be/vcpus,be/memory,status,disk_template | sort; echo; gnt-node list'
## Instance Operations
# How-to
### Adding a new instance
## Adding a new instance
This command creates a new guest, or "instance" in Ganeti's
vocabulary:
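The full command is elided in this diff view. As a rough sketch (the
OS variant, sizes and backend parameters here are hypothetical, not
necessarily the values we actually use), such an invocation could look
like:

    # hypothetical values: adjust OS variant, disk/memory sizes and network
    gnt-instance add -t drbd --no-wait-for-sync \
      -o debootstrap+default \
      -s 10G -B memory=2048,vcpus=2 \
      --net 0:ip=pool,network=example-network \
      test01.torproject.org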
@@ -227,7 +83,7 @@ Also set reverse DNS for both IPv4 and IPv6 in [hetzner's robot](https://robot.y
Then follow [[new-machine]].
### Adding and removing addresses on instances
## Adding and removing addresses on instances
Say you created an instance but forgot to assign a private IP. You can
still do so with:
@@ -236,14 +92,14 @@ still do so with:
TODO: the internal network hasn't been created yet.
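The exact command is also elided in this diff view; a hypothetical
sketch, assuming the private network ends up being named
`internal-network`, would be:

    # assumes a network named internal-network exists and has a pool
    gnt-instance modify --net add:ip=pool,network=internal-network test01.torproject.org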
### Destroying an instance
## Destroying an instance
This totally deletes the instance, including all mirrors and
everything; be very careful with it:
gnt-instance remove test01.torproject.org
### Accessing serial console
## Accessing serial console
Our instances do serial console, starting in grub. To access it, run
@@ -251,14 +107,14 @@ Our instances do serial console, starting in grub. To access it, run
To exit, use `^]` -- that is, Control-<Closing Bracket>.
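The console command itself is elided above, but it is presumably the
standard Ganeti one:

    # opens the instance's serial console; exit with ^]
    gnt-instance console test01.torproject.org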
### Disk operations (DRBD)
## Disk operations (DRBD)
Instances should be set up using the DRBD backend, in which case you
should probably take a look at [[drbd]] if you have problems with
that. Ganeti handles most of the logic there, so that should generally
not be necessary.
### Rebooting
## Rebooting
Those hosts need special care, as we can accomplish zero-downtime
reboots on them. There's a script (`ganeti-reboot-cluster`)
@@ -274,7 +130,7 @@ machine, and the cluster might need to be rebalanced. This is
automatically scheduled by the `ganeti-reboot-cluster` script and will
be done within 30 minutes of the reboot.
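To manually empty a node before rebooting it, its primary instances
can be live-migrated to their secondaries. The `ganeti-reboot-cluster`
script automates this; as a sketch, the underlying operation is:

    # live-migrate all primary instances off fsn-node-01
    gnt-node migrate -f fsn-node-01.torproject.org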
### Rebalancing a cluster
## Rebalancing a cluster
After a reboot or a downtime, all instances might end up on the same
machine. This is normally handled by the reboot script, but it might
@@ -324,7 +180,147 @@ cluster. Here's an example run on a small cluster:
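The example run is truncated in this view. Rebalancing is typically
computed and executed with `hbal` from ganeti-htools (an assumption
here, since the exact invocation is not shown); a run against the
local cluster that also executes the resulting moves would be:

    # -L: talk to the local master daemon; -X: execute the moves
    hbal -L -X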
# Reference
## Installation
<!-- how to setup the service from scratch -->
### Cluster Setup Preliminaries
- To create a new box, follow [[new-machine-hetzner-robot]] but change
the following settings:
* Server: [PX62-NVMe](https://www.hetzner.com/dedicated-rootserver/px62-nvme?country=OTHER)
* Location: `FSN1`
* Operating system: Rescue
* Additional drives: 2x10TB
* Add in the comment form that the server needs to be in the same
datacenter as the other machines (FSN1-DC13, but double-check)
- Make sure all nodes have the same LVM and network setup. The network setup uses openvswitch; see host `fsn-node-01`'s `/etc/network/interfaces` for reference.
- Prepare all the nodes by configuring them in Puppet. They should be in the class `roles::ganeti::fsn` if they
are part of the fsn cluster. If you make a new cluster, make a new role and add the nodes to it.
Note: we considered experimenting with the new AX line
([AX51-NVMe](https://www.hetzner.com/dedicated-rootserver/ax51-nvme?country=OTHER)), but in the past DSA had problems live-migrating on it
(migrations wouldn't immediately fail, but there were "issues"
afterwards). So we might need to
[failover](http://docs.ganeti.org/ganeti/2.15/man/gnt-instance.html#failover) instead of migrating between those parts of the cluster.
There are also doubts that the Linux kernel supports those shiny new
processors at all: similar processors had [trouble booting before
Linux 5.5](https://www.phoronix.com/scan.php?page=news_item&px=Threadripper-3000-MCE-5.5-Fix), for example, so it might be worth waiting a little
before switching to that new platform, even if it's cheaper. See the
cluster configuration section below for a larger discussion of CPU
emulation.
### New cluster
To create the fsn cluster master, we added `fsngnt` to DNS, then ran:
gnt-cluster init \
--master-netdev vlan-gntbe \
--vg-name vg_ganeti \
--secondary-ip 172.30.135.1 \
--enabled-hypervisors kvm \
--nic-parameters link=br0,vlan=4000 \
--mac-prefix 00:66:37 \
--no-ssh-init \
--no-etc-hosts \
fsngnt.torproject.org
### Add a new node
We ran the following on `fsn-node-01` to add the second node:
gnt-node add \
--secondary-ip 172.30.135.2 \
--no-ssh-key-check \
--no-node-setup \
fsn-node-02.torproject.org
### Cluster configuration
These settings could probably be merged into the cluster init, but we document here what has been done:
gnt-cluster modify --reserved-lvs vg_ganeti/root,vg_ganeti/swap
gnt-cluster modify -H kvm:kernel_path=,initrd_path=,
gnt-cluster modify -H kvm:security_model=pool
gnt-cluster modify -H kvm:kvm_extra='-device virtio-rng-pci\,bus=pci.0\,addr=0x1e\,max-bytes=1024\,period=1000'
gnt-cluster modify -H kvm:disk_cache=none
gnt-cluster modify -H kvm:disk_discard=unmap
gnt-cluster modify -H kvm:scsi_controller_type=virtio-scsi-pci
gnt-cluster modify -H kvm:disk_type=scsi-hd
gnt-cluster modify --uid-pool 4000-4019
gnt-cluster modify --nic-parameters mode=openvswitch,link=br0,vlan=4000
gnt-cluster modify -D drbd:c-plan-ahead=0,disk-custom='--c-plan-ahead 0'
gnt-cluster modify -H kvm:migration_bandwidth=950
gnt-cluster modify -H kvm:migration_downtime=500
Note that we might want to tweak the `cpu_type` parameter. By default,
QEMU emulates a generic CPU, which means emulating a lot of features
that could be delegated to the host CPU instead. If we use
`kvm:cpu_type=host`, each node will tailor the emulated CPU to the
node's own processor. But that might make live migration more brittle:
VMs or processes can crash after a live migration because of a
slightly different configuration (microcode, CPU, kernel and QEMU
versions all play a role). So we need to find the lowest common
denominator in CPU families. The list of families supported by QEMU
varies between releases, but is visible with:
# qemu-system-x86_64 -cpu help
Available CPUs:
x86 486
x86 Broadwell Intel Core Processor (Broadwell)
[...]
x86 Skylake-Client Intel Core Processor (Skylake)
x86 Skylake-Client-IBRS Intel Core Processor (Skylake, IBRS)
x86 Skylake-Server Intel Xeon Processor (Skylake)
x86 Skylake-Server-IBRS Intel Xeon Processor (Skylake, IBRS)
[...]
The current PX62 line is based on the [Coffee Lake](https://en.wikipedia.org/wiki/Coffee_Lake) Intel
micro-architecture. The closest matching family would be
`Skylake-Server` or `Skylake-Server-IBRS`, [according to wikichip](https://en.wikichip.org/wiki/intel/microarchitectures/coffee_lake#Compiler_support).
Note that newer QEMU releases (4.2, currently in unstable) have more
supported features.
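If we do settle on a common family, the hypervisor parameter would be
set cluster-wide, following the same pattern as the other `-H kvm:`
settings above; a sketch, with a family name that still needs to be
validated against our hardware:

    # hypothetical family; validate against all nodes before rolling out
    gnt-cluster modify -H kvm:cpu_type=Skylake-Server-IBRS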
In that context, of course, supporting different CPU manufacturers
(say AMD vs Intel) is impractical: they will have totally different
families that are not compatible with each other. This will break live
migration, which can trigger crashes and problems in the migrated
virtual machines.
If there are problems live-migrating between machines, it is still
possible to "failover" (`gnt-instance failover` instead of `migrate`),
which shuts down the machine, fails over its disks, and starts it on
the other side. That's not such a big problem: we often need to reboot
the guests when we reboot the hosts anyway. But it does complicate our
work. Of course, it's also possible that live migration works fine if
*no* `cpu_type` at all is specified in the cluster, but that needs to
be verified.
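For example, to fail over a single instance instead of live-migrating
it:

    # shuts down the instance, fails over disks, restarts on the secondary
    gnt-instance failover test01.torproject.org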
Nodes could also be [grouped](http://docs.ganeti.org/ganeti/2.15/man/gnt-group.html) to limit (automated) live migration to a
subset of nodes.
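A sketch of such a split, with hypothetical group and node names:

    # hypothetical group name and node assignment
    gnt-group add skylake-nodes
    gnt-group assign-nodes skylake-nodes fsn-node-03.torproject.org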
References:
* <https://dsa.debian.org/howto/install-ganeti/>
* <https://qemu.weilnetz.de/doc/qemu-doc.html#recommendations_005fcpu_005fmodels_005fx86>
### Network configuration
IP allocation is managed by Ganeti through the `gnt-network(8)`
system. Say we have `192.0.2.0/24` reserved for the cluster, with
the host IP `192.0.2.100` and the gateway on `192.0.2.1`. You will
create this network with:
gnt-network add --network 192.0.2.0/24 --gateway 192.0.2.1 --network6 2001:db8::/32 --gateway6 fe80::1 example-network
Then we associate the new network to the default node group:
gnt-network connect --nic-parameters=link=br0,vlan=4000,mode=openvswitch example-network default
The arguments to `--nic-parameters` come from the values configured in
the cluster, above. The current values can be found with `gnt-cluster
info`.
TODO: create a private network.
## SLA
<!-- this describes an acceptable level of service for this service -->