Ganeti is software designed to facilitate the management of virtual machines (KVM or Xen). It helps you move virtual machine instances from one node to another, create an instance with DRBD replication on another node, live-migrate an instance between nodes, and so on.
- Tutorial
- How-to
- Glossary
- Adding a new instance
- Modifying an instance
- Destroying an instance
- Getting information
- Disk operations (DRBD)
- Evaluating cluster capacity
- Moving instances and failover
- Importing external libvirt instances
- Importing external libvirt instances, manual
- Rebooting
- Rebalancing a cluster
- Adding and removing addresses on instances
- Job inspection
- Pager playbook
- Disaster recovery
- Reference
- Discussion
Tutorial
Listing virtual machines (instances)
This will show the running guests, known as "instances":
gnt-instance list
Accessing serial console
Our instances provide a serial console, starting in GRUB. To access it, run:
gnt-instance console test01.torproject.org
To exit, use ^], that is, Control-<Closing Bracket>.
How-to
Glossary
In Ganeti, a physical machine is called a node and a virtual machine is an instance. A node is elected to be the master, where all commands should be run from. Nodes are interconnected through a private network that is used to communicate commands and synchronise disks (with howto/drbd). Instances are normally assigned two nodes, a primary and a secondary: the primary is where the virtual machine actually runs and the secondary acts as a hot failover.
See also the more extensive glossary in the Ganeti documentation.
Adding a new instance
This command creates a new guest, or "instance" in Ganeti's vocabulary, with 10G root, 2G swap, 20G spare on SSD, 800G on HDD, 8GB of RAM and 2 CPU cores:
gnt-instance add \
-o debootstrap+buster \
-t drbd --no-wait-for-sync \
--net 0:ip=pool,network=gnt-fsn13-02 \
--no-ip-check \
--no-name-check \
--disk 0:size=10G \
--disk 1:size=2G,name=swap \
--disk 2:size=20G \
--disk 3:size=800G,vg=vg_ganeti_hdd \
--backend-parameters memory=8g,vcpus=2 \
test-01.torproject.org
What that does
This configures the following:
- redundant disks in a DRBD mirror. Use -t plain instead of -t drbd for tests, as that avoids syncing of disks and will speed things up considerably (even with --no-wait-for-sync there are some operations that block on synced mirrors). Only one node should be provided as the argument for --node then.
- four disks: root, swap, and a spare disk on the default VG (SSD), and one on another VG (HDD). If you don't specify a swap device, a 512MB swapfile is created in /swapfile. TODO: configure disk 2 and 3 automatically in installer. (/var and /srv?)
- 8GB of RAM with 2 virtual CPUs
- an IP allocated from the public gnt-fsn pool: gnt-instance add will print the IPv4 address it picked to stdout. The IPv6 address can be found in /var/log/ganeti/os/ on the primary node of the instance, see below.
- the test-01.torproject.org hostname
Next steps
To find the root password, SSH host key fingerprints, and the IPv6 address, run this on the node where the instance was created, for example:
grep -E 'root password|configured eth0 with|SHA256' $(ls -tr /var/log/ganeti/os/* | tail -1) | grep -v $(hostname)
We copy root's authorized keys into the new instance, so you should be able to log in with your token. You will be required to change the root password immediately. Pick something nice and document it in tor-passwords.
Also set reverse DNS for both IPv4 and IPv6 in Hetzner's Robot (check under Servers -> vSwitch -> IPs) or in our own reverse zone files (if delegated).
Then follow howto/new-machine.
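The log lookup above can be wrapped in a small function with proper quoting. This is only a sketch: the function name show_install_secrets and its two optional arguments (log directory and exclusion pattern) are made up for illustration, and it assumes that the most recently modified file in the log directory belongs to the instance you just created.

```shell
# Hypothetical helper, not part of Ganeti.
# usage: show_install_secrets [LOGDIR] [EXCLUDE_PATTERN]
# LOGDIR defaults to /var/log/ganeti/os, EXCLUDE_PATTERN to this node's
# hostname, as in the one-liner above.
show_install_secrets() {
    logdir=${1:-/var/log/ganeti/os}
    exclude=${2:-$(hostname)}
    # ls -tr sorts oldest first, so the last entry is the newest log
    latest=$(ls -tr "$logdir"/* | tail -1)
    grep -E 'root password|configured eth0 with|SHA256' "$latest" \
        | grep -v "$exclude"
}
```

Run without arguments on the node where the instance was created, this should behave like the one-liner. If other installs ran in the meantime, the newest file may belong to a different instance.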
Known issues
- usrmerge: that procedure creates a machine with usrmerge! See bug 34115 before proceeding.
- allocator failures: note that you may need to use the --node parameter to pick on which machines you want the instance to end up, otherwise Ganeti will choose for you (and may fail). Use, for example, --node fsn-node-01:fsn-node-02 to use node-01 as primary and node-02 as secondary. The allocator can sometimes fail if it is upset about something in the cluster, for example:
  Can't find primary node using iallocator hail: Request failed: No valid allocation solutions, failure reasons: FailMem: 2, FailN1: 2
  This situation is covered by ticket 33785. If this problem occurs, it might be worth rebalancing the cluster.
- ping failure: there is a bug in ganeti-instance-debootstrap which misconfigures ping (among other things), see bug 31781. It's currently patched in our version of the Debian package, but that patch might disappear if Debian upgrades the package without shipping our patch.
Other examples
This is the same without the HDD partition, in the gnt-chi cluster:
gnt-instance add \
-o debootstrap+buster \
-t drbd --no-wait-for-sync \
--net 0:ip=pool,network=gnt-chi-01 \
--no-ip-check \
--no-name-check \
--disk 0:size=10G \
--disk 1:size=2G,name=swap \
--disk 2:size=20G \
--backend-parameters memory=8g,vcpus=2 \
test-01.torproject.org
A simple test machine, with only a 10G root disk, 2G of swap, 1G of RAM, and 1 CPU, without DRBD, in the FSN cluster:
gnt-instance add \
-o debootstrap+buster \
-t plain --no-wait-for-sync \
--net 0:ip=pool,network=gnt-fsn13-02 \
--no-ip-check \
--no-name-check \
--disk 0:size=10G \
--disk 1:size=2G,name=swap \
--backend-parameters memory=1g,vcpus=1 \
test-01.torproject.org
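The examples above share most of their flags and differ mainly in the disk template, network, memory, CPU count and extra disks. As a rough sketch, the common part can be generated by a helper that only prints the command so it can be reviewed before running. The function name print_instance_add and its argument order are made up for illustration. It uses only flags that appear in the examples above and covers only the root and swap disks, so extra --disk arguments would still need to be added by hand.

```shell
# Hypothetical helper, not part of Ganeti: prints a gnt-instance add
# command line but does not run it.
# usage: print_instance_add HOSTNAME TEMPLATE NETWORK MEMORY VCPUS
print_instance_add() {
    name=$1; template=$2; network=$3; memory=$4; vcpus=$5
    printf 'gnt-instance add -o debootstrap+buster'
    printf ' -t %s --no-wait-for-sync' "$template"
    printf ' --net 0:ip=pool,network=%s' "$network"
    printf ' --no-ip-check --no-name-check'
    # only the root and swap disks common to all examples above
    printf ' --disk 0:size=10G --disk 1:size=2G,name=swap'
    printf ' --backend-parameters memory=%s,vcpus=%s' "$memory" "$vcpus"
    printf ' %s\n' "$name"
}
```

For instance, print_instance_add test-01.torproject.org plain gnt-fsn13-02 1g 1 prints the simple test machine command from above on a single line.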
Modifying an instance
CPU, memory changes
It's possible to change the IP, CPU, or memory allocation of an instance using the gnt-instance modify command:
gnt-instance modify -B vcpus=2 test1.torproject.org
gnt-instance modify -B memory=4g test1.torproject.org
gnt-instance reboot test1.torproject.org
IP address change
IP address changes require a full stop and will require manual changes to the /etc/network/interfaces* files:
gnt-instance modify --net 0:modify,ip=116.202.120.175 test1.torproject.org
gnt-instance stop test1.torproject.org
gnt-instance start test1.torproject.org
gnt-instance console test1.torproject.org
Resizing disks
The gnt-instance grow-disk command can be used to change the size of the underlying device:
gnt-instance grow-disk --absolute test1.torproject.org 0 16g
gnt-instance reboot test1.torproject.org
The number 0 in this context indicates the first disk of the instance. The amount specified is the final disk size (because of the --absolute flag). In the above example, the final disk size will be 16GB. To add space to the existing disk, remove the --absolute flag:
gnt-instance grow-disk test1.torproject.org 0 16g
gnt-instance reboot test1.torproject.org
In the above example, 16GB will be ADDED to the disk. Be careful with resizes, because it's not possible to revert such a change: grow-disk does not support shrinking disks. The only way to revert the change is by exporting / importing the instance.
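To make the difference between the two modes concrete, here is a small arithmetic sketch. It assumes a disk that is currently 10G, like disk 0 in the creation examples. The function name final_disk_size is made up for illustration and does not call Ganeti at all. Since a resize cannot be reverted, the sketch also refuses a final size smaller than the current one.

```shell
# Hypothetical helper, not part of Ganeti: compute the disk size (in GB)
# that a grow-disk call would result in.
# usage: final_disk_size CURRENT_GB AMOUNT_GB [absolute]
final_disk_size() {
    current=$1; amount=$2; mode=${3:-relative}
    if [ "$mode" = absolute ]; then
        # with --absolute, the amount is the final size
        final=$amount
    else
        # without --absolute, the amount is added to the current size
        final=$((current + amount))
    fi
    if [ "$final" -lt "$current" ]; then
        echo "refusing to shrink from ${current}G to ${final}G" >&2
        return 1
    fi
    echo "$final"
}
```

With a 10G disk, final_disk_size 10 16 absolute gives 16, while final_disk_size 10 16 gives 26.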
Then the filesystem needs to be resized inside the VM:
ssh root@test1.torproject.org
Use pvs to display information about the physical volumes:
root@cupani:~# pvs
  PV         VG      Fmt  Attr PSize  PFree
  /dev/sdc   vg_test lvm2 a--  <8.00g 1020.00m
Resize the physical volume to take up the new space:
pvresize /dev/sdc
Use lvs to display information about logical volumes:
# lvs
  LV          VG             Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  var-opt     vg_test-01     -wi-ao---- <10.00g
  test-backup vg_test-01_hdd -wi-ao---- <20.00g
Use lvextend to add space to the volume:
lvextend -l '+100%FREE' vg_test-01/var-opt
Finally resize the filesystem:
resize2fs /dev/vg_test-01/var-opt
See also the LVM howto.
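The in-guest steps above can be collected into one place. The sketch below only prints the three commands for a given physical volume, volume group and logical volume, so they can be checked against the pvs and lvs output before being run. The function name print_resize_steps is made up for illustration, and the sketch assumes an ext filesystem, since resize2fs is what the steps above use.

```shell
# Hypothetical helper, not part of LVM: print the in-guest resize steps
# for one logical volume without running them.
# usage: print_resize_steps PV_DEVICE VG_NAME LV_NAME
print_resize_steps() {
    pv=$1; vg=$2; lv=$3
    # 1. make the physical volume take up the new space
    echo "pvresize $pv"
    # 2. give all free space in the volume group to the logical volume
    echo "lvextend -l '+100%FREE' $vg/$lv"
    # 3. grow the filesystem to fill the logical volume
    echo "resize2fs /dev/$vg/$lv"
}
```

For the volumes shown above, this would be called as print_resize_steps /dev/sdc vg_test-01 var-opt.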
Adding disks
A disk can be added to an instance with the modify command as well. This, for example, will add a 100GB disk to the test1 instance on the vg_ganeti_hdd volume group, which is made of "slow" rotating disks:
gnt-instance modify --disk add:size=100g,vg=vg_ganeti_hdd test1.torproject.org
gnt-instance reboot test1.torproject.org
Adding a network interface on the rfc1918 vlan
We have a VLAN for VMs that do not have public addresses. Its VLAN ID is 4002 and it's backed by Hetzner vSwitch #11973 "fsn-gnt-rfc1918-traffic". Note that traffic on this VLAN will travel in the clear between nodes.
To add an instance to this vlan, give it a second network interface using
gnt-instance modify --net add:link=br0,vlan=4002,mode=openvswitch test1.torproject.org
Destroying an instance
This totally deletes the instance, including all mirrors and everything. Be very careful with it:
gnt-instance remove test01.torproject.org