... | ... | @@ -1583,6 +1583,175 @@ There are many sockets in the `ctrl` directory, including: |
  (the `-qmp` argument to `qemu`)
* `.kvmd`: same as the above?

## Cross-cluster migrations

If an entire cluster needs to be evacuated, the [move-instance](https://docs.ganeti.org/docs/ganeti/3.0/html/move-instance.html)
command can be used to automatically move instances between
clusters.

### move-instance configuration

Significant configuration is required before the
`move-instance` command can be used.

Note that it currently migrates only one VM at a time, because of the
`--net` argument, a limitation which could eventually be lifted.

Also note that this procedure depends on a patched version of
`move-instance`, which was changed after the 3.0 Ganeti release; see
[this comment](https://github.com/ganeti/ganeti/issues/1696#issuecomment-1465221351) for details. We also have three patches on top of
that which fix various issues we found during the gnt-chi to
gnt-dal migration; see [this comment](https://github.com/ganeti/ganeti/issues/1696#issuecomment-1470693963) and specifically the
following PRs:

* [GitHub ganeti#1697](https://github.com/ganeti/ganeti/pull/1697) - Python 3 tweak, optional
* [GitHub ganeti#1698](https://github.com/ganeti/ganeti/pull/1698) - network configuration hack, mandatory
* [GitHub ganeti#1699](https://github.com/ganeti/ganeti/pull/1699) - OpenSSL verification hack, mandatory

Once those patches have been deployed, configure both clusters with
the following procedure. In this example, we migrate a VM named
`test-01.torproject.org` from the gnt-chi cluster to gnt-dal.

1. create a new secret on the source cluster:

        chi-node-01# gnt-cluster renew-crypto --new-cluster-domain-secret

2. stop ganeti on the destination node:

        dal-node-01# systemctl stop ganeti

3. copy the secret to the destination node:

        scp chi-node-01.torproject.org:/var/lib/ganeti/cluster-domain-secret dal-node-01.torproject.org:/var/lib/ganeti/cluster-domain-secret
        ssh dal-node-01 "gnt-cluster renew-crypto && gnt-cluster redist-conf && gnt-cluster verify"

4. start ganeti on the destination node with the new secret:

        ssh dal-node-01 systemctl start ganeti

5. extract the public certificate from the RAPI PEM file on the source cluster:

        ssh chi-node-01 "sed -n '/BEGIN CERT/,\$p' /var/lib/ganeti/rapi.pem"

6. paste that in a certificate file on the target cluster:

        ssh dal-node-01 tee gnt-chi.crt

7. disable Puppet, as we'll be messing with files it manages:

        puppet agent --disable "messing with RAPI and /etc/hosts, see issue tpo/tpa/team#40972"

8. bypass the local DNS records for the "API" names on the
   destination node (dal-node-01):

        echo 38.229.82.104 chignt.torproject.org >> /etc/hosts
        echo 204.8.99.101 dalgnt.torproject.org >> /etc/hosts

    TODO: maybe those records should point at the public IP addresses
    in the normal torproject.org zonefile? Right now they point at the
    private IP space, but I'm not sure why.

9. make RAPI listen on the public network, on both master nodes:

        echo 'RAPI_ARGS="--require-authentication"' >> /etc/default/ganeti

    TODO: add a flag in Puppet to make this configurable, so that we
    don't have to stop Puppet.

10. enable an [API user](https://docs.ganeti.org/docs/ganeti/3.0/html/rapi.html#users-and-passwords) on the source *and* on the target cluster:

        echo move-instance $(tr -dc '[:alnum:]' < /dev/urandom | head -c 30) write >> /var/lib/ganeti/rapi/users
        systemctl restart ganeti

    TODO: add to Puppet

11. enter the passwords in two files on the target cluster, for
    example:

        cat > gnt-chi.password
        cat > gnt-dal.password

12. open up the firewall between all nodes of both clusters:

        ssh chi-node-01 gnt-cluster command "iptables-legacy -I ganeti-cluster -j ACCEPT -s 204.8.99.96/27"
        ssh dal-node-01 gnt-cluster command "iptables-legacy -I ganeti-cluster -j ACCEPT -s 38.229.82.104/27"

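
The `/27` ranges in the firewall step can be sanity-checked with a bit of shell arithmetic before touching `iptables`. This is only a sketch: `ip_to_int` and `in_cidr` are hypothetical helper names, not Ganeti or iptables tools, and the addresses are the ones used in the `/etc/hosts` and firewall steps above. With a `/27` mask, `38.229.82.104` falls in the `.96` to `.127` block, so the rule is expected to cover that whole block.

```shell
#!/bin/sh
# Sketch: check whether an IPv4 address is covered by a CIDR range.
# ip_to_int and in_cidr are hypothetical helpers, not part of Ganeti.

# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    # Split the address on dots into $1..$4.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_cidr ADDRESS NETWORK/PREFIX: succeed if ADDRESS is inside the range.
in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    prefix=${2#*/}
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    # Compare the masked (network) parts of both addresses.
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

in_cidr 204.8.99.101 204.8.99.96/27 && echo "dalgnt covered"
in_cidr 38.229.82.104 38.229.82.104/27 && echo "chignt covered"
```

Any node address that fails this check would need its own rule or a wider range.
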
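
The API user and password file steps have to agree on the same password, which is awkward when the password is generated inline. A minimal sketch of one way to keep them in sync is shown here, assuming the `username password options` line format used in the step above; `gen_password` and `rapi_user_line` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: generate one RAPI password, keep it in a private file, and print
# the matching users-file line. gen_password and rapi_user_line are
# hypothetical helpers.

# Print a random alphanumeric password of the given length (default 30).
gen_password() {
    LC_ALL=C tr -dc '[:alnum:]' < /dev/urandom | head -c "${1:-30}"
}

# Print a line in the "username password options" format used above.
rapi_user_line() {
    printf '%s %s write\n' "$1" "$2"
}

umask 077    # the password file should not be world-readable
password=$(gen_password 30)
printf '%s\n' "$password" > gnt-chi.password
rapi_user_line move-instance "$password"
```

The printed line is what would be appended to `/var/lib/ganeti/rapi/users` on the matching cluster, while `gnt-chi.password` stays on the node where `move-instance` runs. The same would be repeated for `gnt-dal.password`.
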
### Actual VM migration

Once the above configuration is completed, the following procedure
will move *one* VM, in this example the `test-01` VM from the `chignt`
to the `dalgnt` cluster:

1. stop the VM, on the source cluster:

        gnt-instance stop test-01

2. move the VM to the new cluster:

        /usr/lib/ganeti/tools/move-instance \
            chignt.torproject.org \
            dalgnt.torproject.org \
            test-01.torproject.org \
            --src-ca-file=gnt-chi.crt \
            --dest-ca-file=/var/lib/ganeti/rapi.pem \
            --src-username=move-instance \
            --src-password-file=gnt-chi.password \
            --dest-username=move-instance \
            --dest-password-file=gnt-dal.password \
            --src-rapi-port=5080 \
            --dest-rapi-port=5080 \
            --net 0:ip=pool,network=gnt-dal-01,mode=,link= \
            --keep-source-instance \
            --verbose

3. change the IP address inside the VM:

        tsa-misc$ ./ganeti -H test-01.torproject.org -v renumber-instance dal-node-02.torproject.org

    Note how we use the name of the Ganeti node where the VM resides.

    TODO: the above rewrites `/etc/network/interfaces` while many VMs
    actually configure `/etc/network/interfaces.d/eth0` instead

4. test the new VM

5. if satisfied, change DNS to new VM

6. schedule destruction of the old VM (7 days)

This procedure was tested on a test VM migrating from gnt-chi to
gnt-dal, see [tpo/tpa/team#40972][] for the gory details.

[tpo/tpa/team#40972]: https://gitlab.torproject.org/tpo/tpa/team/-/issues/40972
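
Since `move-instance` handles one VM per run here, evacuating several VMs means repeating the long command with a different instance name each time. The following is a dry-run sketch of that loop: it only *prints* the commands so they can be reviewed first. `move_instance_cmd` is a hypothetical helper, `test-02.torproject.org` is a made-up second instance, and the flags follow the procedure above, using the `gnt-chi.crt` file name from the configuration steps.

```shell
#!/bin/sh
# Sketch: print one move-instance invocation per VM, without running it.
# move_instance_cmd is a hypothetical helper.

move_instance_cmd() {
    printf '%s' "/usr/lib/ganeti/tools/move-instance"
    # Each remaining argument is printed with a leading space.
    printf ' %s' \
        chignt.torproject.org \
        dalgnt.torproject.org \
        "$1" \
        --src-ca-file=gnt-chi.crt \
        --dest-ca-file=/var/lib/ganeti/rapi.pem \
        --src-username=move-instance \
        --src-password-file=gnt-chi.password \
        --dest-username=move-instance \
        --dest-password-file=gnt-dal.password \
        --src-rapi-port=5080 \
        --dest-rapi-port=5080 \
        --net 0:ip=pool,network=gnt-dal-01,mode=,link= \
        --keep-source-instance \
        --verbose
    printf '\n'
}

# One VM per invocation; the second name is a made-up example.
for vm in test-01.torproject.org test-02.torproject.org; do
    move_instance_cmd "$vm"
done
```

Each VM still needs to be stopped before its move and renumbered afterwards, as described in the steps above.
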

### Troubleshooting

If the above procedure doesn't work, try again with `--debug` instead
of `--verbose`; you might see extra error messages. The import/export
logs can also be found in `/var/log/ganeti/os/...`.

Finally, to trace which commands are executed (which can be
challenging in Ganeti), the `execsnoop.bt` command (from the [bpftrace
package](https://tracker.debian.org/bpftrace)) is invaluable. Make sure `debugfs` is mounted first
and the package installed:

    mount -t debugfs debugfs /sys/kernel/debug
    apt install bpftrace
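
The `mount` command fails if `debugfs` is already mounted, so it can help to check first. This is a small sketch under the assumption that `/proc/mounts` is available; `is_mounted` is a hypothetical helper, and its optional second argument exists only to make it easy to test against a sample mounts table.

```shell
#!/bin/sh
# Sketch: report whether debugfs is mounted before trying to mount it.
# is_mounted is a hypothetical helper.

is_mounted() {
    # Field 2 of each /proc/mounts line is the mount point.
    awk -v target="$1" '$2 == target { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

if is_mounted /sys/kernel/debug; then
    echo "debugfs already mounted"
else
    echo "debugfs not mounted, run: mount -t debugfs debugfs /sys/kernel/debug"
fi
```
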

Then simply run:

    execsnoop.bt

This will show *every* [`execve(2)`](https://manpages.debian.org/execve.2) system call executed on the
system. Filtering is probably a good idea; in my case I was doing:

    execsnoop.bt | grep socat
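
When more than one command is of interest, the filter can be generalized to several names at once. This is a sketch only: `exec_filter` is a hypothetical helper, and the sample lines are made up for illustration and do not reproduce the real `execsnoop.bt` output format.

```shell
#!/bin/sh
# Sketch: keep only trace lines that mention one of several commands.
# exec_filter is a hypothetical helper.

exec_filter() {
    # Build an extended regex like "socat|dd" from the arguments.
    pattern=$1
    shift
    for name in "$@"; do
        pattern="$pattern|$name"
    done
    grep -E -- "$pattern"
}

# Made-up trace lines; in practice the input would come from
# `execsnoop.bt | exec_filter socat dd`.
printf '%s\n' \
    '100 socat OPENSSL-LISTEN:1234' \
    '101 ls /tmp' \
    '102 dd if=/dev/zero of=/dev/null' | exec_filter socat dd
```
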

The `execsnoop` command (from the [libbpf-tools package](https://tracker.debian.org/libbpf-tools)) may also
work but it truncates the command after 128 characters ([Debian
1033013](https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1033013), [upstream 740](https://github.com/iovisor/bcc/issues/740)).

## Pager playbook

### I/O overload

... | ... | @@ -2191,156 +2360,13 @@ possible through things like [zerofree](https://tracker.debian.org/pkg/zerofree) |

### Mass migrating instances to a new cluster

If an entire cluster needs to be evacuated, the [move-instance](https://docs.ganeti.org/docs/ganeti/3.0/html/move-instance.html)
command can be used to automatically move instances between
clusters. It currently migrates only one VM at a time (because of the
`--net` argument, a limitation which could eventually be lifted), but
should be easier to do than the export/import procedure above.


See the detailed [cross-cluster migration instructions](#cross-cluster-migrations).

### Reboot procedures

... | ... | @@ -2444,7 +2470,7 @@ If things get completely out of hand and the cluster becomes too |
unreliable for service, the only solution is to rebuild another one
elsewhere. Since Ganeti 2.2, there is a [move-instance](https://docs.ganeti.org/docs/ganeti/3.0/html/move-instance.html) command to
move instances between clusters that can be used for that purpose. See
the [mass migration procedure](#cross-cluster-migrations) above.

The [export/import](https://docs.ganeti.org/docs/ganeti/3.0/html/admin.html#export-import) mechanism can also be used to export and import
VMs one at a time, if only a subset of the cluster needs to be
... | ... | |