@@ -2173,7 +2173,92 @@ possible through things like [zerofree](https://tracker.debian.org/pkg/zerofree)
The [move-instance](https://docs.ganeti.org/docs/ganeti/3.0/html/move-instance.html) command can do this.

TODO: document mass cluster migrations.

1. create a new secret on the source cluster:

       chi-node-01# gnt-cluster renew-crypto --new-cluster-domain-secret

2. stop ganeti on the destination node:

       dal-node-01# systemctl stop ganeti

3. copy the secret to the destination node, then renew the crypto material and verify the cluster:

       scp chi-node-01.torproject.org:/var/lib/ganeti/cluster-domain-secret dal-node-01.torproject.org:/var/lib/ganeti/cluster-domain-secret
       ssh dal-node-01 "gnt-cluster renew-crypto"
       ssh dal-node-01 "gnt-cluster verify"

3. extract the certificate (without the private key) from the RAPI PEM file on the source cluster:

       sed -n '/BEGIN CERT/,$p' /var/lib/ganeti/rapi.pem

4. paste that in a certificate file on the target cluster:

       cat > gnt-chi.crt

5. enable an [API user](https://docs.ganeti.org/docs/ganeti/3.0/html/rapi.html#users-and-passwords) on the source *and* on the target cluster:

       echo gnt-dal $(tr -dc '[:alnum:]' < /dev/urandom | head -c 30) write >> /var/lib/ganeti/rapi/users
       systemctl restart ganeti

6. enter the passwords in two files on the target cluster, for example:

       cat > gnt-chi.password
       cat > gnt-dal.password

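7. wave your hands around to make a tunnel between the two hosts, or *gasp* open up the firewall, so that each cluster's RAPI port is reachable (the command below expects the source on port 5081 and the destination on 5080)
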
8. then this mouthful:

       /usr/lib/ganeti/tools/move-instance \
           chignt.torproject.org \
           dalgnt.torproject.org \
           test-01.torproject.org \
           --src-ca-file=gnt-chi.crt \
           --dest-ca-file=/var/lib/ganeti/rapi.pem \
           --opportunistic-tries=1 \
           --iallocator=hail \
           --src-username=gnt-dal \
           --src-password-file=gnt-chi.password \
           --dest-username=gnt-chi \
           --dest-password-file=gnt-dal.password \
           --src-rapi-port=5081 \
           --dest-rapi-port=5080 \
           --net 0:ip=pool,network=gnt-dal-01 \
           --debug

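The two certificate steps above (extract on the source, paste on the target) can probably be collapsed into a single command. The following is a minimal sketch, not a tested procedure: `fetch_rapi_cert` is a hypothetical helper name, and it assumes it is run from a host that can SSH to the source node (otherwise the resulting file still has to be copied to the target cluster). Like the `sed` command above, it assumes the private key comes first in `rapi.pem`, so that printing from the certificate marker onwards leaves the key out.

```shell
# Hypothetical helper: copy the certificate part of the source cluster's
# rapi.pem into a local file.
# Usage: fetch_rapi_cert SOURCE_NODE OUTPUT_FILE
fetch_rapi_cert() {
    src_node="$1"
    out_file="$2"
    # Same sed expression as in the manual step: print from the
    # "BEGIN CERT..." line to the end of the file, on the remote side,
    # so the private key never leaves the source node.
    ssh "$src_node" "sed -n '/BEGIN CERT/,\$p' /var/lib/ganeti/rapi.pem" > "$out_file"
}
```

It would be called as `fetch_rapi_cert chi-node-01.torproject.org gnt-chi.crt`. Running `openssl x509 -in gnt-chi.crt -noout -subject -enddate` afterwards is one way to confirm the file really holds a certificate.
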
Note that the `/var/lib/ganeti/rapi/users` files are overwritten by Puppet, so the API users added above might be removed after (or during) your attempt.

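Since the API user may have to be re-added after a Puppet run, it can help to generate the password once and write it to both the RAPI users file and a password file in one go. This is only a sketch under assumptions: `add_rapi_user` and `gen_password` are hypothetical helper names, and the `RAPI_USERS`, `PASSWORD_FILE` and `API_USER` variables are introduced here for illustration, with defaults taken from the commands above.

```shell
# Defaults mirror the manual steps; override them to test in a scratch
# directory or to use a different user name.
RAPI_USERS="${RAPI_USERS:-/var/lib/ganeti/rapi/users}"
PASSWORD_FILE="${PASSWORD_FILE:-gnt-dal.password}"
API_USER="${API_USER:-gnt-dal}"

# 30 random alphanumeric characters, same recipe as the manual step.
gen_password() {
    tr -dc '[:alnum:]' < /dev/urandom | head -c 30
}

# Append "USER PASSWORD write" to the RAPI users file and store the
# same password alone in PASSWORD_FILE.
add_rapi_user() {
    password="$(gen_password)"
    printf '%s %s write\n' "$API_USER" "$password" >> "$RAPI_USERS"
    # Create the password file readable by its owner only.
    ( umask 077; printf '%s\n' "$password" > "$PASSWORD_FILE" )
}
```

After calling `add_rapi_user`, the same `systemctl restart ganeti` as above would still apply. When this is run on the source cluster, the password file has to be copied to the target cluster, since that is where `move-instance` reads it.
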
The `move-instance` command above currently fails with:

    ==> /var/log/ganeti/jobs.log <==
    2023-03-06 21:57:25,346: job-1270 pid=1733692 ERROR Op 1/1: Caught exception in INSTANCE_CREATE(test-01.torproject.org)
    Traceback (most recent call last):
      File "/usr/share/ganeti/3.0/ganeti/jqueue/__init__.py", line 933, in _ExecOpCodeUnlocked
        result = self.opexec_fn(op.input,
      File "/usr/share/ganeti/3.0/ganeti/mcpu.py", line 705, in ExecOpCode
        result = self._LockAndExecLU(lu, locking.LEVEL_CLUSTER + 1,
      File "/usr/share/ganeti/3.0/ganeti/mcpu.py", line 631, in _LockAndExecLU
        result = self._LockAndExecLU(lu, level + 1, calc_timeout,
      File "/usr/share/ganeti/3.0/ganeti/mcpu.py", line 631, in _LockAndExecLU
        result = self._LockAndExecLU(lu, level + 1, calc_timeout,
      File "/usr/share/ganeti/3.0/ganeti/mcpu.py", line 631, in _LockAndExecLU
        result = self._LockAndExecLU(lu, level + 1, calc_timeout,
      [Previous line repeated 1 more time]
      File "/usr/share/ganeti/3.0/ganeti/mcpu.py", line 639, in _LockAndExecLU
        result = self._LockAndExecLU(lu, level + 1, calc_timeout, pending=pending)
      File "/usr/share/ganeti/3.0/ganeti/mcpu.py", line 547, in _LockAndExecLU
        result = self._ExecLU(lu)
      File "/usr/share/ganeti/3.0/ganeti/mcpu.py", line 483, in _ExecLU
        lu.CheckPrereq()
      File "/usr/share/ganeti/3.0/ganeti/cmdlib/instance_create.py", line 827, in CheckPrereq
        self.nics = ComputeNics(self.op, cluster, self.check_ip, self.cfg,
      File "/usr/share/ganeti/3.0/ganeti/cmdlib/instance_utils.py", line 1240, in ComputeNics
        raise errors.OpPrereqError("If network is given, no mode or link"
    ganeti.errors.OpPrereqError: ('If network is given, no mode or link is allowed to be passed', 'wrong_input')

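The exception comes from `ComputeNics` on the destination cluster, which rejects a NIC that carries a `network` together with a `mode` or `link`. One possible reading, not verified here, is that the NIC parameters of the source instance are sent along with the `--net` override. A first step in checking that could be to look at what the source instance has configured. The sketch below is only a starting point: `show_nics` is a hypothetical helper name, and the `nic.modes`, `nic.links` and `nic.networks` field names are assumed from the `gnt-instance` manual page (`gnt-instance list-fields` shows what a given version supports).

```shell
# Hypothetical helper, to be run on the source cluster master: print the
# mode, link and network of every NIC of the given instance.
# Usage: show_nics INSTANCE
show_nics() {
    gnt-instance list --no-headers \
        -o name,nic.modes,nic.links,nic.networks "$1"
}
```

For the instance used above, that would be `show_nics test-01.torproject.org`. If the output shows a mode or link set on the NIC, that would at least be consistent with the error.
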
### Reboot procedures