@@ -878,6 +878,22 @@ catastrophic data loss bug in Ganeti or [[drbd]].
The above assumes that `fsngnt` is already in DNS.
13. make sure everything is great in the cluster:
gnt-cluster verify
If that takes a long time and eventually fails with errors like:
ERROR: node fsn-node-03.torproject.org: ssh communication with node 'fsn-node-06.torproject.org': ssh problem: ssh: connect to host fsn-node-06.torproject.org port 22: Connection timed out\'r\n
... that is because the [[ipsec]] tunnels between the nodes are
failing. Make sure Puppet has run across the cluster (step 10
above) and see [[ipsec]] for further diagnostics. For example,
the above would be fixed with:
ssh fsn-node-03.torproject.org "puppet agent -t; service ipsec reload"
ssh fsn-node-06.torproject.org "puppet agent -t; service ipsec reload; ipsec up gnt-fsn-be::fsn-node-03"
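
If several nodes are affected, the Puppet run and ipsec reload can be looped over a list of nodes. This is only a rough sketch: the `reload_ipsec` function and the `DRY_RUN` variable are made up for illustration and are not part of Ganeti or Puppet. Node names have to be passed explicitly, and any `ipsec up` call for a specific tunnel still has to be run by hand as shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of any existing tooling): run Puppet and
# reload ipsec on every node passed as an argument.
# With DRY_RUN=1 the ssh commands are only printed, not executed.
reload_ipsec() {
    for node in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "ssh $node \"puppet agent -t; service ipsec reload\""
        else
            ssh "$node" "puppet agent -t; service ipsec reload"
        fi
    done
}
```

To preview what would run, call it with `DRY_RUN=1 reload_ipsec fsn-node-03.torproject.org fsn-node-06.torproject.org`, then drop `DRY_RUN=1` to actually run it and re-run `gnt-cluster verify` afterwards.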
### cluster config
These settings could probably be merged into the cluster init step, but they are listed here to document what has been done: