Unverified Commit 3462b1d6 authored by anarcat's avatar anarcat

add a verify step to make sure the cluster actually works

parent 125cec6e
......@@ -878,6 +878,22 @@ catastrophic data loss bug in Ganeti or [[drbd]].
The above assumes that `fsngnt` is already in DNS.
13. make sure everything is great in the cluster:

        gnt-cluster verify

    If that takes a long time and eventually fails with errors like:

        ERROR: node fsn-node-03.torproject.org: ssh communication with node 'fsn-node-06.torproject.org': ssh problem: ssh: connect to host fsn-node-06.torproject.org port 22: Connection timed out\'r\n

    ... that is because the [[ipsec]] tunnels between the nodes are
    failing. Make sure Puppet has run across the cluster (step 10
    above) and see [[ipsec]] for further diagnostics. For example,
    the above would be fixed with:

        ssh fsn-node-03.torproject.org "puppet agent -t; service ipsec reload"
        ssh fsn-node-06.torproject.org "puppet agent -t; service ipsec reload; ipsec up gnt-fsn-be::fsn-node-03"

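
When many nodes are affected, it may help to summarize which node pairs are failing before fixing tunnels one at a time. The following is only a sketch: `ssh_timeout_pairs` is a hypothetical helper, not part of Ganeti, and it assumes the error lines look like the sample above.

```shell
# Hypothetical helper (not a Ganeti command): read `gnt-cluster verify`
# output on stdin and print one "source target" pair per SSH timeout.
# Assumes the error format matches the sample shown in step 13.
ssh_timeout_pairs() {
    sed -n "s/.*ERROR: node \([^:]*\): ssh communication with node '\([^']*\)'.*Connection timed out.*/\1 \2/p" | sort -u
}
```

It could then be used as `gnt-cluster verify 2>&1 | ssh_timeout_pairs`, which prints each failing pair once so the matching `ipsec up` commands can be worked out per pair.
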
### cluster config
These could probably be merged into the cluster init, but just to document what has been done:
......