port drbd pager playbook to prometheus, authored by anarcat
@@ -114,14 +114,13 @@ Then that device map can be removed with:
 ### Resyncing disks
-In Nagios, if you see this warning:
+A `DRBDDegraded` alert looks like this:
-DRBD CRITICAL: Device 10 WFConnection UpToDate, Device 9 WFConnection UpToDate
+DRBD has 1 out of date disks on fsn-node-04.torproject.org
-It means that, on that host (in my case it was
+It means that, on that host (in this case
 `fsn-node-04.torproject.org`), disks are desynchronized for some
-reason. In this case, those are disks 9 and 10. You can confirm that
-on the host:
+reason. You can confirm that on the host:
 # ssh fsn-node-04.torproject.org cat /proc/drbd
 [...]
@@ -132,7 +131,8 @@ on the host:
 [...]
 You need to find which instance this disk is associated with (see also
-above):
+above), by asking the Ganeti master for the DRBD disk listing with
+`gnt-node list-drbd $NODE`:
 $ ssh fsn-node-01.torproject.org gnt-node list-drbd fsn-node-04
 [...]
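The playbook's `/proc/drbd` check could also be scripted. What follows is a minimal Python sketch and not part of this commit. It assumes the DRBD 8.x `/proc/drbd` layout, where each device line reads `<minor>: cs:<connection state> ro:<roles> ds:<disk states>`. The sample text and the `degraded_devices` helper are hypothetical, written only for illustration, and the sample is not output from a real host.

```python
import re

# Hypothetical sample in the DRBD 8.x /proc/drbd layout (an assumption,
# not captured from fsn-node-04 or any other real host).
SAMPLE = """\
version: 8.4.10 (api:1/proto:86-101)
 9: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
10: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
"""

# Matches a device status line: minor number, connection state, disk states.
DEVICE_LINE = re.compile(r"^\s*(\d+): cs:(\S+) ro:\S+ ds:(\S+)")


def degraded_devices(proc_drbd_text):
    """Return (minor, connection_state, disk_states) for every device
    that is not both Connected and UpToDate/UpToDate."""
    degraded = []
    for line in proc_drbd_text.splitlines():
        match = DEVICE_LINE.match(line)
        if match is None:
            # Version header and counter lines do not match; skip them.
            continue
        minor, cs, ds = match.groups()
        if cs != "Connected" or ds != "UpToDate/UpToDate":
            degraded.append((int(minor), cs, ds))
    return degraded


if __name__ == "__main__":
    for minor, cs, ds in degraded_devices(SAMPLE):
        print(f"device {minor}: cs={cs} ds={ds}")
```

To try this against a real node, pass the function the output of `ssh fsn-node-04.torproject.org cat /proc/drbd` instead of `SAMPLE`. Devices that are mid-resync are reported too, since their connection state is not `Connected`.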