the cluster-level tag is a *prefix* that can be used to create
*multiple* such tags. This configuration also happens to be simpler
and easier to use...
## HDD migration restrictions

Cluster balancing works well until there are inconsistencies between
how nodes are configured. In our case, some nodes have HDDs (Hard Disk
Drives, AKA spinning rust) and others do not. Therefore, it's not
possible to move an instance that has a disk allocated on an HDD to a
node that does not have such a disk.

Yet somehow the allocator is not smart enough to tell, and you will
get the following error during an automatic rebalancing:

    one of the migrate failed and stopped the cluster balance: Can't create block device: Can't create block device <LogicalVolume(/dev/vg_ganeti_hdd/98d30e7d-0a47-4a7d-aeed-6301645d8469.disk3_data, visible as /dev/, size=102400m)> on node fsn-node-07.torproject.org for instance gitlab-02.torproject.org: Can't create block device: Can't compute PV info for vg vg_ganeti_hdd
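When this shows up in a script, the instance and the target node can be pulled out of the message to decide what to exclude on the next run. The following is only a sketch in POSIX shell: it assumes the error keeps the `on node NODE for instance INSTANCE:` shape shown above, and `parse_migrate_error` is a hypothetical helper name, not something Ganeti provides.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ganeti): read an error message on
# stdin and print "INSTANCE NODE" when it matches the shape of the
# failure above. Prints nothing if the message does not match.
parse_migrate_error() {
    sed -n 's/.* on node \([^ ]*\) for instance \([^ :]*\):.*/\2 \1/p'
}

# Example with the message quoted above.
msg="one of the migrate failed and stopped the cluster balance: Can't create block device: Can't create block device <LogicalVolume(/dev/vg_ganeti_hdd/98d30e7d-0a47-4a7d-aeed-6301645d8469.disk3_data, visible as /dev/, size=102400m)> on node fsn-node-07.torproject.org for instance gitlab-02.torproject.org: Can't create block device: Can't compute PV info for vg vg_ganeti_hdd"

printf '%s\n' "$msg" | parse_migrate_error
# prints: gitlab-02.torproject.org fsn-node-07.torproject.org
```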
In this case, the balancer is trying to migrate the `gitlab-02` server
from `fsn-node-01` (which has an HDD) to `fsn-node-07` (which does not),
which naturally fails. This is a known limitation of the Ganeti
code. There has been a [draft design document for multiple storage
unit support](http://docs.ganeti.org/ganeti/master/html/design-multi-storage-htools.html) since 2015, but it has [never been
implemented](https://github.com/ganeti/ganeti/issues/865). There have been multiple issues reported upstream on
the subject:

 * [208: Bad behaviour when multiple volume groups exists on nodes](https://github.com/ganeti/ganeti/issues/208)
 * [1199: unable to mark storage as unavailable for allocation](https://github.com/ganeti/ganeti/issues/1199)
 * [1240: Disk space check with multiple VGs is broken](https://github.com/ganeti/ganeti/issues/1240)
 * [1379: Support for displaying/handling multiple volume groups](https://github.com/ganeti/ganeti/issues/1379)

Unfortunately, there are no known workarounds for this, at least none
that fix the `hbal` command itself. It *is* possible, however, to
exclude the faulty migration from the pool of possible moves. For
example, in the above case:

    hbal -L -v --exclude-instances gitlab-02.torproject.org

It's also possible to use the `--no-disk-moves` option to avoid disk
move operations altogether.

Obviously, neither workaround balances the cluster correctly, since
the affected instances (or all disks) stay where they are...
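The two workarounds above could be wrapped in a small helper when several instances have to be skipped. This is a sketch under assumptions: `hbal_workaround_cmd` is a hypothetical name, it relies on `--exclude-instances` accepting a comma-separated list (as we understand the `hbal` manual page to describe), and it only prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) an hbal command line.
# With arguments: exclude the named instances from the balancing.
# Without arguments: fall back to skipping all disk moves.
hbal_workaround_cmd() {
    if [ "$#" -eq 0 ]; then
        echo "hbal -L -v --no-disk-moves"
        return 0
    fi
    # Join all arguments with commas; assumes --exclude-instances
    # takes a comma-separated list of instance names.
    list=$(IFS=,; echo "$*")
    echo "hbal -L -v --exclude-instances $list"
}

# Example: the single instance from the error above.
hbal_workaround_cmd gitlab-02.torproject.org
# Example: no instance given, so disk moves are skipped entirely.
hbal_workaround_cmd
```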
## Adding and removing addresses on instances

Say you created an instance but forgot to assign an extra