build a VPN / jumphost for the gnt-dal cluster
it seems our quintex PoP will require our own OOB network, as the one provided by upstream is not exactly to our liking. or, to be more specific, it's a combination of limitations in the BIOS (not possible to upload an image bigger than 1.44MB) and limitations in the VPN (not possible to serve files from the clients) that make booting a rescue system overly complicated.
furthermore, we feel this is a problem we often have to fix: bootstrapping and rescue on the cymru cluster was also hellish, and having a remote box under our control would have facilitated this immensely.
me and @lavamind started working on a network design for this, documented in:
https://gitlab.torproject.org/tpo/tpa/team/-/wikis/howto/quintex#network-topology
basically, the idea is to have a dedicated machine hooked to the uplink, but also the OOB/IPMI/BIOS network and the internal management network (e.g. where DRBD lives, to offer PXE boot, since that shouldn't happen on the public uplink and can't happen on the OOB network). that can be done with a single network port with VLAN tagging, or (more simply) with three different ports on the device.
ideally, the device must be small to avoid any supplementary costs in rack space, and low power, to avoid costs in power. it should also be rugged to avoid requiring too much hardware maintenance (e.g. all solid state, ideally RAID-1).
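to make the two wiring options above a bit more concrete, here is a minimal sketch that models them. everything in it is an assumption for illustration: the interface names, the VLAN IDs, and the choice of which port carries which network are hypothetical and not taken from the actual design on the wiki.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Role(Enum):
    UPLINK = "uplink"  # public network
    OOB = "oob"        # OOB/IPMI/BIOS network
    MGMT = "mgmt"      # internal management network (DRBD, PXE boot)


@dataclass(frozen=True)
class Port:
    name: str                   # hypothetical interface name
    role: Role
    vlan: Optional[int] = None  # None means a dedicated, untagged port


def three_port_layout() -> List[Port]:
    # the simpler option: one physical port per network
    # (which port gets which role is an assumption)
    return [
        Port("eth0", Role.UPLINK),
        Port("eth1", Role.MGMT),
        Port("eth2", Role.OOB),
    ]


def single_port_layout() -> List[Port]:
    # one physical port with VLAN tagging; the VLAN IDs are made up
    return [
        Port("eth0.10", Role.UPLINK, vlan=10),
        Port("eth0.20", Role.MGMT, vlan=20),
        Port("eth0.30", Role.OOB, vlan=30),
    ]


def validate(ports: List[Port]) -> None:
    # every network must be attached exactly once
    if sorted(p.role.value for p in ports) != sorted(r.value for r in Role):
        raise ValueError("each network role must be attached exactly once")
    # tagged sub-interfaces must not share a VLAN ID
    tags = [p.vlan for p in ports if p.vlan is not None]
    if len(tags) != len(set(tags)):
        raise ValueError("VLAN tags must be unique")


def pxe_ports(ports: List[Port]) -> List[Port]:
    # PXE boot is only offered on the management network,
    # never on the public uplink or the OOB network
    return [p for p in ports if p.role is Role.MGMT]


if __name__ == "__main__":
    for layout in (three_port_layout(), single_port_layout()):
        validate(layout)
        print([p.name for p in pxe_ports(layout)])
```

either layout ends up with exactly one interface offering PXE boot; the only difference is whether the separation is physical (three ports) or logical (VLAN tags on one port).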
checklist:
- hardware design
- expense approval
- hardware order
- hardware shipped
- machine name (dal-rescue-01?)
- OS bootstrap
- Puppet setup
- shoelaces install
- DHCP server configuration
- iPXE image build
- grml image builds
- DHCP / grml test boot on eth1
- naming convention tweak (wiki-replica!39 (merged))
- IP allocation
- VLAN setup
- label ports and dal-rescue-01
- second dal-rescue-02 setup
- label dal-rescue-02
- IP ACL (built-in firewall rules considered sufficient)
- ship dal-rescue-01
- renumber iDRACs? (maybe split in another ticket?) see #41135 (closed)