Commit 2c5f0e67 by anarcat: more iscsi docs
[…]

Details (like speed in RPMs) can also be seen with:

    show allPhysicalDisks;
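Commands like the above are run through SMcli (see below); a minimal
sketch, assuming `chi-san-01` resolves to the SAN's management
interface:

    # sketch: chi-san-01 as the management address is an assumption
    SMcli chi-san-01 -c 'show allPhysicalDisks;'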
### Host and group management
The existing machines in the `gnt-chi` cluster were all added at once,
alongside a group, with this script:
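The script itself is elided from this excerpt. A hypothetical sketch of
the SMcli commands such a script would wrap (the management address,
group and host labels are illustrative):

    # hypothetical sketch -- labels and management address are assumptions
    SMcli chi-san-01 -c 'create hostGroup userLabel="gnt-chi";'
    SMcli chi-san-01 -c 'create host userLabel="chi-node-01" hostGroup="gnt-chi";'
    SMcli chi-san-01 -c 'create iscsiInitiator identifier="iqn.1993-08.org.debian:01:chi-node-01" userLabel="chi-node-01-iscsi" host="chi-node-01";'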
[…] host:

    iscsiadm -m session -P 1 | grep 'Iface Initiatorname' | sort -u
Note that the above doesn't take into account CHAP authentication,
covered below.
### CHAP authentication
While we trust the local network (iSCSI is, after all, in the clear),
as a safety precaution, we do have password-based ([CHAP](https://en.wikipedia.org/wiki/Challenge-Handshake_Authentication_Protocol))
authentication between the clients and the server. This is configured
on the `iscsiInitiator` object on the SAN, with a setting like:
    set iscsiInitiator ["chi-node-01-iscsi"] chapSecret="[REDACTED]";
The password comes from Trocla, in Puppet. On the client, it can be
found with:

    grep node.session.auth.password /etc/iscsi/iscsid.conf
The client's "username" is the iSCSI initiator identifier, which maps
to the `iscsiName` setting on the SAN side. For `chi-node-01`, it
looks something like:
    iqn.1993-08.org.debian:01:chi-node-01
See above for details on the iSCSI initiator setup.
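On the client side, the corresponding `iscsid.conf` settings look
something like this (a sketch; the real values come from Trocla via
Puppet):

    # /etc/iscsi/iscsid.conf excerpt (sketch; values are placeholders)
    node.session.auth.authmethod = CHAP
    node.session.auth.username = iqn.1993-08.org.debian:01:chi-node-01
    node.session.auth.password = [REDACTED]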
We do one-way CHAP authentication (the clients authenticate to the
server). We do not do it both ways, because we have multiple SAN
servers and we haven't figured out how to make `iscsid` talk to
multiple SANs at once: there's only *one*
`node.session.auth.username_in` setting, and since it holds the iSCSI
target identifier, which differs between SANs, a single value cannot
match them all.
### Creating a disk
This will create a disk:
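The exact commands are elided from this excerpt; a hypothetical SMcli
sketch for creating a virtual disk and mapping it to a host group
(disk group, label, size and LUN are placeholders):

    # hypothetical sketch -- disk group, label, size and LUN are assumptions
    SMcli chi-san-01 -c 'create virtualDisk diskGroup=0 userLabel="test-01" capacity=10 GB;'
    SMcli chi-san-01 -c 'set virtualDisk ["test-01"] logicalUnitNumber=2 hostGroup="gnt-chi";'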
[…] host group:
This will actually not show up on the clients until they run:
    iscsiadm -m node --rescan
    iscsiadm -m node --logout
Note that this will make *ALL* disks disappear on the client side,
which can obviously cause catastrophic failures.

TODO: Obviously, a better way is needed to handle removal of a single
disk.
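One untested idea: `iscsiadm` can log out of a single target on a
single portal instead of everything, which might be a starting point:

    # log out of one target only (IQN and portal are placeholders)
    iscsiadm -m node -T iqn.1984-05.com.dell:powervault.md3200i.example -p 10.0.0.1:3260 --logout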
To actually delete a disk:
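The actual command is elided from this excerpt; a hypothetical SMcli
sketch:

    # hypothetical sketch -- the disk name is a placeholder
    SMcli chi-san-01 -c 'delete virtualDisk ["test-01"];'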
[…] for auto disk replacement of failed hard drives.
Upstream has a [technical guide book](https://www.dell.com/downloads/global/products/pvaul/en/powervault-md3200i-md3220i-technical-guidebook-en.pdf) with more complete
specifications.
The machines do not run a regular operating system (like, say, Linux),
or at least do not provide traditional command-line interfaces like
telnet or SSH, or even a web interface. Operations are performed
through a proprietary tool called "SMcli", detailed below.
Here's the exhaustive list of the hardware RAID units -- which we call
[SAN][] -- and whether each is correctly set up (checked):
* [x] `chi-san-01`: ~28TiB total: 28 1TB 7200 RPM drives
* [ ] `chi-san-02`: ~40TiB total: 40 1TB 7200 RPM drives
* [ ] `chi-san-03`: ~36TiB total: 47 800GB 10000 RPM drives
* [ ] `chi-san-04`: ~38TiB total: 48 800GB 10000 RPM drives
* Total: 144TiB, not counting mirrors (around 72TiB total in RAID-1,
  96TiB in RAID-5)
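For reference, the total is the raw sum of drive capacities: 28×1TB +
40×1TB + 47×0.8TB + 48×0.8TB = 144TB (TB and TiB are used loosely in
the figures above).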
A node that is correctly set up has the correct host groups, hosts, and
iSCSI initiators configured, with CHAP passwords.
All SANs were checked for the following during the [original
setup](https://gitlab.torproject.org/tpo/tpa/team/-/issues/40131):
* [x] batteries status ("optimal")
* [x] correct labeling (`chi-san-0X`)
* [ ] disk inventory (replace or disable all failing disks)
* [ ] set up spares
Spare disks can easily be found at [harddrivesdirect.com](https://harddrivesdirect.com/dell_build_PowerVault_MD3220i.php), but are
fairly expensive for this platform (115$USD for 1TB 7.2k RPM, 145$USD
for 10k RPM). It seems like the highest density per drive they have
available is 2TB, which would give us about 80TiB per server, but at
the whopping cost of 12,440$USD ([255$USD per unit in a 5-pack](https://harddrivesdirect.com/product_info.php?products_id=483015))!

It must be said that this site takes a heavy markup... The typical
drive used in the array (Seagate ST9900805SS, 900GB 10k RPM) sells for
[186$USD](https://www.harddrivesdirect.com/product_info.php?products_id=459510_ST9900805SS) right now, while it's [154$USD at NewEgg](https://www.newegg.ca/p/pl?d=ST9900805SS) and [90$USD
at Amazon](https://www.amazon.com/Seagate-Savvio-2-5-Inch-Internal-ST9900805SS/dp/B008HDKUKC/ref=sr_1_2?dchild=1&keywords=ST9900805SS&qid=1614802728&refinements=p_n_feature_four_browse-bin%3A676133011&rnid=676128011&s=pc&sr=1-2). Worse, a typical Seagate IronWolf 8TB SATA sells for
[516$USD](https://www.harddrivesdirect.com/product_info.php?products_id=493505_ST8000VN0022) while [Newegg lists them at 290$USD](https://www.newegg.ca/p/pl?d=ST8000VN0022). That "same day
delivery" has a cost... And it's actually fairly hard to find those
old drives on other sites, so we probably pay a premium there as well.
### iSCSI cluster management tools setup
To access the iSCSI servers, you need to set up the (proprietary)
[…] Then the device is available as a unique device in: […]
## Glossary
* **[SAN][]**: storage area network
* **[iSCSI](https://en.wikipedia.org/wiki/ISCSI)**: SCSI over "internet", allows block devices to be
mounted over TCP/IP
* **iSCSI initiator**: an iSCSI "client"
[…]
multiple network interfaces on the initiator, target, running over
distinct network switches (or at least VLANs)
[SAN]: https://en.wikipedia.org/wiki/Storage_area_network
## Network topology
The network at Cymru is split into different VLANs:
[…]