convert relay-01 into a TPA-RFC-7 root exception for network-health team
in #40392 (closed), we set up a box but we didn't give @dgoulet and the network-health people enough privileges.
in that ticket, it was agreed we needed a TPA-RFC-7 exception and should just grant them root access. the rationale is, as stated by @dgoulet in that ticket:
First, without root, we have to deal with sudo and I'll need to ask the TPA team regularly to install various packages. And that list of packages over time is ever evolving often.
The second thing is that some of the tracing tool load kernel modules and these modules can change or be changed by us over time. And so, a non root situation makes things all just time consuming and lot more painful to access any files on the system or logs.
Last thing is that we sometimes need to run services on the side of tor (dnsmasq, profilers, etc...) and I don't have a list and if I have to go through TPA for hours before I can get these, it is not worth it.
In other words, I'm not sure this machine is of any use to us if we can't get root in order to assess and investigate a lot of things from the kernel, tor and other services we might run.
The agreement is under the following conditions:
We'll grant you sudo su access (or whatever, the equivalent), but you'll still need to login with your regular user to get that access, I hope that is sufficient.
We agree with this under the condition that "other services we might run" still remains in scope for this ticket. We don't want this box to become the place where you run random unrelated stuff "just because you have root" there.
[...]
We will disable alerting on this server so that you can break things without annoying us.
Also, if things break on this box, you're on your own. From here on, if things break, you get to keep both pieces. We'll help in a best effort basis, but expect "we can reinstall the server" as an answer to some tech support requests. :)
So the actual task here is to:
- revert the profile::tor::systemd_user class, although we might want to keep git and build-essential installed (or at least unmanaged) since they'll need it anyways (and probably remove that code altogether)
- deploy a "sudo su" (or equivalent) to allow the users of the network-health team to "sudo to root" on boxes matching that role (and only those), probably reusing the code in profile::tor::systemd_user to deploy a snippet in sudoers.d
- disable alerting (or monitoring?) of the server resources (but not ping?) in nagios somehow
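
For the sudoers.d item, a minimal Puppet sketch of what the snippet deployment could look like. This is only an illustration, not the existing profile::tor::systemd_user code: the class name, the file name and the network-health group name are all assumptions, and the rule grants full sudo rather than only "sudo su".

```puppet
# Hypothetical class name; the real code would probably live in (or replace)
# profile::tor::systemd_user and only be included by the matching role.
class profile::tor::network_health_root {
  # Files in /etc/sudoers.d must not contain a "." in their name,
  # otherwise sudo silently skips them.
  file { '/etc/sudoers.d/network-health':
    ensure       => file,
    owner        => 'root',
    group        => 'root',
    mode         => '0440',
    # Assumes a "network-health" Unix group exists on the box (hypothetical).
    # Members still log in as their regular user, then sudo to root.
    content      => "%network-health ALL=(ALL:ALL) ALL\n",
    # Refuse to install a snippet that visudo cannot parse.
    validate_cmd => '/usr/sbin/visudo -cf %',
  }
}
```

With something like this in place, a member of that group would log in with their own account and run `sudo -i` (or `sudo su`) to get a root shell.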
@lavamind expressed interest in taking care of this.