distribution, user and some service management is done from a central
location, managed in a git repository. This approach is often called
[Infrastructure as code](https://en.wikipedia.org/wiki/Infrastructure_as_Code).

This section also documents possible improvements to our Puppet
configuration that we are considering.
### Must have
* **secure**: only sysadmins should have access to push configuration,
  whatever happens. This includes deploying only audited and verified
  Puppet code into production.
* **code review**: changes on servers should be verifiable by our peers,
  through a git commit log
* **fix permissions issues**: the deployment system should allow all admins
  to push code to the Puppet server without having to constantly fix
  permissions (e.g. through a [role account](https://gitlab.torproject.org/tpo/tpa/team/-/issues/29663))
* **secrets handling**: there are some secrets in Puppet. Those
  should remain secret.

We mostly have this now, although there are concerns about permissions
being wrong sometimes, which a role account could fix.
### Nice to have
These are mostly issues with the current architecture we'd like to fix:

* **Continuous Integration**: before deployment, code should be vetted by
  a peer and, ideally, automatically checked for errors and tested
* **single source of truth**: when we add/remove nodes, we should not
  have to talk to multiple services (see also the [install automation
  ticket](https://gitlab.torproject.org/tpo/tpa/team/-/issues/31239) and the [new-machine discussion](new-machine#discussion))
* **collaboration** with other sysadmins outside of TPA, for which we
  would need to...
  * ... **publicize our code** (see [ticket 29387](https://gitlab.torproject.org/tpo/tpa/team/-/issues/29387))
* **no manual changes**: every change on every server should be committed
  to version control somewhere
* **bare-metal recovery**: it should be possible to recover a service's
  *configuration* from a bare Debian install with Puppet (and with
  data from the [backup](backup) service of course...)
* **one commit only**: we shouldn't have to commit "twice" to get
  changes propagated (once in a submodule, once in the parent module,
  for example)
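
As an illustration of the "automatically checked for errors" part of the Continuous Integration item above, here is a minimal sketch of a pre-deployment syntax check. It is not something we have deployed: the function names are hypothetical, the list of changed paths is assumed to come from the CI system, and it only covers parse errors through `puppet parser validate`, not peer review or real tests.

```python
import subprocess
from pathlib import PurePosixPath


def manifests_to_check(changed_paths):
    """Return the changed paths that look like Puppet manifests (*.pp),
    de-duplicated and sorted."""
    return sorted({p for p in changed_paths if PurePosixPath(p).suffix == ".pp"})


def validation_command(manifests):
    """Build the `puppet parser validate` command line for the given manifests."""
    return ["puppet", "parser", "validate", *manifests]


def validate(changed_paths, run=subprocess.run):
    """Syntax-check the changed manifests.

    Returns True when there is nothing to check or when the check exits
    with status 0. `run` is injectable so the logic can be exercised
    without Puppet installed (an assumption made for this sketch).
    """
    manifests = manifests_to_check(changed_paths)
    if not manifests:
        return True
    return run(validation_command(manifests)).returncode == 0
```

A CI job could call `validate()` with the paths changed by a commit (for example the output of `git diff --name-only`, which is an assumption about how the job would be wired) and fail the pipeline when it returns `False`.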
### Non-Goals
* **ad hoc changes** to the infrastructure. One-off jobs should be
  handled by [fabric](fabric), Cumin, or straight SSH.
## Approvals required