anarcat authored
Mostly consistency in capitalization, adding some links to related issues, spell-checking and splitting long sentences. See: team#41948
tpa-rfc-77-puppet-merge.md 5.24 KiB
title: Puppet merge
deadline: 2025-02-10
status: proposed
Background
TPA-RFC-73 identified Puppet as a bottleneck for the infra merge, as it blocks keeping, migrating and merging several other services. Merging codebases and ditching one of the Puppet servers is a complex move, so in this document we detail how that will be done.
Proposal
Goals
Must have
- One Puppet Server to rule them all
- Adoption of TPA's solution for handling Puppet modules and ENC
- Convergence in Puppet modules versions
- Commit signing (as it's fundamental for Tails' current backup solution)
Non-goals
This proposal is not about:
- Completely refactoring and deduplicating code, as that will be done step by step while we handle each service individually after the Puppet Server merge
- Ditching one way to store secrets in favor of another, as that will be done separately in the future, after both teams have had the chance to experience Trocla and hiera-eyaml
- Tackling individual service merges, such as backups, DNS, monitoring and firewall; these will be tackled individually once all infra is under one Puppet Server
- Applying new code standards everywhere; at most, we'll come up with general guidelines that could (maybe should) be used for new code and, in the future, for refactoring
Phase 1: Codebase preparation
This phase ensures that, once Tails code is copied to Tor's Puppet Control repo:
- Code structure will match and be coherent
- Tails code will not affect Tor's infra and Tor's code will not affect Tails infra
Note: Make sure to freeze all Puppet code refactoring on both sides before starting.
Converge in structure
- Tails:
  - Switch from Git submodules to using g10k in a monorepo
  - Remove ENC configuration: Tails doesn't really use it, and the Puppet server switch will implement Tor's instead
  - Move node definitions under `manifests/nodes.pp` to roles
  - Switch to the directory structure used by Tor:
    - Move custom non-profile modules to `legacy/` (`monitoring`, `apache`, `passenger`), leaving only 3rd party modules under `modules/`
    - Rename `hieradata` to `data`
    - Rename `profiles` to `site`
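
As a rough illustration of moving node definitions to roles, the sketch below shows one possible before/after. The hostname `www.example.org`, the `role::tails_website` class and the `tails::profile::website` profile are hypothetical names chosen for the example, and it assumes a `role` module is available on the module path; the actual layout may differ.

```puppet
# Before: a node definition in manifests/nodes.pp (hypothetical host),
# kept here as a comment for comparison.
#
#   node 'www.example.org' {
#     include tails::profile::website
#   }

# After: the same intent expressed as a role class (hypothetical name).
# The node definition goes away; which node gets which role is decided
# outside the manifests, e.g. by the ENC once the Puppet server switch
# has happened.
class role::tails_website {
  # Profiles still use the tails::profile prefix at this stage; the
  # rename to profile::tails is part of "Converge in substance".
  include tails::profile::website
}
```

Keeping roles as thin lists of `include` statements means the later node-to-role assignment can be done without touching the profiles themselves.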
Converge in substance
- Tails:
  - Refactor the legacy `apache` and `passenger` modules out of existence
  - Rename all profiles from `tails::profile` to `profile::tails`
  - Ensure all exported resources' tags are prefixed with `tails_`
- Tor:
  - Install all `3rdparty` modules that are used by Tails but not by Tor
  - Isolate all exported resources and collectors using tags
  - Ensure there is a parameter to disable all 'base' functionality (i.e., nothing gets installed on a puppet node that is not explicitly included in the role)
  - Enforce signed commits
  - Ensure all private data is moved to Trocla and publish the repo (tpo/tpa/team#29387)
  - Install EYAML, copy the EYAML keys from the Tails to the Tor puppet server, and adapt `hiera.yaml` to use them
- Tor and Tails:
  - Upgrade 3rdparty modules to match versions
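
The two isolation items above (tagged exported resources and a switch for 'base' functionality) could look roughly like the following sketch. All class names, file paths, the `tails_example_peers` tag and the `$enabled` parameter are assumptions made up for illustration; only the `tails_` prefix comes from this proposal.

```puppet
# Hypothetical Tails-side profile exporting a resource. Exported
# resources need storeconfigs (PuppetDB) on the Puppet server.
class profile::tails::example_exporter {
  @@file { "/etc/example/peers.d/${facts['networking']['fqdn']}":
    ensure  => file,
    content => "${facts['networking']['ip']}\n",
    # The tails_ prefix keeps this out of Tor-side collectors.
    tag     => 'tails_example_peers',
  }
}

# Hypothetical collector. Filtering on the tag matters: an unfiltered
# collector (File <<| |>>) would realize every exported file resource
# from both infrastructures.
class profile::tails::example_collector {
  File <<| tag == 'tails_example_peers' |>>
}

# Hypothetical base class with a switch that turns off all default
# functionality, so a node only gets what its role includes explicitly.
class profile::example_base (
  Boolean $enabled = true,
) {
  if $enabled {
    file { '/etc/motd':
      ensure  => file,
      content => "Managed by Puppet\n",
    }
  }
}
```

With automatic parameter lookup, the switch in this sketch could be flipped for a group of nodes by setting `profile::example_base::enabled: false` in Hiera data.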
Phase 2: Puppet server switch
This phase moves all nodes from one Puppet server to the other:
- Copy `legacy` modules from Tails to Tor
- Copy roles and profiles from Tails to Tor
- Assign nodes to roles using the ENC
- Point Tails nodes to the Tor Puppet server
- Retire the Tails Puppet server
Phase 3: Towards a more homogeneous codebase
This phase paves the way towards a cleaner future:
- One by one, for each profile in `profile::tails`:
  - Move the profile to `profile` (without `::tails`), or
  - Merge the profile with an existing one in `profile`
- Deduplicate, refactor, clean up, etc.
- Define code standards (documentation, linting, pre-commit hooks, etc.)
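
Moving a single profile out of the `profile::tails` namespace might look like the sketch below. The `example` profile, its `$greeting` parameter and the file it manages are hypothetical, and the transitional wrapper class is an optional idea rather than part of this proposal.

```puppet
# File: profile/manifests/example.pp
# The profile after the move, now without the ::tails segment.
class profile::example (
  String $greeting = 'hello',
) {
  file { '/etc/example.conf':
    ensure  => file,
    content => "${greeting}\n",
  }
}

# File: profile/manifests/tails/example.pp
# Optional transitional wrapper (an assumption of this sketch): roles
# that still include the old name keep working until they are updated.
# Hiera keys such as profile::tails::example::greeting would need to
# be renamed to profile::example::greeting as part of the move.
class profile::tails::example {
  include profile::example
}
```

The file paths in the comments follow Puppet's autoloader convention, where `profile::tails::example` maps to `manifests/tails/example.pp` inside the `profile` module.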
Alternatives considered
- Migrate services to TPA before moving Puppet: some of the Tails services heavily depend on others and/or on the network setup. For example, Jenkins Agents on different machines talk to a Jenkins Orchestrator and a Gitolite server hosted on different VMs, then build nightly ISOs that are copied to the web VM and published over HTTP. Migrating all of these over to TPA's infra would be much more complex than just merging Puppet.