... | @@ -961,7 +961,180 @@ TPA should approve policy changes as per [tpa-rfc-1](/policy/tpa-rfc-1-policy). |
## Proposed Solution
To improve on the above "Goals", I would suggest the following
configuration.

TL;DR:

 1. Use a control repository
 2. Get rid of 3rdparty
 3. Deploy with g10k
 4. Authenticate with checksums
 5. Deploy to branch-specific environments
 6. Rename the default branch "production"
 7. Push directly on the Puppet server
 8. Use a role account
 9. Use local test environments
 10. Develop a test suite
 11. Hook into CI
 12. OpenPGP verification and web hook

Steps 1 to 8 could be implemented without too much difficulty and should
be a mid-term objective. Steps 9 to 12 require significantly more work
and could be implemented once the new infrastructure stabilizes.

What follows is an explanation and justification of each step.

### Use a control repository

The base of the infrastructure is a [control-repo](https://puppet.com/docs/pe/latest/control_repo.html) ([example](https://github.com/puppetlabs/control-repo))
which chain-loads all the other modules. This implies turning all our
"modules" into "profiles" and moving "real" modules (which are fit for
public consumption) "outside", into public repositories (see also
[issue 29387: publish our puppet repository](https://gitlab.torproject.org/tpo/tpa/team/-/issues/29387)).

Note that the control repository *could* also be public: we could
simply have the private data inside of Hiera or some other private
repository.

The control repository concept is specific to the proprietary version
of Puppet (Puppet Enterprise or PE) but its logic should be usable
with the open source Puppet release as well.

### Get rid of 3rdparty

The control repo's core configuration file is the `Puppetfile`. We
already use a Puppetfile, but only to manage modules inside of the
`3rdparty` directory. Now it would manage *all* modules, or, more
specifically, `3rdparty` would become the default `modules` directory
which would, incidentally, encourage us to upstream our modules and
publish them to the world.

Our current `modules` directory would move into `site-modules`, which
is the designated location for "roles, profiles, and custom
modules". This has been suggested before in [issue 29387: publish our
puppet repository](https://gitlab.torproject.org/tpo/tpa/team/-/issues/29387) and is important for the `Puppetfile` to do its
job.

### Deploy with g10k

It seems clear that everyone is converging on the use of a
`Puppetfile` to deploy code. There are still monorepos out
there, but they do make our life harder, especially when we need to
operate on non-custom modules.

Instead, we should converge towards *not* following upstream modules
in our git repository. Modules managed by the `Puppetfile` would *not*
be managed in our git monorepo and, instead, would be deployed by
`r10k` or `g10k`.

### Authenticate code with checksums

This part is the main problem with moving away from a monorepo. By
using a monorepo, we can audit the code we push into production. But
if we offload this to `r10k`, it can download code from wherever the
`Puppetfile` says, effectively shifting our trust path from OpenSSH
to HTTPS, the Puppet Forge, git and whatever remote gets added to the
`Puppetfile`.

There is no obvious solution for this right now, surprisingly. Here
are two possible alternatives:

 1. [g10k](https://github.com/xorpaul/g10k/) supports using a `:sha256sum` parameter to checksum
    modules, but that only works for Forge modules. Maybe we could
    pair this with using an explicit `sha1` reference for git
    repositories, ensuring those are checksummed as well. The downside
    of that approach is that it leaves checked out git repositories in
    a "detached head" state.

 2. `r10k` has a [pending pull request](https://github.com/puppetlabs/r10k/pull/823) to add a `filter_command`
    directive which could run after a git checkout has been
    performed. It could presumably be used to verify OpenPGP
    signatures on git commits, although this would work only on
    modules we sign commits on (and therefore not third party).

It seems the best approach would be to use g10k for now with checksums
on both git commits and Forge modules.

A validation hook running *before* g10k *could* validate that all `mod`
lines have a `checksum` of some sort...

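As a rough illustration of such a validation hook, here is a minimal Python sketch that lists the `mod` declarations lacking any pinning option. It rests on several assumptions: that `:sha256sum` (for Forge modules) and `:commit` (for git modules) are the options we accept as a "checksum", that declarations only continue onto the next line after a trailing comma, and that `#` never appears inside a URL. The module names, URLs and hash values in the sample are hypothetical placeholders.

```python
import re

# Options accepted as "pinned". This policy is an assumption of this sketch.
PINNING_OPTIONS = (":sha256sum", ":commit")


def mod_declarations(puppetfile_text):
    """Yield each `mod` declaration as one string, joining continuation lines."""
    current = None
    for raw in puppetfile_text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # naive comment stripping
        if not line.strip():
            continue
        if re.match(r"\s*mod\s", line):
            if current is not None:
                yield current
            current = line.strip()
        elif current is not None and current.endswith(","):
            # the previous line ended with a comma: same declaration
            current += " " + line.strip()
        else:
            if current is not None:
                yield current
            current = None
    if current is not None:
        yield current


def unpinned_modules(puppetfile_text):
    """Return the names of modules declared without any pinning option."""
    missing = []
    for decl in mod_declarations(puppetfile_text):
        name = re.match(r"mod\s+['\"]([^'\"]+)['\"]", decl)
        if not any(opt in decl for opt in PINNING_OPTIONS):
            missing.append(name.group(1) if name else decl)
    return missing


# Hypothetical Puppetfile: names, URLs and hashes are placeholders.
SAMPLE = """
forge 'https://forge.puppet.com'

mod 'example/pinned_forge', '1.2.3', :sha256sum => '0123abcd'
mod 'example/unpinned_forge', '4.5.6'
mod 'pinned_git',
  :git => 'https://git.example.com/pinned_git.git',
  :commit => '0123456789abcdef0123456789abcdef01234567'
mod 'floating_git',
  :git => 'https://git.example.com/floating_git.git',
  :branch => 'main'
"""

if __name__ == "__main__":
    print(unpinned_modules(SAMPLE))
```

A hook could refuse the push whenever that list is not empty. Note that this only checks that a pin is *present*; verifying that the downloaded code matches the pin is still left to g10k and git.
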
Note that this approach does *NOT* solve the "double-commit" problem
identified in the Goals. It is believed that only a "monorepo" would
fix that problem and that approach comes in direct conflict with the
"collaboration" requirement. We chose the latter.

### Deploy to branch-specific environments

A key feature of r10k (and, of course, g10k) is that they are capable
of deploying code to new environments depending on the branch we're
working on. We would enable that feature to allow testing some large
changes to critical code paths without affecting all servers.

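To make the branch-to-environment mapping concrete, here is a small Python sketch of how a branch name could be turned into an environment directory. Puppet environment names may only contain lowercase letters, digits and underscores, so other characters are replaced. The lowercasing step and the `basedir` default are assumptions of this sketch, not necessarily what r10k or g10k do, and the path would differ with the Debian packages.

```python
import re


def environment_name(branch):
    """Map a git branch name to a name made of lowercase letters, digits and underscores."""
    return re.sub(r"[^a-z0-9_]", "_", branch.lower())


def environment_path(branch, basedir="/etc/puppetlabs/code/environments"):
    """Return the directory a branch would be deployed to (basedir is an assumption)."""
    return f"{basedir}/{environment_name(branch)}"
```

An agent could then be pointed at a topic branch with `puppet agent --test --environment feature_new_ldap`, leaving all other servers on the default environment.
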
### Rename the default branch "production"

In accordance with Puppet's best practices, the control repository's
default branch would be called "production" and not "master".

Also: Black Lives Matter.

### Push directly on the Puppet server

Because we are worried about the GitLab attack surface, we could still
keep on pushing to the Puppet server for now. The control repository
could be mirrored to GitLab using a deploy key. All other repositories
would be published on GitLab anyway, and there the attack surface
would not matter because of the checksums in the control repository.

### Use a role account

To avoid permission issues, use a role account (say `git`) to accept
pushes and enforce git hooks.

### Use local test environments

It should eventually be possible to test changes locally before
pushing to production. This would involve radically simplifying the
Puppet server configuration and probably either getting rid of the
LDAP integration or at least making it optional so that changes can be
tested without it.

This would involve "puppetizing" the Puppet server configuration so
that a Puppet server and test agent(s) could be bootstrapped
automatically. Operators would run "smoke tests" (running Puppet by
hand and looking at the result) to make sure their code works before
pushing to production.

### Develop a test suite

The next step is to start working on a test suite for services, at
least for new deployments, so that code can be tested without running
things by hand. Plenty of Puppet modules have such a test suite,
generally using [rspec-puppet](https://rspec-puppet.com/) and [rspec-puppet-facts](https://github.com/mcanevet/rspec-puppet-facts), and we
already have a few modules in `3rdparty` that have such tests. The
idea would be to have those tests on a per-role or per-profile basis.

### Hook into continuous integration

Once tests are functional, the last step is to move the control
repository into GitLab directly and start running CI against the
Puppet code base. This would probably not happen until GitLab CI is
deployed, and would require lots of work to get there, but would
eventually be worth it.

The GitLab CI results would only be indicative: an operator would need
to push to a topic branch there first to confirm tests pass but would
still push directly to the Puppet server for production.

### OpenPGP verification and web hook

To stop pushing directly to the Puppet server, we could implement
OpenPGP verification on the control repository. If a hook checks that
commits are signed by a trusted party, it does not matter where the
code is hosted.

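The signature check could look something like the following `pre-receive` hook, written here as a Python sketch. It relies on `git rev-list` to enumerate the commits a push adds and on `git verify-commit` to check each one. Which keys count as "trusted" is an assumption left to the GnuPG keyring and git configuration of the account running the hook. The function names are made up for this example.

```python
import subprocess
import sys

# All-zero object name git uses for created or deleted refs (SHA-1 repositories).
ZERO = "0" * 40


def parse_updates(stdin_text):
    """Parse pre-receive input: one '<old> <new> <ref>' triple per line."""
    updates = []
    for line in stdin_text.splitlines():
        if line.strip():
            old, new, ref = line.split(" ", 2)
            updates.append((old, new, ref))
    return updates


def rev_list_args(old, new):
    """Build the `git rev-list` command selecting the commits a push adds."""
    if new == ZERO:  # ref deletion: nothing to verify
        return None
    if old == ZERO:  # new ref: only commits not reachable from existing refs
        return ["git", "rev-list", new, "--not", "--all"]
    return ["git", "rev-list", f"{old}..{new}"]


def unverified_commits(old, new):
    """Return the pushed commits for which `git verify-commit` fails."""
    args = rev_list_args(old, new)
    if args is None:
        return []
    listing = subprocess.run(args, check=True, capture_output=True, text=True)
    return [
        commit
        for commit in listing.stdout.split()
        if subprocess.run(["git", "verify-commit", commit],
                          capture_output=True).returncode != 0
    ]


def main():
    """Return 1 (reject the push) if any pushed commit fails verification."""
    failed = False
    for old, new, ref in parse_updates(sys.stdin.read()):
        for commit in unverified_commits(old, new):
            print(f"{ref}: commit {commit} has no valid signature", file=sys.stderr)
            failed = True
    return 1 if failed else 0

# In an actual hook file, finish with: sys.exit(main())
```

This version requires every pushed commit to be signed. A looser policy, checking only the tip of each pushed ref, would be a one-line change but gives weaker guarantees about intermediate commits.
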
We could use the [webhook](https://github.com/voxpupuli/puppet_webhook) system to have GitLab notify the Puppet
server to pull code.

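For illustration only, the receiving side of such a notification boils down to something like the Python sketch below. This is *not* how puppet_webhook is implemented. It only assumes what GitLab sends with push events: the configured secret in the `X-Gitlab-Token` header and a JSON body with a `ref` field such as `refs/heads/production`. The `deploy` callable is hypothetical and stands in for whatever runs g10k for a given branch.

```python
import hmac
import json


def branch_from_push(payload_bytes):
    """Return the branch name from a push event body, or None for other refs."""
    event = json.loads(payload_bytes)
    ref = event.get("ref", "")
    prefix = "refs/heads/"
    return ref[len(prefix):] if ref.startswith(prefix) else None


def token_is_valid(received, expected):
    """Compare the shared secret in constant time."""
    return hmac.compare_digest(received.encode(), expected.encode())


def handle_push(headers, payload_bytes, expected_token, deploy):
    """Return an HTTP status code; call deploy(branch) for valid branch pushes.

    `headers` is a plain dict here, so the lookup is case-sensitive; a real
    HTTP framework would normally handle header case for us.
    """
    if not token_is_valid(headers.get("X-Gitlab-Token", ""), expected_token):
        return 403
    branch = branch_from_push(payload_bytes)
    if branch is None:
        return 204  # tag push or similar: nothing to deploy
    deploy(branch)
    return 200
```

The shared token only authenticates GitLab to the Puppet server. It is the OpenPGP verification of the fetched commits that would let us stop trusting where the code is hosted.
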
## Cost
... | @@ -1125,3 +1298,52 @@ specific remotes in subdirectories of the monorepo automatically. |
| Subtree | "best of both worlds" | Still get double-commit, rebase problems | Not sure it's worth it |
| Subrepo | ? | ? | ? |
| myrepos | Flexible | Esoteric | might be useful with our monorepo |

### Best practices survey

I made a survey of the community (mostly the [shared puppet
modules](https://gitlab.com/shared-puppet-modules-group/) and [Voxpupuli](https://voxpupuli.org/) groups) to find out what the
current best practices are.

Koumbit uses foreman/puppet but pinned at version 10.1 because it is
the last one supporting "passenger" (the puppetmaster deployment
method currently available in Debian, deprecated and dropped from
puppet 6). They [patched it](https://redmine.koumbit.net/projects/theforeman-puppet/repository/revisions/5b1b0b42f2d7d7b01eacde6584d3) to support `puppetlabs/apache < 6`.
They push to a bare repo on the puppet master, then they have
validation hooks (the inspiration for our #31226), and a hook deploys
the code to the right branch.

They were using r10k but stopped because they had issues when r10k
would fail to deploy code atomically, leaving the puppetmaster (and
all nodes!) in an unusable state. This would happen when their git
servers were down without a locally cached copy. They also implemented
branch cleanup on deletion (although that could have been done some
other way). The atomic deployment issue was apparently reported against
r10k but never got a response. They now use librarian-puppet in their
custom hook. Note that it's possible r10k does not actually have that
issue, because when they found the issue they had filed, it was... [against
librarian](https://github.com/voxpupuli/librarian-puppet/issues/73)!

Some people in #voxpupuli seem to use the Puppetlabs Debian packages
and therefore puppetserver, r10k and puppetboard. Their [Monolithic
master](https://voxpupuli.org/docs/monolithic/) architecture uses an external git repository, which pings
the puppetmaster through a [webhook](https://github.com/voxpupuli/puppet_webhook) which deploys a
[control-repo](https://puppet.com/docs/pe/latest/control_repo.html) ([example](https://github.com/puppetlabs/control-repo)) and calls r10k to deploy the
code. They also use [foreman](https://www.theforeman.org/) as a node classifier. That procedure
uses the following modules:

 * [puppet/puppetserver](https://forge.puppet.com/puppet/puppetserver)
 * [puppetlabs/puppet_agent](https://forge.puppet.com/puppetlabs/puppet_agent)
 * [puppetlabs/puppetdb](https://forge.puppet.com/puppetlabs/puppetdb)
 * [puppetlabs/puppet_metrics_dashboard](https://forge.puppet.com/puppetlabs/puppet_metrics_dashboard)
 * [voxpupuli/puppet_webhook](https://github.com/voxpupuli/puppet_webhook)
 * [r10k](https://github.com/puppetlabs/r10k) or [g10k](https://github.com/xorpaul/g10k)
 * [Foreman](https://www.theforeman.org/)

They also have a [master of masters](https://voxpupuli.org/docs/master_agent/) architecture for scaling to
larger setups. That said, for scaling, I have found [this article](https://puppet.com/blog/scaling-open-source-puppet/) to be
more interesting.

So, in short, it seems people are converging towards r10k with a
web hook. To validate git repositories, they mirror the repositories
to a private git host.