TPA uses [Puppet](https://puppet.com/) to manage all servers it operates. It handles
most of the configuration management of the base operating system and
some services. It is *not* designed to handle ad-hoc tasks, for which
we use other tools (such as the scripts in the `fabric-tasks`
repository mentioned below).
This page is long! This first section hopes to get
you running with a simple task quickly.
## Adding a "message of the day" (motd) on a server
To post announcements to shell users of a server, it might be a good
idea to post a "message of the day" (`/etc/motd`) that will show up on
login. Good examples are known issues, maintenance windows, or service
retirements.
This change should be fairly inoffensive because it should affect only
a single server, and only the `motd`, so the worst that can happen
here is a silly motd gets displayed (or nothing at all).
Here is how to make the change:
1. To make any change on the Puppet server, you first need to clone
the git repository:
git clone git@puppet.torproject.org:/srv/puppet.torproject.org/git/tor-puppet
This only needs to be done once.
2. The messages are managed by the `motd` module, but to easily add
an "extra" entry, you can add it to the Hiera data for the
specific host you want to modify. Let's say you want to add a
`motd` on `perdulce`, the current `people.torproject.org`
server. The file you will need to change (or create!) is
`hiera/nodes/perdulce.torproject.org.yaml`:
$EDITOR hiera/nodes/perdulce.torproject.org.yaml
3. Hiera stores data in YAML. So you need to create a little YAML
snippet, like this:
motd::extra: |
  Hello world!
4. Then you can commit this and *push*:
git commit -m"add a nice friendly message to the motd" && git push
5. Then you should login to the host and make sure the code applies
correctly, in dry-run mode:
ssh -tt perdulce.torproject.org sudo puppet agent -t --noop
6. If that works, you can do it for real:
ssh -tt perdulce.torproject.org sudo puppet agent -t
On next login, you should see your friendly new message. Do not forget
to revert the change!
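For example, assuming the motd change is the most recent commit on the
branch, the revert can be as simple as:

git revert HEAD && git push
ssh -tt perdulce.torproject.org sudo puppet agent -t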
The next tutorial is about a more elaborate change, performed on
multiple servers.
In this tutorial, we will add an IP address to the global allow list,
on all firewalls on all machines. This is a big deal! It will allow
that IP address to access the SSH servers on all boxes and more. This
should be a **static** IP address on a trusted network.
If you have never used Puppet before or are nervous at all
about making such a change, it is a good idea to have a more
experienced sysadmin nearby to help you. They can
also confirm this tutorial is what is actually needed.
1. To make any change on the Puppet server, you first need to clone
the git repository:
git clone git@puppet.torproject.org:/srv/puppet.torproject.org/git/tor-puppet
This only needs to be done once.
2. The firewall rules are defined in the `ferm` module, which lives
in `modules/ferm`. The file you specifically need to change is
`modules/ferm/templates/defs.conf.erb`, so open that in your
editor of choice:
$EDITOR modules/ferm/templates/defs.conf.erb
3. The code you are looking for is `ADMIN_IPS`. Add a `@def` for your
IP address and add the new macro to the `ADMIN_IPS` macro. When
you exit your editor, git should show you a diff that looks
something like this:
--- a/modules/ferm/templates/defs.conf.erb
+++ b/modules/ferm/templates/defs.conf.erb
@@ -77,7 +77,10 @@ def $TPO_NET = (<%= networks.join(' ') %>);
 @def $linus = ();
 @def $linus = ($linus 193.10.5.2/32); # kcmp@adbc
 @def $linus = ($linus 2001:6b0:8::2/128); # kcmp@adbc
-@def $ADMIN_IPS = ($weasel $linus);
+@def $anarcat = ();
+@def $anarcat = ($anarcat 203.0.113.1/32); # home IP
+@def $anarcat = ($anarcat 2001:DB8::DEAD/128 2001:DB8:F00F::/56); # home IPv6
+@def $ADMIN_IPS = ($weasel $linus $anarcat);
 @def $BASE_SSH_ALLOWED = ();
4. Then you can commit this and *push*:
git commit -m'add my home address to the allow list' && git push
5. Then you should login to one of the hosts and make sure the code
applies correctly:
ssh -tt perdulce.torproject.org sudo puppet agent -t
Puppet shows colorful messages. If nothing is red and it returns
correctly, you are done. If that doesn't work, go back to step 2. If
that doesn't work, ask for help from your colleague in the Tor
sysadmin team.
If this works, congratulations, you have made your first change across
the entire Puppet infrastructure! You might want to look at the rest
of the documentation to learn more about how to do different tasks and
how things are set up. A key "How to" we recommend is the `Progressive
deployment` section below, which will teach you how to make a change
like the above while making sure you don't break anything even if it
affects a lot of machines.
### Using environments
During ordinary maintenance operations, it's appropriate to work directly on the
default `production` branch, which deploys to the `production` environment.
However, for more complex changes, such as when deploying a new
service or adding a module (see below), it's recommended to start by
working on a feature branch which will deploy as a distinct
[environment](#environments) on the Puppet server.
To quickly test a different environment, you can switch the one used
by the Puppet agent with the `--environment`
flag. For example, this will switch a node from `production` to
`test`:
puppet agent --test --environment test
Note that this setting is **sticky**: further runs will *keep* the
`test` environment even if the `--environment` flag is not set, as the
setting is written in the `puppet.conf`. To reset to the `production`
environment, you can simply use that flag again:
puppet agent --test --environment production
A node or group of nodes can be switched to a different environment
using the [external node classifier](#external-node-classifier-enc).
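As a rough sketch, assuming the ENC reads per-node YAML files (the
exact layout depends on how the ENC is configured on the Puppet
server), such an entry could look like:

# hypothetical per-node ENC entry
environment: test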
Once the feature branch is satisfactory, it can be merged back into
`production` and deleted:
git merge test
git branch -d test
git push -d origin test
Branches are not deleted automatically after merge: make sure you
clean up after yourself.
Because environments aren't totally isolated from each other and a compromised
node could choose to apply an environment other than `production`, care should
be taken with the code pushed to these feature branches. It's recommended to
avoid overly broad debugging statements, if any, and to generally keep an
active eye on feature branches so as to prevent the accumulation of unreviewed
code.
Finally, note that environments are automatically destroyed (alongside
their branch) on the Puppet server two weeks after the last
commit to the branch. An email warning about this will be sent to the
author of that last commit. This doesn't destroy the mirrored branch
on GitLab.
When an environment is removed, Puppet agents will revert back to the
`production` environment automatically.
### Modifying an existing configuration
For new deployments, this is *NOT* the preferred method. For example,
if you are deploying new software that is not already in use in our
infrastructure, do *not* follow this guide; instead, follow the
procedure for adding a new module, described below.

If you are touching an *existing* configuration, however, things are
much simpler: you simply go to the module where the code already
exists and make changes. You `git commit` and `git push` the code,
then immediately run `puppet agent -t` on the affected node.
Look at the `File layout` section above to find the right piece of
code to modify. If you are making changes that potentially affect more
than one host, you should also definitely look at the `Progressive
deployment` section below.
This is a broad topic, but let's take the Prometheus monitoring system
as an example which followed the [role/profile/module][]
pattern.
First, the [Prometheus modules on the Puppet forge][] were evaluated
for quality and popularity. There was a clear winner there: the
[Prometheus module][] from [Vox Populi][] had hundreds of thousands
more downloads than the [next option][], which was deprecated.
[next option]: https://forge.puppet.com/brutus777/prometheus
[Vox Populi]: https://voxpupuli.org/
[Prometheus module]: https://forge.puppet.com/puppet/prometheus
[Prometheus modules on the Puppet forge]: https://forge.puppet.com/modules?q=prometheus
Next, the module was added to the `Puppetfile`, pinned to a specific
release:
mod 'puppet/prometheus', # 12.5.0
    :git => 'https://github.com/voxpupuli/puppet-prometheus.git',
    :commit => '25dd701b489fc32c892390fd464e765ebd6f513a' # tag: v12.5.0
- Since tpo/tpa/team#41974 we don't import 3rd-party code into our
repository anymore.
- Because of that, modules in the `Puppetfile` should always be pinned to a Git
repo and commit, as that's currently the simplest way to avoid some MITM
issues.
- We currently don't have an automated way of managing module dependencies, so
you'll have to manually and recursively add dependencies to the `Puppetfile`.
Sorry!
- Make sure to manually audit the code for each module, by reading each file
and looking for obvious security flaws or back doors.
Then the code was committed into git:
git add Puppetfile
git commit -m'install prometheus module and its dependencies after audit'
Then the module was configured in a profile, in `modules/profile/manifests/prometheus/server.pp`:
class profile::prometheus::server {
    class { 'prometheus::server':
        # follow prom2 defaults
        localstorage      => '/var/lib/prometheus/metrics2',
        storage_retention => '15d',
    }
}
The above contains our local configuration for the upstream
`prometheus::server` class. In
particular, it sets a retention period and a different path for the
metrics, so that they follow the new Prometheus 2.x defaults.
Then this profile was added to a *role*, in
`modules/roles/manifests/monitoring.pp`:
# the monitoring server
class roles::monitoring {
    include profile::prometheus::server
}
Notice how the role does not refer to any implementation detail, like
that the monitoring server uses Prometheus. It looks like a trivial,
even useless, class, but it can actually grow to include *multiple*
profiles.
Then that role is added to the Hiera configuration of the monitoring
server, in `hiera/nodes/hetzner-nbg1-01.torproject.org.yaml`:
classes:
- roles::monitoring
And Puppet was run on the host, with:

puppet agent --enable ; puppet agent -t --noop ; puppet agent --disable "testing prometheus deployment"
If you need to deploy the code to multiple hosts, see the `Progressive
deployment` section below. To contribute changes back upstream (and
you should do so), see the section right below.
### Contributing changes back upstream
Fork the upstream repository and operate on your fork until the changes are
eventually merged upstream.
Then, update the `Puppetfile` to point at your fork, for example:

mod 'puppet-prometheus',
    :git => 'https://github.com/anarcat/puppet-prometheus.git',
    :branch => 'deploy'
Note that the `deploy` branch here is a merge of all the different
branches proposed upstream in different pull requests, but it could
also be the `master` branch or a single branch if only a single pull
request was sent.
You'll have to keep a clone of the upstream repository somewhere outside of the
`tor-puppet` work tree, from which you can push and pull normally with
upstream. When you make a change, you need to commit (and push) the change in
your external clone and update the `Puppetfile` in the repository.
Ideally, Puppet modules have a test suite. This is done with
[rspec-puppet](https://rspec-puppet.com/) and [rspec-puppet-facts](https://github.com/mcanevet/rspec-puppet-facts). This is not very well
documented upstream, but it's apparently part of the [Puppet
Development Kit](https://puppet.com/docs/pdk/1.x/pdk.html) (PDK). Anyway: assuming tests exist, you will
want to run some tests before pushing your code upstream, or at least
upstream might ask you for this before accepting your changes. Here's
how to get set up:
sudo apt install ruby-rspec-puppet ruby-puppetlabs-spec-helper ruby-bundler
bundle install --path vendor/bundle
This installs some basic libraries, system-wide (Ruby bundler and the
rspec stuff). Unfortunately, required Ruby code is rarely all present
in Debian and you still need to install extra gems. In this case we
set it up within the `vendor/bundle` directory to isolate them from
the global search path.
Finally, to run the tests, you need to wrap your invocation with
`bundle exec`.
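For example, with modules using `puppetlabs_spec_helper` (an
assumption: the exact Rake task varies from module to module), the
spec suite typically runs with:

bundle exec rake spec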
### Validating Puppet code

You SHOULD run validation checks on commit, locally, before pushing
your manifests. To install those hooks, you should clone this repository:
git clone https://github.com/anarcat/puppet-git-hooks
... and deploy it as a pre-commit hook:
ln -s $PWD/puppet-git-hooks/pre-commit tor-puppet/.git/hooks/pre-commit
This hook is deployed on the server and *will* refuse your push if it
fails linting, see [issue 31226][] for a discussion.
### Password management

If you need to set a password in a manifest, there are special
functions to handle this. We do not want to store passwords directly
in Puppet source code, for various reasons: it is hard to erase
because code is stored in git, but also, ultimately, we want to
publish that source code publicly.
We use [Trocla][] for this purpose, which generates
random passwords and stores the hash or, if necessary, the clear-text
in a YAML file.
[Trocla]: https://github.com/duritong/trocla
Trocla's man page is not very useful, but you can see a list of subcommands in
the [project's README file][].
[project's README file]: https://github.com/duritong/trocla
With Trocla, each password is generated on the fly from a secure
entropy source ([Ruby's SecureRandom module][]) and stored inside a
state file (`/var/lib/trocla/trocla_data.yml`, configured in
`/etc/puppet/troclarc.yaml`) on the Puppet master.
Trocla can return "hashed" versions of the passwords, so that the
plain text password is never visible from the client. The plain text
can still be stored on the Puppet master, or it can be deleted once
it's been transmitted to the user or another password manager. This
makes it possible to have Trocla not keep any secret at all.
[Ruby's SecureRandom module]: https://ruby-doc.org/stdlib-1.9.3/libdoc/securerandom/rdoc/SecureRandom.html
This piece of code will generate a [bcrypt][]-hashed password for the
Grafana admin, for example:
$grafana_admin_password = trocla('grafana_admin_password', 'bcrypt')
The plain-text for that password will never leave the Puppet
master. It will still be stored there, and you can see the value
with:
trocla get grafana_admin_password plain
[bcrypt]: https://en.wikipedia.org/wiki/Bcrypt
A password can also be set with this command:
trocla set grafana_guest_password plain
Note that this might *erase* other formats for this password, although
those will get regenerated as needed.
Also note that `trocla get` will fail if the particular password or
format requested does not exist. For example, say you generate a
plain-text password and then ask for the `bcrypt` version:
trocla create test plain
trocla get test bcrypt
This will return an empty string instead of the hashed
version. Instead, use `trocla create test bcrypt` to generate that
format. In general, it's safe to use `trocla create`, as it will
reuse an existing password. That is actually how the `trocla()`
function behaves in Puppet as well.
TODO: Trocla can provide passwords to classes transparently, without
having to do function calls inside Puppet manifests. For example, this
code:

    class profile::grafana {
        $password = trocla('profile::grafana::password', 'plain')
        # ...
    }

... could be replaced by:

    class profile::grafana(String $password) {
        # ...
    }

... with the password provided through Hiera. For this to work:

1. Trocla needs to be included in Hiera
2. We need roles to be more clearly defined in Hiera, and to use
   Hiera as an ENC so that we can do per-role passwords (for
   example), which is not currently possible.
### Getting information from other nodes
A common pattern in Puppet is to deploy resources on a given host with
information from another host. For example, you might want to grant
access to host A from host B. And while you can hardcode host B's IP
address in host A's manifest, it's not good practice: if host B's IP
address changes, you need to change the manifest, and that practice
makes it difficult to introduce host C into the pool...
So we need ways of having a node use information from other nodes in
our Puppet manifests. There are 5 methods in our Puppet source code at
the time of writing:
* Exported resources
* PuppetDB lookups
* Puppet Query Language (PQL)
* LDAP lookups
* Hiera lookups
This section walks through how each method works, outlining the
advantage/disadvantage of each.
#### Exported resources

Our Puppet configuration supports [exported resources](https://puppet.com/docs/puppet/latest/lang_exported.html), a key
component of complex Puppet deployments. Exported resources allow one
host to define a configuration that will be *exported* to the Puppet
server and then *realized* on another host.
These exported resources are not confined by environments: for example,
resources exported by a node assigned to the `foo` environment will be
available to nodes in the `production` environment, and vice versa.
We commonly use this to punch holes in the firewall between nodes. For
example, this manifest in the `roles::puppetmaster` class:
@@ferm::rule::simple { "roles::puppetmaster-${::fqdn}":
    tag         => 'roles::puppetmaster',
    description => 'Allow Puppetmaster access to LDAP',
    port        => ['ldap', 'ldaps'],
    saddr       => $base::public_addresses,
}
... exports a firewall rule that will, later, allow the Puppet server
to access the LDAP server (hence the `port => ['ldap', 'ldaps']`
line). This rule doesn't take effect on the host applying the
`roles::puppetmaster` class, but only on the LDAP server, through this
rather exotic syntax:
Ferm::Rule::Simple <<| tag == 'roles::puppetmaster' |>>
This tells the LDAP server to apply whatever rule was exported with
the `@@` syntax and the specified `tag`. Any Puppet resource can be
exported and realized that way.
Note that there are security implications with collecting exported
resources: it delegates the resource specification of a node to
another. So, in the above scenario, the Puppet master could decide to
open *other* ports on the LDAP server (say, the SSH port), because it
exports the port number and the LDAP server just blindly applies the
directive. A more secure specification would explicitly specify the
sensitive information, like so:
Ferm::Rule::Simple <<| tag == 'roles::puppetmaster' |>> {
    port => ['ldap'],
}
But then a compromised server could send a different `saddr` and
there's nothing the LDAP server could do here: it cannot override the
address because it's exactly the information we need from the other
server...
#### PuppetDB lookups

A common pattern in Puppet is to extract information from host A and
use it on host B. The above "exported resources" pattern can do this
for files, commands and many more resources, but sometimes we just
want a tiny bit of information to embed in a configuration file. This
could, in theory, be done with an exported [concat](https://forge.puppet.com/puppetlabs/concat) resource, but
this can become prohibitively complicated for something as simple as
an allowed IP address in a configuration file.
For this we use the [puppetdbquery module](https://github.com/dalen/puppet-puppetdbquery), which allows us to do
elegant queries against PuppetDB. For example, this will extract the
IP addresses of all nodes with the `roles::gitlab` class applied:
$allow_ipv4 = query_nodes('Class[roles::gitlab]', 'networking.ip')
$allow_ipv6 = query_nodes('Class[roles::gitlab]', 'networking.ip6')
This code, in `profile::kgb_bot`, propagates those variables into a
template through the `$allow_addresses` variable, which gets expanded
like this:
<% if $allow_addresses { -%>
<% $allow_addresses.each |String $address| { -%>
allow <%= $address %>;
<% } -%>
deny all;
<% } -%>
Note that there is a potential security issue with that approach. The
same way that exported resources trust the exporter, we trust that the
node exported the right fact. So it's in theory possible that a
compromised Puppet node exports an evil IP address in the above
example, granting access to an attacker instead of the proper node. If
that is a concern, consider using LDAP or Hiera lookups instead.
Also note that this will eventually fail when the node goes down:
after a while, resources are expired from the PuppetDB server and the
above query will return an empty list. This seems reasonable: we do
want to eventually revoke access to nodes that go away, but it's still
something to keep in mind.
Keep in mind that the `networking.ip` fact, in the above example,
might be incorrect in the case of a host that's behind NAT. In that
case, you should use LDAP or Hiera lookups.
Note that this could also be implemented with a `concat` exported
resource, but it would be much harder: you would need a special case
for when no resource is exported (to avoid adding the `deny`) and to
take into account that other configurations might also be needed in
the file. It would have the same security and expiry issues anyway.
#### Puppet Query Language (PQL)

Note that there's also a way to do those queries without a Forge
module, through the [Puppet query language](https://puppet.com/docs/puppetdb/5.2/api/query/tutorial-pql.html) and the
`puppetdb_query` function. The problem with that approach is that the
function is not very well documented and the query syntax is somewhat
obtuse. For example, this is what I came up with to do the equivalent
of the `query_nodes` call, above:
$allow_ipv4 = puppetdb_query(
['from', 'facts',
['and',
['=', 'name', 'networking.ip'],
['in', 'certname',
['extract', 'certname',
['select_resources',
['and',
['=', 'type', 'Class'],
['=', 'title', 'roles::gitlab']]]]]]])
It seems like I did something wrong, because that returned an empty
array. I could not figure out how to debug this, and apparently I
needed more functions (like `map` and `filter`) to get what I wanted
(see [this gist](https://gist.github.com/bastelfreak/b9620fa1892ebcc659c442b115db34f9)). I gave up at that point: the `puppetdbquery`
abstraction is much cleaner and more usable.
If you are merely looking for a hostname, however, PQL might be a
little more manageable. For example, this is how the
`roles::onionoo_frontend` class finds its backends to setup the
[IPsec](ipsec) network:
$query = 'nodes[certname] { resources { type = "Class" and title = "Roles::Onionoo_backend" } }'
$peer_names = sort(puppetdb_query($query).map |$value| { $value["certname"] })
$peer_names.each |$peer_name| {
    $network_tag = [$::fqdn, $peer_name].sort().join('::')
    ipsec::network { "ipsec::${network_tag}":
        peer_networks => $base::public_addresses
    }
}
Note that Voxpupuli has a helpful [list of Puppet Query Language
examples](https://voxpupuli.org/docs/pql_queries/) as well. Those are based on the [puppet query](https://www.puppet.com/docs/puppetdb/8/pdb_client_tools.html) command
line tool, but it gives good examples of possible queries that can be
used in manifests as well.
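For example, assuming the `puppet query` client tool is installed and
configured to reach our PuppetDB, the GitLab class query from earlier
could be run from a shell like this:

puppet query 'nodes[certname] { resources { type = "Class" and title = "Roles::Gitlab" } }'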
#### LDAP lookups
Our Puppet server is hooked up to the LDAP server and has information
about the hosts defined there. Information about the node running the
manifest is available in the global `$nodeinfo` variable, but there is
also an `$allnodeinfo` parameter with information about every host
known in LDAP.
A simple example of how to use the `$nodeinfo` variable is how the
`base::public_address` and `base::public_address6` parameters -- which
represent the IPv4 and IPv6 public address of a node -- are
initialized in the `base` class:
class base(
    Stdlib::IP::Address $public_address = filter_ipv4(getfromhash($nodeinfo, 'ldap', 'ipHostNumber'))[0],
    Optional[Stdlib::IP::Address] $public_address6 = filter_ipv6(getfromhash($nodeinfo, 'ldap', 'ipHostNumber'))[0],
) {
    $public_addresses = [ $public_address, $public_address6 ].filter |$addr| { $addr != undef }
}
This loads the `ipHostNumber` field from the `$nodeinfo` variable, and
uses the `filter_ipv4` or `filter_ipv6` functions to extract the IPv4
or IPv6 addresses respectively.
A good example of the `$allnodeinfo` parameter is how the
`roles::onionoo_frontend` class finds the IP addresses of its
backends. After having loaded the host list from PuppetDB, it then uses
the parameter to extract the IP address:
$backends = $peer_names.map |$name| {
    [
        $name,
        $allnodeinfo[$name]['ipHostNumber'].filter |$a| { $a =~ Stdlib::IP::Address::V4 }[0],
    ]
}.convert_to(Hash)
Such a lookup is considered more secure than going through PuppetDB as
LDAP is a trusted data source. It is also our source of truth for this
data, at the time of writing.
#### Hiera lookups
For more security-sensitive data, we should use a trusted data source
to extract information about hosts. We do this through Hiera lookups,
with the [lookup](https://puppet.com/docs/puppet/latest/function.html#lookup) function. A good example is how we populate the
SSH public keys on all hosts, for the admin user. In the
`profile::ssh` class, we do the following:
$keys = lookup('profile::admins::keys', Data, 'hash')
This will look up the `profile::admins::keys` field in Hiera, which is
a trusted source because it is under the control of the Puppet git
repo. This
refers to the following data structure in `hiera/common.yaml`:
profile::admins::keys:
  anarcat:
    type: "ssh-rsa"
    pubkey: "AAAAB3[...]"
The key point with Hiera is that it's a "hierarchical" data structure,
so each host can have its own override. So in theory, the above keys
could be overridden per host. Similarly, the IP address information for
each host could be stored in Hiera instead of LDAP. But in practice,
we do not currently do this and the per-host information is limited.
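For example, a per-host override would live in that node's Hiera
file; a hypothetical override replacing anarcat's key on a single
machine could look like:

# hiera/nodes/example.torproject.org.yaml (hypothetical host)
profile::admins::keys:
  anarcat:
    type: "ssh-ed25519"
    pubkey: "AAAAC3[...]"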
### Revoking and generating a new certificate for a host
Problems with the revocation procedures were discussed in [33587][] and [33446][].
[33587]: https://bugs.torproject.org/33587
[33446]: https://gitlab.torproject.org/legacy/trac/-/issues/33446#note_2349434
1. Clean the certificate on the master
puppet cert clean host.torproject.org
2. Clean the certificate on the client:
find /var/lib/puppet/ssl -name host.torproject.org.pem -delete
3. Then run the bootstrap script on the client from
`fabric-tasks/installer/puppet-bootstrap-client` and get a new checksum
4. Run `tpa-puppet-sign-client` on the master and pass the checksum
5. Run `puppet agent -t` to get Puppet running on the client again.
### Generating a batch of resources from Hiera
Say you have a class (let's call it `sbuild::qemu`) and you want it to
generate some resources from a class parameter (and, by extension,
Hiera). Let's call that parameter `sbuild::qemu::images`. How do we
do this?
The simplest way is to just use the `.each` construct and iterate over
each parameter from the class:
```
# configure a qemu sbuilder
class sbuild::qemu (
  Hash[String, Hash] $images = { 'unstable' => {}, },
) {
  include sbuild

  package { 'sbuild-qemu':
    ensure => 'installed',
  }

  $images.each |$image, $values| {
    sbuild::qemu::image { $image: * => $values }
  }
}
```
That will create, by default, an `unstable` image with the default
parameters defined in `sbuild::qemu::image`. Some parameters could be
set by default there as well, for example:
```
$images.each |$image, $values| {
  $_values = $values + {
    override => "foo",
  }
  sbuild::qemu::image { $image: * => $_values }
}
```
Going beyond that allows for pretty complicated rules including
validation and so on, for example if the data comes from an untrusted
YAML file. See this [immerda snippet](https://code.immerda.ch/immerda/puppet-modules/webhosting/-/blob/6514c36043679d0ddbb49e2cdd237d921146feeb/manifests/common.pp#L428-463) for an example.
### Quickly restore a file from the filebucket
When Puppet changes or deletes a file, a backup is automatically done locally.
```
Info: Computing checksum on file /etc/subuid
Info: /Stage[main]/Profile::User_namespaces/File[/etc/subuid]: Filebucketed /etc/subuid to puppet with sum 3e8e6d9a252f21f9f5008ebff266c6ed
Notice: /Stage[main]/Profile::User_namespaces/File[/etc/subuid]/ensure: removed
```
To restore the file to its original location, note the hash sum and run this on the system:
puppet filebucket --local restore /etc/subuid 3e8e6d9a252f21f9f5008ebff266c6ed
A different path may be specified to restore it to another location.
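For example, to restore the same content to a scratch location
(hypothetical path) for inspection instead of overwriting the
original:

puppet filebucket --local restore /tmp/subuid.restored 3e8e6d9a252f21f9f5008ebff266c6ed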
## Deployments
### Listing all hosts under puppet
This will list all active hosts known to the Puppet master:
ssh -t puppetdb-01.torproject.org 'sudo -u postgres psql puppetdb -P pager=off -A -t -c "SELECT c.certname FROM certnames c WHERE c.deactivated IS NULL"'
The following will list all hosts under Puppet and their `virtual`
value:
ssh -t puppetdb-01.torproject.org "sudo -u postgres psql puppetdb -P pager=off -F',' -A -t -c \"SELECT c.certname, value_string FROM factsets fs INNER JOIN facts f ON f.factset_id = fs.id INNER JOIN fact_values fv ON fv.id = f.fact_value_id INNER JOIN fact_paths fp ON fp.id = f.fact_path_id INNER JOIN certnames c ON c.certname = fs.certname WHERE fp.name = 'virtual' AND c.deactivated IS NULL\"" | tee hosts.csv
The resulting file is a Comma-Separated Value (CSV) file which can be
used for other purposes later.
Possible values of the `virtual` field can be obtained with a similar
query:
ssh -t puppetdb-01.torproject.org "sudo -u postgres psql puppetdb -P pager=off -A -t -c \"SELECT DISTINCT value_string FROM factsets fs INNER JOIN facts f ON f.factset_id = fs.id INNER JOIN fact_values fv ON fv.id = f.fact_value_id INNER JOIN fact_paths fp ON fp.id = f.fact_path_id WHERE fp.name = 'virtual';\""
The currently known values are: `kvm`, `physical`, and `xenu`.
### Other ways of extracting a host list
* Using the [PuppetDB API][]:
curl -s -G http://localhost:8080/pdb/query/v4/facts | jq -r ".[].certname"
The [fact API][] is quite extensive and allows for very complex
queries. For example, this shows all hosts with the `apache2` fact
set to `true`:
curl -s -G http://localhost:8080/pdb/query/v4/facts --data-urlencode 'query=["and", ["=", "name", "apache2"], ["=", "value", true]]' | jq -r ".[].certname"
This will list all hosts sorted by their report date, older first,
followed by the timestamp, space-separated:
curl -s -G http://localhost:8080/pdb/query/v4/nodes | jq -r 'sort_by(.report_timestamp) | .[] | "\(.certname) \(.report_timestamp)"' | column -s\ -t
This will list all hosts with the `roles::static_mirror` class:
curl -s -G http://localhost:8080/pdb/query/v4 --data-urlencode 'query=inventory[certname] { resources { type = "Class" and title = "Roles::Static_mirror" }} ' | jq -r ".[].certname"
This will list all hosts running Debian bookworm:

curl -s -G http://localhost:8080/pdb/query/v4 --data-urlencode 'query=inventory[certname] { facts.os.distro.codename = "bookworm" }' | jq -r ".[].certname"
This will show the number of hosts per Debian release:
curl -s -G http://localhost:8080/pdb/query/v4/fact-contents --data-urlencode 'query=["extract", [["function","count"],"value"], ["=","path",["os","distro","codename"]], ["group_by", "value"]]' | jq
* Using [howto/cumin](howto/cumin)
* Using LDAP:
ldapsearch -H ldap://db.torproject.org -x -ZZ -b "ou=hosts,dc=torproject,dc=org" '*' hostname | sed -n '/hostname/{s/hostname: //;p}' | sort
Same, but only hosts not in a Ganeti cluster:
ldapsearch -H ldap://db.torproject.org -x -ZZ -b "ou=hosts,dc=torproject,dc=org" '(!(physicalHost=gnt-*))' hostname | sed -n '/hostname/{s/hostname: //;p}' | sort
[PuppetDB API]: https://puppet.com/docs/puppetdb/4.3/api/index.html
[fact API]: https://puppet.com/docs/puppetdb/4.3/api/query/v4/facts.html
### Running Puppet everywhere
There are many ways to [run a command on all hosts (see next
section)][], but the TL;DR is to basically use [cumin](howto/cumin)
and run this command:
[run a command on all hosts (see next section)]: #batch-jobs-on-all-hosts
cumin -o txt -b 5 '*' 'puppet agent -t'
But before doing this, consider doing a [progressive
deployment](#progressive-deployment) instead.
### Batch jobs on all hosts
Using the `hosts.csv` file generated above, a job can be run on all
hosts with [parallel-ssh][]; for example, to check the `uptime`:

cut -d, -f1 hosts.csv | parallel-ssh -i -h /dev/stdin uptime
This would do the same, but only on physical servers:
grep 'physical$' hosts.csv | cut -d, -f1 | parallel-ssh -i -h /dev/stdin uptime
This would fetch the `/etc/motd` on all machines:
cut -d, -f1 hosts.csv | parallel-slurp -h /dev/stdin -L motd /etc/motd motd
To run batch commands through `sudo` that require a password, you will need to
fool both `sudo` and ssh a little more:
cut -d, -f1 hosts.csv | parallel-ssh -P -I -i -x -tt -h /dev/stdin -o pvs sudo pvs
You should then type your password then Control-d. Warning: this will
show your password on your terminal and probably in the logs as well.
Batch jobs can also be run on all Puppet hosts with Cumin:
ssh -N -L8080:localhost:8080 puppetdb-01.torproject.org &
cumin '*' uptime
See [howto/cumin](howto/cumin) for more examples.
[parallel-ssh]: https://parallel-ssh.org/
Another option for batch jobs is [tmux-xpanes](https://github.com/greymd/tmux-xpanes).
### Progressive deployment
If you are making a major change to the infrastructure, you may want
to deploy it progressively. A good way to do so is to include the new
class manually in an existing role, say in
`modules/roles/manifests/foo.pp`:

class roles::foo {
    include my_new_class
}
Then you can check the effect of the class on the host with the
`--noop` mode. Make sure you disable Puppet so that automatic runs do
not actually execute the code, with:
puppet agent --disable "testing my_new_class deployment"
Then the new manifest can be simulated with this command:
puppet agent --enable ; puppet agent -t --noop ; puppet agent --disable "testing my_new_class deployment"
Examine the output and, once you are satisfied, you can re-enable the
agent and actually run the manifest with:
puppet agent --enable ; puppet agent -t
If the change is *inside* an existing class, that change can be
enclosed in a class parameter and that parameter can be passed as an
argument from Hiera. This is how the transition to a managed
`/etc/apt/sources.list` file was done:
1. first, a parameter was added to the class that would remove the
file, defaulting to `false`:
class torproject_org(
  Boolean $manage_sources_list = false,
) {
  if $manage_sources_list {
    # the above repositories overlap with most default sources.list
    file { '/etc/apt/sources.list':
      ensure => absent,
    }
  }
}
2. then that parameter was enabled on one host, say in
`hiera/nodes/brulloi.torproject.org.yaml`:
torproject_org::manage_sources_list: true
3. Puppet was run on that host using the simulation mode:
puppet agent --enable ; puppet agent -t --noop ; puppet agent --disable "testing my_new_class deployment"
4. when satisfied, the real operation was done:
puppet agent --enable ; puppet agent -t
5. then this was added to two other hosts, and Puppet was run there
6. finally, all hosts were checked to see if the file was still
present and had any content, with [howto/cumin](howto/cumin) (see
above for alternative ways of running a command on all hosts):
cumin '*' 'du /etc/apt/sources.list'
7. since it was missing everywhere, the parameter was set to `true`
by default and the custom configuration removed from the three
test nodes
8. then Puppet was run by hand everywhere, using Cumin, with a batch
of 5 hosts at a time:

cumin -o txt -b 5 '*' 'puppet agent -t'

Because Puppet returns a non-zero value when changes are made, the
above will stop whenever any one host in a batch of 5 actually makes
a change. You can then examine the output and see if the change is
legitimate, or abort the configuration change.
Once the Puppet agent is disabled on all nodes, it's possible to enable
it and run the agent only on nodes that still have the agent disabled.
This way it's possible to "resume" a deployment when a problem or
change causes the `cumin` run to abort.
cumin -b 5 '*' 'if test -f /var/lib/puppet/state/agent_disabled.lock; then puppet agent --enable ; puppet agent -t ; fi'
Because the output `cumin` produces groups together nodes that return
identical output, and because `puppet agent -t` outputs unique
strings like the catalog serial number and the runtime in fractions
of a second, we have made a wrapper called `patc` that silences
those, allowing cumin to group those commands together.
### Adding/removing a global admin
To add a new sysadmin, you need to add their SSH key to the root
account everywhere. This can be done in the `profile::admins::keys`
field in `hiera/common.yaml`.
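For example, following the structure shown in the Hiera lookups
section above, an entry for a hypothetical new admin (`newadmin`)
would look like:

profile::admins::keys:
  newadmin:
    type: "ssh-ed25519"
    pubkey: "AAAAC3[...]"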
You also need to add them to the `adm` group in LDAP, see [adding
users to a group in LDAP](howto/ldap#adding-removing-users-in-a-group).
### Consult the logs of past local Puppet agent runs
The command `journalctl` can be used to consult Puppet agent logs on
the local machine.
To limit logs to the last day only:
journalctl -t puppet-agent --since=-1d
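To follow a Puppet run live as it happens on the host:

journalctl -t puppet-agent -f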
### Running Puppet by hand and logging
When a Puppet manifest is not behaving as it should, the first step is
to run it by hand on the host:
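puppet agent -t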
If that doesn't yield enough information, you can see pretty much
everything that Puppet does with the `--debug` flag. This will, for
example, include `Exec` resources `onlyif` commands and allow you to
see why they do not work correctly (a common problem):
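puppet agent -t --debug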