affects a lot of machines.
|
|
|
|
|
# How-to
|
|
|
|
|
|
|
|
|
## Programming workflow
|
|
|
|
|
|
### Modifying an existing configuration
|
|
|
|
|
|
For new deployments, this is *NOT* the preferred method. For example,
|
|
|
if you are deploying new software that is not already in use in our
|
code to modify. If you are making changes that potentially affect more
|
|
than one host, you should also definitely look at the `Progressive
|
|
|
deployment` section below.
|
|
|
|
|
|
|
|
|
### Adding a new module
|
|
|
|
|
|
This is a broad topic, but let's take the Prometheus monitoring system
|
|
|
as an example which followed the [role/profile/module][]
|
If you need to deploy the code to multiple hosts, see the `Progressive
|
|
deployment` section below. To contribute changes back upstream (and
|
|
|
you should do so), see the section right below.
|
|
|
|
|
|
|
|
|
### Contributing changes back upstream
|
|
|
|
|
|
For simple changes, the above workflow works well, but eventually it
|
|
|
is preferable to actually fork the upstream repository and operate on our
|
modules:
|
|
This will *also* update dependencies so make sure you audit those
|
|
|
changes before committing and pushing.
|
|
|
|
|
|
|
|
|
### Running tests
|
|
|
|
|
|
Ideally, Puppet modules have a test suite. This is done with
|
|
|
[rspec-puppet](https://rspec-puppet.com/) and [rspec-puppet-facts](https://github.com/mcanevet/rspec-puppet-facts). This is not very well
|
Finally, to run the tests, you need to wrap your invocation with
|
|
|
|
|
bundle exec rake test
|
|
|
|
|
|
|
|
|
### Validating Puppet code
|
|
|
|
|
|
You SHOULD run validation checks on commit locally before pushing your
|
|
|
manifests. To install those hooks, you should clone this repository:
|
A server-side validation hook hasn't been enabled yet because our
|
|
manifests would sometimes fail and the hook was found to be somewhat
|
|
|
slow. That is being worked on in [issue 31226][].
|
|
|
|
|
|
|
|
|
## Puppet tricks
|
|
|
|
|
|
|
|
|
### Password management
|
|
|
|
|
|
|
|
|
If you need to set a password in a manifest, there are special
functions to handle this. We do not want to store passwords directly
in Puppet source code, for various reasons: passwords are hard to
erase because the code is stored in git, and, ultimately, we want to
publish that source code publicly.
|
|
|
|
|
|
|
|
|
We use Trocla for this purpose, which generates
|
|
|
random passwords and stores the hash or, if necessary, the clear-text
|
|
|
in a YAML file.
|
|
|
|
|
|
|
|
|
With Trocla, each password is generated on the fly from a secure
entropy source ([Ruby's SecureRandom module][]) and stored inside a
state file (`/var/lib/trocla/trocla_data.yml`, configured in
`/etc/puppet/troclarc.yaml`) on the Puppet master.
|
|
|
|
|
|
|
|
|
Trocla can return "hashed" versions of the passwords, so that the
|
|
|
plain text password is never visible from the client. The plain text
|
|
|
can still be stored on the Puppet master, or it can be deleted once
|
|
|
it's been transmitted to the user or another password manager. This
|
|
|
makes it possible to have Trocla not keep any secret at all.
|
|
|
|
|
|
|
|
|
[Ruby's SecureRandom module]: https://ruby-doc.org/stdlib-1.9.3/libdoc/securerandom/rdoc/SecureRandom.html
|
|
|
[Trocla]: https://github.com/duritong/trocla
|
|
|
|
|
|
|
|
|
This piece of code will generate a [bcrypt][]-hashed password for the
|
|
|
Grafana admin, for example:
|
|
|
|
|
|
|
|
|
$grafana_admin_password = trocla('grafana_admin_password', 'bcrypt')
|
|
|
|
|
|
|
|
|
The plain-text for that password will never leave the Puppet master. It
will still be stored on the Puppet master, and you can see the value
with:
|
|
|
|
|
|
|
|
|
trocla get grafana_admin_password plain
|
|
|
|
|
|
|
|
|
... on the command-line.
|
|
|
|
|
|
|
|
|
[bcrypt]: https://en.wikipedia.org/wiki/Bcrypt
|
|
|
|
|
|
|
|
|
A password can also be set with this command:
|
|
|
|
|
|
|
|
|
trocla set grafana_guest_password plain
|
|
|
|
|
|
|
|
|
Note that this might *erase* other formats for this password, although
|
|
|
those will get regenerated as needed.
|
|
|
|
|
|
|
|
|
Also note that `trocla get` will fail if the particular password or
format requested does not exist. For example, say you generate a
plain-text password and then fetch the `bcrypt` version:
|
|
|
|
|
|
|
|
|
trocla create test plain
|
|
|
trocla get test bcrypt
|
|
|
|
|
|
|
|
|
This will return the empty string instead of the hashed
version. Instead, use `trocla create` to generate that password. In
general, it's safe to use `trocla create`, as it will reuse existing
passwords. That is also how the `trocla()` function behaves in Puppet.
|
|
|
|
|
|
|
|
|
TODO: Trocla can provide passwords to classes transparently, without
|
|
|
having to do function calls inside Puppet manifests. For example, this
|
|
|
code:
|
|
|
|
|
|
|
|
|
class profile::grafana {
|
|
|
$password = trocla('profile::grafana::password', 'plain')
|
|
|
# ...
|
|
|
}
|
|
|
|
|
|
|
|
|
Could simply be expressed as:
|
|
|
|
|
|
|
|
|
class profile::grafana(String $password) {
|
|
|
# ...
|
|
|
}
|
|
|
|
|
|
|
|
|
But this requires a few changes:
|
|
|
|
|
|
1. Trocla needs to be included in Hiera
|
|
|
2. We need roles to be more clearly defined in Hiera, and use Hiera
|
|
|
as an ENC so that we can do per-roles passwords (for example),
|
|
|
which is not currently possible.
|
|
|
|
|
|
|
|
|
### Getting information from other nodes
|
|
|
|
|
|
|
|
|
A common pattern in Puppet is to deploy resources on a given host with
|
|
|
information from another host. For example, you might want to grant
|
|
|
access to host A from host B. And while you can hardcode host B's IP
|
|
|
address in host A's manifest, it's not good practice: if host B's IP
|
|
|
address changes, you need to change the manifest, and that practice
|
|
|
makes it difficult to introduce host C into the pool...
|
|
|
|
|
|
|
|
|
So we need ways of having a node use information from other nodes in
|
|
|
our Puppet manifests. There are 5 methods in our Puppet source code at
|
|
|
the time of writing:
|
|
|
|
|
|
|
|
|
* Exported resources
|
|
|
* PuppetDB lookups
|
|
|
* Puppet Query Language (PQL)
|
|
|
* LDAP lookups
|
|
|
* Hiera lookups
|
|
|
|
|
|
|
|
|
This section walks through how each method works, outlining the
advantages and disadvantages of each.
|
|
|
|
|
|
|
|
|
#### Exported resources
|
|
|
|
|
|
|
|
|
Our Puppet configuration supports [exported resources](https://puppet.com/docs/puppet/latest/lang_exported.html), a key
|
|
|
component of complex Puppet deployments. Exported resources allow one
|
|
|
host to define a configuration that will be *exported* to the Puppet
|
|
|
server and then *realized* on another host.
|
|
|
|
|
|
|
|
|
We commonly use this to punch holes in the firewall between nodes. For
|
|
|
example, this manifest in the `roles::puppetmaster` class:
|
|
|
|
|
|
|
|
|
@@ferm::rule::simple { "roles::puppetmaster-${::fqdn}":
|
|
|
tag => 'roles::puppetmaster',
|
|
|
description => 'Allow Puppetmaster access to LDAP',
|
|
|
port => ['ldap', 'ldaps'],
|
|
|
saddr => $base::public_addresses,
|
|
|
}
|
|
|
|
|
|
|
|
|
... exports a firewall rule that will, later, allow the Puppet server
|
|
|
to access the LDAP server (hence the `port => ['ldap', 'ldaps']`
|
|
|
line). This rule doesn't take effect on the host applying the
|
|
|
`roles::puppetmaster` class, but only on the LDAP server, through this
|
|
|
rather exotic syntax:
|
|
|
|
|
|
|
|
|
Ferm::Rule::Simple <<| tag == 'roles::puppetmaster' |>>
|
|
|
|
|
|
|
|
|
This tells the LDAP server to apply whatever rule was exported with
|
|
|
the `@@` syntax and the specified `tag`. Any Puppet resource can be
|
|
|
exported and realized that way.
|
|
|
|
|
|
|
|
|
Note that there are security implications with collecting exported
resources: it delegates the resource specification of one node to
another. So, in the above scenario, the Puppet master could decide to
open *other* ports on the LDAP server (say, the SSH port), because it
exports the port number and the LDAP server just blindly applies the
directive. A more secure specification would explicitly specify the
sensitive information, like so:
|
|
|
|
|
|
|
|
|
Ferm::Rule::Simple <<| tag == 'roles::puppetmaster' |>> {
|
|
|
port => ['ldap'],
|
|
|
}
|
|
|
|
|
|
|
|
|
But then a compromised server could send a different `saddr` and
|
|
|
there's nothing the LDAP server could do here: it cannot override the
|
|
|
address because it's exactly the information we need from the other
|
|
|
server...
|
|
|
|
|
|
|
|
|
#### PuppetDB lookups
|
|
|
|
|
|
|
|
|
A common pattern in Puppet is to extract information from host A and
|
|
|
use it on host B. The above "exported resources" pattern can do this
|
|
|
for files, commands and many more resources, but sometimes we just
|
|
|
want a tiny bit of information to embed in a configuration file. This
|
|
|
could, in theory, be done with an exported [concat](https://forge.puppet.com/puppetlabs/concat) resource, but
|
|
|
this can become prohibitively complicated for something as simple as
|
|
|
an allowed IP address in a configuration file.
|
|
|
|
|
|
|
|
|
For this we use the [puppetdbquery module](https://github.com/dalen/puppet-puppetdbquery), which allows us to do
|
|
|
elegant queries against PuppetDB. For example, this will extract the
|
|
|
IP addresses of all nodes with the `roles::gitlab` class applied:
|
|
|
|
|
|
|
|
|
$allow_ipv4 = query_nodes('Class[roles::gitlab]', 'networking.ip')
|
|
|
$allow_ipv6 = query_nodes('Class[roles::gitlab]', 'networking.ip6')
|
|
|
|
|
|
|
|
|
This code, in `profile::kgb_bot`, propagates those variables into a
template through the `allow_addresses` variable, which gets expanded
like this:
|
|
|
|
|
|
|
|
|
<% if $allow_addresses { -%>
|
|
|
<% $allow_addresses.each |String $address| { -%>
|
|
|
allow <%= $address %>;
|
|
|
<% } -%>
|
|
|
deny all;
|
|
|
<% } -%>
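For context, here is a minimal sketch of how such a profile could feed
those addresses into the template. This is an illustration only, not
the actual `profile::kgb_bot` code: the file path, template path and
resource names are assumptions.

```puppet
# Hypothetical sketch: collect the addresses with puppetdbquery and
# pass them to the EPP template shown above. Names and paths made up.
class profile::kgb_bot {
  $allow_addresses = query_nodes('Class[roles::gitlab]', 'networking.ip')
                   + query_nodes('Class[roles::gitlab]', 'networking.ip6')

  file { '/etc/nginx/sites-available/kgb-bot.conf':
    content => epp('profile/kgb_bot/nginx.conf.epp', {
      'allow_addresses' => $allow_addresses,
    }),
  }
}
```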
|
|
|
|
|
|
|
|
|
Note that there is a potential security issue with that approach. The
|
|
|
same way that exported resources trust the exporter, we trust that the
|
|
|
node exported the right fact. So it's in theory possible that a
|
|
|
compromised Puppet node exports an evil IP address in the above
|
|
|
example, granting access to an attacker instead of the proper node. If
|
|
|
that is a concern, consider using LDAP or Hiera lookups instead.
|
|
|
|
|
|
|
|
|
Also note that this will eventually fail when the node goes down:
|
|
|
after a while, resources are expired from the PuppetDB server and the
|
|
|
above query will return an empty list. This seems reasonable: we do
|
|
|
want to eventually revoke access to nodes that go away, but it's still
|
|
|
something to keep in mind.
|
|
|
|
|
|
|
|
|
Keep in mind that the `networking.ip` fact, in the above example,
|
|
|
might be incorrect in the case of a host that's behind NAT. In that
|
|
|
case, you should use LDAP or Hiera lookups.
|
|
|
|
|
|
|
|
|
Note that this could also be implemented with a `concat` exported
resource, but it would be much harder: you would need a special case
for when no resource is exported (to avoid adding the `deny`) and you
would have to take into account that other configurations might also
be needed in the file. It would have the same security and expiry
issues anyway.
|
|
|
|
|
|
|
|
|
#### Puppet query language
|
|
|
|
|
|
|
|
|
Note that there's also a way to do those queries without a Forge
|
|
|
module, through the [Puppet query language](https://puppet.com/docs/puppetdb/5.2/api/query/tutorial-pql.html) and the
|
|
|
`puppetdb_query` function. The problem with that approach is that the
|
|
|
function is not very well documented and the query syntax is somewhat
|
|
|
obtuse. For example, this is what I came up with to do the equivalent
|
|
|
of the `query_nodes` call, above:
|
|
|
|
|
|
|
|
|
$allow_ipv4 = puppetdb_query(
|
|
|
['from', 'facts',
|
|
|
['and',
|
|
|
['=', 'name', 'networking.ip'],
|
|
|
['in', 'certname',
|
|
|
['extract', 'certname',
|
|
|
['select_resources',
|
|
|
['and',
|
|
|
['=', 'type', 'Class'],
|
|
|
['=', 'title', 'roles::gitlab']]]]]]])
|
|
|
|
|
|
|
|
|
It seems like I did something wrong, because that returned an empty
|
|
|
array. I could not figure out how to debug this, and apparently I
|
|
|
needed more functions (like `map` and `filter`) to get what I wanted
|
|
|
(see [this gist](https://gist.github.com/bastelfreak/b9620fa1892ebcc659c442b115db34f9)). I gave up at that point: the `puppetdbquery`
|
|
|
abstraction is much cleaner and usable.
|
|
|
|
|
|
|
|
|
If you are merely looking for a hostname, however, PQL might be a
|
|
|
little more manageable. For example, this is how the
|
|
|
`roles::onionoo_frontend` class finds its backends to setup the
|
|
|
[IPsec](ipsec) network:
|
|
|
|
|
|
$query = 'nodes[certname] { resources { type = "Class" and title = "Roles::Onionoo_backend" } }'
|
|
|
$peer_names = sort(puppetdb_query($query).map |$value| { $value["certname"] })
|
|
|
$peer_names.each |$peer_name| {
|
|
|
$network_tag = [$::fqdn, $peer_name].sort().join('::')
|
|
|
ipsec::network { "ipsec::${network_tag}":
|
|
|
peer_networks => $base::public_addresses
|
|
|
}
|
|
|
}
|
|
|
|
|
|
#### LDAP lookups
|
|
|
|
|
|
Our Puppet server is hooked up to the LDAP server and has information
|
|
|
about the hosts defined there. Information about the node running the
|
|
|
manifest is available in the global `$nodeinfo` variable, but there is
|
|
|
also a `$allnodeinfo` parameter with information about every host
|
|
|
known in LDAP.
|
|
|
|
|
|
A simple example of how to use the `$nodeinfo` variable is how the
|
|
|
`base::public_address` and `base::public_address6` parameters -- which
|
|
|
represent the IPv4 and IPv6 public address of a node -- are
|
|
|
initialized in the `base` class:
|
|
|
|
|
|
class base(
|
|
|
Stdlib::IP::Address $public_address = filter_ipv4(getfromhash($nodeinfo, 'ldap', 'ipHostNumber'))[0],
|
|
|
Optional[Stdlib::IP::Address] $public_address6 = filter_ipv6(getfromhash($nodeinfo, 'ldap', 'ipHostNumber'))[0],
|
|
|
) {
|
|
|
$public_addresses = [ $public_address, $public_address6 ].filter |$addr| { $addr != undef }
|
|
|
}
|
|
|
|
|
|
This loads the `ipHostNumber` field from the `$nodeinfo` variable, and
|
|
|
uses the `filter_ipv4` or `filter_ipv6` functions to extract the IPv4
|
|
|
or IPv6 addresses respectively.
|
|
|
|
|
|
A good example of the `$allnodeinfo` parameter is how the
`roles::onionoo_frontend` class finds the IP addresses of its
backends. After having loaded the host list from PuppetDB, it then uses
the parameter to extract the IP address:
|
|
|
|
|
|
$backends = $peer_names.map |$name| {
|
|
|
[
|
|
|
$name,
|
|
|
$allnodeinfo[$name]['ipHostNumber'].filter |$a| { $a =~ Stdlib::IP::Address::V4 }[0]
|
|
|
] }.convert_to(Hash)
|
|
|
|
|
|
Such a lookup is considered more secure than going through PuppetDB as
|
|
|
LDAP is a trusted data source. It is also our source of truth for this
|
|
|
data, at the time of writing.
|
|
|
|
|
|
#### Hiera lookups
|
|
|
|
|
|
For more security-sensitive data, we should use a trusted data source
|
|
|
to extract information about hosts. We do this through Hiera lookups,
|
|
|
with the [lookup](https://puppet.com/docs/puppet/latest/function.html#lookup) function. A good example is how we populate the
|
|
|
SSH public keys on all hosts, for the admin user. In the
|
|
|
`profile::ssh` class, we do the following:
|
|
|
|
|
|
$keys = lookup('profile::admins::keys', Data, 'hash')
|
|
|
|
|
|
This will look up the `profile::admins::keys` field in Hiera, which is
a trusted source because it is under the control of the Puppet git
repository. It refers to the following data structure in
`hiera/common.yaml`:
|
|
|
|
|
|
profile::admins::keys:
|
|
|
anarcat:
|
|
|
type: "ssh-rsa"
|
|
|
pubkey: "AAAAB3[...]"
|
|
|
|
|
|
The key point with Hiera is that it's a "hierarchical" data structure,
so each host can have its own override; in theory, the above keys
could be overridden per host. Similarly, the IP address information for
each host could be stored in Hiera instead of LDAP. But in practice,
we do not currently do this and the per-host information is limited.
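As an illustration, such a per-host override could live in a
node-specific Hiera file; the file path and key material below are
hypothetical:

```yaml
# hypothetical hiera/nodes/host.torproject.org.yaml
profile::admins::keys:
  anarcat:
    type: "ssh-ed25519"
    pubkey: "AAAAC3[...]"
```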
|
|
|
|
|
|
### Revoking and generating a new certificate for a host
|
|
|
|
|
|
Problems with the revocation procedures were discussed in [33587][] and [33446][].
|
|
|
|
|
|
[33587]: https://bugs.torproject.org/33587
|
|
|
[33446]: https://gitlab.torproject.org/legacy/trac/-/issues/33446#note_2349434
|
|
|
|
|
|
1. Clean the certificate on the master
|
|
|
|
|
|
puppet cert clean host.torproject.org
|
|
|
|
|
|
2. Clean the certificate on the client:
|
|
|
|
|
|
find /var/lib/puppet/ssl -name host.torproject.org.pem -delete
|
|
|
|
|
|
3. Then run the bootstrap script on the client from
|
|
|
`tsa-misc/installer/puppet-bootstrap-client` and get a new checksum
|
|
|
|
|
|
4. Run `tpa-puppet-sign-client` on the master and pass the checksum
|
|
|
|
|
|
5. Run `puppet agent -t` to get Puppet running on the client again.
|
|
|
|
|
|
## Deployments
|
|
|
|
|
|
### Listing all hosts under puppet
|
|
|
|
|
|
This will list all active hosts known to the Puppet master:
|
|
|
|
|
|
ssh -t pauli.torproject.org 'sudo -u postgres psql puppetdb -P pager=off -A -t -c "SELECT c.certname FROM certnames c WHERE c.deactivated IS NULL"'
|
|
|
|
|
|
The following will list all hosts under Puppet and their `virtual`
|
|
|
value:
|
|
|
|
|
|
ssh -t pauli.torproject.org "sudo -u postgres psql puppetdb -P pager=off -F',' -A -t -c \"SELECT c.certname, value_string FROM factsets fs INNER JOIN facts f ON f.factset_id = fs.id INNER JOIN fact_values fv ON fv.id = f.fact_value_id INNER JOIN fact_paths fp ON fp.id = f.fact_path_id INNER JOIN certnames c ON c.certname = fs.certname WHERE fp.name = 'virtual' AND c.deactivated IS NULL\"" | tee hosts.csv
|
|
|
|
|
|
The resulting file is a Comma-Separated Value (CSV) file which can be
|
|
|
used for other purposes later.
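For example, one such later purpose might be splitting the list by
virtualisation type. A small sketch, with made-up host names inlined
in place of the real `hosts.csv` from the query above:

```shell
# Sketch: split hosts.csv (certname,virtual) into one file per
# virtualisation type, using awk's per-file output redirection.
tmp=$(mktemp -d) && cd "$tmp"
cat > hosts.csv <<'EOF'
example1.torproject.org,kvm
example2.torproject.org,physical
EOF
awk -F, '{ print $1 > ($2 ".txt") }' hosts.csv
cat physical.txt   # -> example2.torproject.org
```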
|
|
|
|
|
|
Possible values of the `virtual` field can be obtained with a similar
query:
|
|
|
|
|
|
ssh -t pauli.torproject.org "sudo -u postgres psql puppetdb -P pager=off -A -t -c \"SELECT DISTINCT value_string FROM factsets fs INNER JOIN facts f ON f.factset_id = fs.id INNER JOIN fact_values fv ON fv.id = f.fact_value_id INNER JOIN fact_paths fp ON fp.id = f.fact_path_id WHERE fp.name = 'virtual';\""
|
|
|
|
|
|
The currently known values are: `kvm`, `physical`, and `xenu`.
|
|
|
|
|
|
As a bonus, this query will show the number of hosts running each release:
|
|
|
|
|
|
SELECT COUNT(c.certname), value_string FROM factsets fs INNER JOIN facts f ON f.factset_id = fs.id INNER JOIN fact_values fv ON fv.id = f.fact_value_id INNER JOIN fact_paths fp ON fp.id = f.fact_path_id INNER JOIN certnames c ON c.certname = fs.certname WHERE fp.name = 'lsbdistcodename' AND c.deactivated IS NULL GROUP BY value_string;
|
|
|
|
|
|
### Other ways of extracting a host list
|
|
|
|
|
|
* Using the [PuppetDB API][]:
|
|
|
|
|
|
curl -s -G http://localhost:8080/pdb/query/v4/facts | jq -r ".[].certname"
|
|
|
|
|
|
The [fact API][] is quite extensive and allows for very complex
|
|
|
queries. For example, this shows all hosts with the `apache2` fact
|
|
|
set to `true`:
|
|
|
|
|
|
curl -s -G http://localhost:8080/pdb/query/v4/facts --data-urlencode 'query=["and", ["=", "name", "apache2"], ["=", "value", true]]' | jq -r ".[].certname"
|
|
|
|
|
|
This will list all hosts sorted by their report date, older first,
|
|
|
followed by the timestamp, space-separated:
|
|
|
|
|
|
curl -s -G http://localhost:8080/pdb/query/v4/nodes | jq -r 'sort_by(.report_timestamp) | .[] | "\(.certname) \(.report_timestamp)"' | column -s\ -t
|
|
|
|
|
|
This will list all hosts with the `roles::static_mirror` class:
|
|
|
|
|
|
curl -s -G http://localhost:8080/pdb/query/v4 --data-urlencode 'query=inventory[certname] { resources { type = "Class" and title = "Roles::Static_mirror" }} ' | jq -r .[].certname
|
|
|
|
|
|
This will show all hosts running Debian buster:
|
|
|
|
|
|
curl -s -G http://localhost:8080/pdb/query/v4 --data-urlencode 'query=nodes { facts { name = "lsbdistcodename" and value = "buster" }}' | jq -r .[].certname
|
|
|
|
|
|
* Using [howto/cumin](howto/cumin)
|
|
|
|
|
|
* Using LDAP:
|
|
|
|
|
|
HOSTS=$(ssh alberti.torproject.org 'ldapsearch -h db.torproject.org -x -ZZ -b dc=torproject,dc=org -LLL "hostname=*.torproject.org" hostname | awk "\$1 == \"hostname:\" {print \$2}" | sort')
|
|
|
for i in `echo $HOSTS`; do mkdir hosts/x-$i 2>/dev/null || continue; echo $i; ssh $i ' ...'; done
|
|
|
|
|
|
The `mkdir` is there so that the same command can be run in several
terminal windows at once while each host still gets visited only once.
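The claim-by-`mkdir` trick can be sketched locally, without any ssh
involved (the host list below is made up):

```shell
# Sketch: mkdir is atomic, so it acts as a lock directory; a host
# already claimed by another run of the loop is skipped by "|| continue".
tmp=$(mktemp -d) && cd "$tmp" && mkdir hosts
HOSTS="a.torproject.org b.torproject.org"
for i in $HOSTS; do
  mkdir "hosts/x-$i" 2>/dev/null || continue  # already claimed, skip
  echo "processing $i"
done
# running the same loop a second time prints nothing:
# every host has already been claimed
```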
|
|
|
|
|
|
[PuppetDB API]: https://puppet.com/docs/puppetdb/4.3/api/index.html
|
|
|
[fact API]: https://puppet.com/docs/puppetdb/4.3/api/query/v4/facts.html
|
|
|
|
|
|
### Running Puppet everywhere
|
|
|
|
|
|
There are many ways to [run a command on all hosts (see next
|
|
|
section)][], but the TL;DR: is to basically use [cumin](howto/cumin)
|
|
|
and run this command:
|
|
|
|
|
|
[run a command on all hosts (see next section)]: #batch-jobs-on-all-hosts
|
|
|
|
|
|
cumin -o txt -b 5 '*' 'puppet agent -t'
|
|
|
|
|
|
But before doing this, consider doing a [progressive
|
|
|
deployment](#progressive-deployment) instead.
|
|
|
|
|
|
### Batch jobs on all hosts
|
|
|
|
|
|
With that trick, a job can be run on all hosts with
[parallel-ssh][], for example to check the `uptime`:
|
|
|
|
|
|
cut -d, -f1 hosts.csv | parallel-ssh -i -h /dev/stdin uptime
|
|
|
|
|
|
This would do the same, but only on physical servers:
|
|
|
|
|
|
grep 'physical$' hosts.csv | cut -d, -f1 | parallel-ssh -i -h /dev/stdin uptime
|
|
|
|
|
|
This would fetch the `/etc/motd` on all machines:
|
|
|
|
|
|
cut -d, -f1 hosts.csv | parallel-slurp -h /dev/stdin -L motd /etc/motd motd
|
|
|
|
|
|
To run batch commands through `sudo` that require a password, you will need to fool both `sudo` and ssh a little more:
|
|
|
|
|
|
cut -d, -f1 hosts.csv | parallel-ssh -P -I -i -x -tt -h /dev/stdin -o pvs sudo pvs
|
|
|
|
|
|
You should then type your password, then Control-d. Warning: this will
show your password on your terminal and probably in the logs as well.
|
|
|
|
|
|
Batch jobs can also be run on all Puppet hosts with Cumin:
|
|
|
|
|
|
ssh -N -L8080:localhost:8080 pauli.torproject.org &
|
|
|
cumin '*' uptime
|
|
|
|
|
|
See [howto/cumin](howto/cumin) for more examples.
|
|
|
|
|
|
[parallel-ssh]: https://parallel-ssh.org/
|
|
|
|
|
|
### Progressive deployment
|
|
|
|
|
|
If you are making a major change to the infrastructure, you may want
|
|
|
to deploy it progressively. A good way to do so is to include the new
|
|
|
class manually in an existing role, say in
|
|
|
`modules/role/manifests/foo.pp`:
|
|
|
|
|
|
class role::foo {
|
|
|
include my_new_class
|
|
|
}
|
|
|
|
|
|
Then you can check the effect of the class on the host with the
|
|
|
`--noop` mode. Make sure you disable Puppet so that automatic runs do
|
|
|
not actually execute the code, with:
|
|
|
|
|
|
puppet agent --disable "testing my_new_class deployment"
|
|
|
|
|
|
Then the new manifest can be simulated with this command:
|
|
|
|
|
|
puppet agent --enable ; puppet agent -t --noop ; puppet agent --disable "testing my_new_class deployment"
|
|
|
|
|
|
Examine the output and, once you are satisfied, you can re-enable the
|
|
|
agent and actually run the manifest with:
|
|
|
|
|
|
puppet agent --enable ; puppet agent -t
|
|
|
|
|
|
If the change is *inside* an existing class, that change can be
|
|
|
enclosed in a class parameter and that parameter can be passed as an
|
|
|
argument from Hiera. This is how the transition to a managed
|
|
|
`/etc/apt/sources.list` file was done:
|
|
|
|
|
|
1. first, a parameter was added to the class that would remove the
|
|
|
file, defaulting to `false`:
|
and will allow cumin to group those commands together:
|
|
|
|
|
cumin -b 5 '*' 'patc'
|
|
|
|
|
|
|
|
|
### Adding/removing a global admin
|
|
|
|
|
|
To add a new sysadmin, you need to add their SSH key to the root
account everywhere. This can be done in the `profile::admins::keys`
|
field in `hiera/common.yaml`.
|
|
You also need to add them to the `adm` group in LDAP, see [adding
|
|
|
users to a group in LDAP](howto/ldap#adding-removing-users-in-a-group).
|
|
|
|
|
|
## Examining a Puppet catalog
|
|
|
|
|
|
|
|
|
It can sometimes be useful to examine a node's catalog in order to
|
|
|
determine if certain resources are present, or to view a resource's
|
|
|
full set of parameters.
|
|
|
|
|
|
|
|
|
### List resources by type
|
|
|
|
|
|
|
|
|
To list all `service` resources managed by Puppet on a node, the
|
|
|
command below may be executed on the node itself:
|
|
|
|
|
|
|
|
|
puppet catalog select --terminus rest "$(hostname -f)" service
|
|
|
|
|
|
|
|
|
At the end of the command line, `service` may be replaced by any
built-in resource type such as `file` or `cron`. Defined resource
names may also be used here, like `ssl::service`.
|
|
|
journalctl -t puppet-agent --since=-1d
|
|
|
|
|
|
### View/filter full catalog
|
|
|
### Running Puppet by hand and logging
|
|
|
|
|
|
To extract a node's full catalog in JSON format:
|
|
|
When a Puppet manifest is not behaving as it should, the first step is
|
|
|
to run it by hand on the host:
|
|
|
|
|
|
puppet catalog find --terminus rest "$(hostname -f)"
|
|
|
puppet agent -t
|
|
|
|
|
|
The output can be manipulated using `jq` to extract more precise
|
|
|
information. For example, to list all resources of a specific type:
|
|
|
If that doesn't yield enough information, you can see pretty much
|
|
|
everything that Puppet does with the `--debug` flag. This will, for
|
|
|
example, include `Exec` resources `onlyif` commands and allow you to
|
|
|
see why they do not work correctly (a common problem):
|
|
|
|
|
|
jq '.resources[] | select(.type == "File") | .title' < catalog.json
|
|
|
|
|
|
To list all classes in the catalog:
|
|
|
|
|
|
jq '.resources[] | select(.type=="Class") | .title' < catalog.json
|
|
|
|
|
|
To display a specific resource selected by title:
|
|
|
|
|
|
jq '.resources[] | select((.type == "File") and (.title=="sources.list.d"))' < catalog.json
|
|
|
|
|
|
More examples can be found on this [blog post](http://web.archive.org/web/20210122003128/https://alexharv074.github.io/puppet/2017/11/30/jq-commands-for-puppet-catalogs.html).OB
|
|
|
|
|
|
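The same filters can be expressed in any language that can read JSON. As an illustration only (this is not part of our tooling), here is a Python sketch that mimics the `jq` filters above against a saved `catalog.json`:

```python
import json

def resources_of_type(catalog, rtype):
    """Return titles of all resources of the given type, like the jq
    filter: .resources[] | select(.type == rtype) | .title"""
    return [r["title"] for r in catalog["resources"] if r["type"] == rtype]

# A tiny stand-in catalog with the same shape as the JSON that
# `puppet catalog find` produces (real catalogs have more fields).
catalog = json.loads("""
{"resources": [
  {"type": "File", "title": "sources.list.d"},
  {"type": "Class", "title": "roles::puppetmaster"},
  {"type": "File", "title": "/etc/motd"}
]}
""")

print(resources_of_type(catalog, "File"))   # the File resources
print(resources_of_type(catalog, "Class"))  # the classes
```

The same helper covers all three `jq` one-liners above: filtering by type, listing classes, and (with an extra title check) selecting one resource.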
## Troubleshooting

### Consult the logs of past local Puppet agent runs

The command `journalctl` can be used to consult Puppet agent logs on
the local machine:

    journalctl -t puppet-agent

To limit logs to the last day only:

    journalctl -t puppet-agent --since=-1d

### Running Puppet by hand and logging

When a Puppet manifest is not behaving as it should, the first step is
to run it by hand on the host:

    puppet agent -t

If that doesn't yield enough information, you can see pretty much
everything that Puppet does with the `--debug` flag. This will, for
example, include the `onlyif` commands of `Exec` resources and allow
you to see why they do not work correctly (a common problem):

    puppet agent -t --debug

Finally, some errors show up only on the Puppet server: look in
`/var/log/daemon.log` there for those.
in the manifests) of `backup-blah@backup.koumbit.net`:

TODO: update the above query to match resources actually in use at
TPO. That example is from koumbit.org folks.
## Password management

If you need to set a password in a manifest, there are special
functions to handle this. We do not want to store passwords directly
in Puppet source code, for various reasons: it is hard to erase them
because code is stored in git, but also, ultimately, we want to
publish that source code publicly.

We use [Trocla][] for this purpose, which generates random passwords
and stores the hash or, if necessary, the clear-text in a YAML file.

With Trocla, each password is generated on the fly from a secure
entropy source ([Ruby's SecureRandom module][]) and stored inside a
state file (`/var/lib/trocla/trocla_data.yml`, configured in
`/etc/puppet/troclarc.yaml`) on the Puppet master.

Trocla can return "hashed" versions of the passwords, so that the
plain-text password is never visible from the client. The plain text
can still be stored on the Puppet master, or it can be deleted once
it's been transmitted to the user or another password manager. This
makes it possible to have Trocla not keep any secret at all.

[Ruby's SecureRandom module]: https://ruby-doc.org/stdlib-1.9.3/libdoc/securerandom/rdoc/SecureRandom.html
[Trocla]: https://github.com/duritong/trocla
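For illustration only (Trocla's actual implementation differs and is in Ruby), generating a password from a secure entropy source looks conceptually like this Python sketch:

```python
import secrets
import string

def generate_password(length=16):
    """Draw each character from a CSPRNG, similar in spirit to what
    Trocla does with Ruby's SecureRandom."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))

password = generate_password()
print(len(password))  # 16
```

The point is that the password never originates from a human or a predictable source; it only ever exists in the state file and in the deployed configuration.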
This piece of code will generate a [bcrypt][]-hashed password for the
Grafana admin, for example:

    $grafana_admin_password = trocla('grafana_admin_password', 'bcrypt')

The plain text for that password will never leave the Puppet master. It
will still be stored on the Puppet master, and you can see its value
on the command-line with:

    trocla get grafana_admin_password plain

[bcrypt]: https://en.wikipedia.org/wiki/Bcrypt
A password can also be set with this command:

    trocla set grafana_guest_password plain

Note that this might *erase* other formats for this password, although
those will get regenerated as needed.
Also note that `trocla get` will fail if the particular password or
format requested does not exist. For example, say you generate a
plain-text password and then get the `bcrypt` version:

    trocla create test plain
    trocla get test bcrypt

The `get` call will return the empty string instead of the hashed
version. Instead, use `trocla create` to generate that password. In
general, it's safe to use `trocla create` as it will reuse an
existing password. That is actually how the `trocla()` function
behaves in Puppet as well.
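To make the `get` versus `create` distinction concrete, here is a hypothetical Python model of the behavior described above; it is not Trocla's real code, and SHA-256 stands in for the real hashing formats like bcrypt:

```python
import hashlib

store = {}  # password key -> {format: value}, like trocla_data.yml

def trocla_get(key, fmt):
    """Return the stored value for that format, or the empty string:
    get never derives a missing format."""
    return store.get(key, {}).get(fmt, "")

def trocla_create(key, fmt):
    """Reuse an existing value if present, otherwise derive or
    generate it; this mirrors how the trocla() function behaves."""
    entry = store.setdefault(key, {})
    if fmt in entry:
        return entry[fmt]
    if fmt == "plain":
        entry[fmt] = "s3cr3t"  # stand-in for a random password
    else:
        # derive the hashed format from the plain text
        plain = trocla_create(key, "plain")
        entry[fmt] = hashlib.sha256(plain.encode()).hexdigest()
    return entry[fmt]

trocla_create("test", "plain")
print(trocla_get("test", "bcrypt"))     # empty: get does not derive
print(trocla_create("test", "bcrypt"))  # derives and stores the hash
```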
TODO: Trocla can provide passwords to classes transparently, without
having to do function calls inside Puppet manifests. For example, this
code:

    class profile::grafana {
        $password = trocla('profile::grafana::password', 'plain')
        # ...
    }

could simply be expressed as:

    class profile::grafana(String $password) {
        # ...
    }

But this requires a few changes:

 1. Trocla needs to be included in Hiera
 2. We need roles to be more clearly defined in Hiera, and to use
    Hiera as an ENC so that we can do per-role passwords (for
    example), which is not currently possible.
## Getting information from other nodes

A common pattern in Puppet is to deploy resources on a given host with
information from another host. For example, you might want to grant
access to host A from host B. And while you can hardcode host B's IP
address in host A's manifest, it's not good practice: if host B's IP
address changes, you need to change the manifest, and that practice
makes it difficult to introduce host C into the pool...

So we need ways of having a node use information from other nodes in
our Puppet manifests. There are five methods in our Puppet source code
at the time of writing:

 * Exported resources
 * PuppetDB lookups
 * Puppet Query Language (PQL)
 * LDAP lookups
 * Hiera lookups

This section walks through how each method works, outlining the
advantages and disadvantages of each.
### Exported resources

Our Puppet configuration supports [exported resources](https://puppet.com/docs/puppet/latest/lang_exported.html), a key
component of complex Puppet deployments. Exported resources allow one
host to define a configuration that will be *exported* to the Puppet
server and then *realized* on another host.

We commonly use this to punch holes in the firewall between nodes. For
example, this manifest in the `roles::puppetmaster` class:

    @@ferm::rule::simple { "roles::puppetmaster-${::fqdn}":
        tag         => 'roles::puppetmaster',
        description => 'Allow Puppetmaster access to LDAP',
        port        => ['ldap', 'ldaps'],
        saddr       => $base::public_addresses,
    }

... exports a firewall rule that will, later, allow the Puppet server
to access the LDAP server (hence the `port => ['ldap', 'ldaps']`
line). This rule doesn't take effect on the host applying the
`roles::puppetmaster` class, but only on the LDAP server, through this
rather exotic syntax:

    Ferm::Rule::Simple <<| tag == 'roles::puppetmaster' |>>
This tells the LDAP server to apply whatever rule was exported with
the `@@` syntax and the specified `tag`. Any Puppet resource can be
exported and realized that way.

Note that there are security implications with collecting exported
resources: it delegates the resource specification of one node to
another. So, in the above scenario, the Puppet master could decide to
open *other* ports on the LDAP server (say, the SSH port), because it
exports the port number and the LDAP server just blindly applies the
directive. A more secure specification would explicitly specify the
sensitive information, like so:

    Ferm::Rule::Simple <<| tag == 'roles::puppetmaster' |>> {
        port => ['ldap'],
    }

But then a compromised server could send a different `saddr` and
there's nothing the LDAP server could do about it: it cannot override
the address because it's exactly the information we need from the
other server...
### PuppetDB lookups

A common pattern in Puppet is to extract information from host A and
use it on host B. The above "exported resources" pattern can do this
for files, commands and many more resources, but sometimes we just
want a tiny bit of information to embed in a configuration file. This
could, in theory, be done with an exported [concat](https://forge.puppet.com/puppetlabs/concat) resource, but
this can become prohibitively complicated for something as simple as
an allowed IP address in a configuration file.

For this we use the [puppetdbquery module](https://github.com/dalen/puppet-puppetdbquery), which allows us to do
elegant queries against PuppetDB. For example, this will extract the
IP addresses of all nodes with the `roles::gitlab` class applied:

    $allow_ipv4 = query_nodes('Class[roles::gitlab]', 'networking.ip')
    $allow_ipv6 = query_nodes('Class[roles::gitlab]', 'networking.ip6')

This code, in `profile::kgb_bot`, propagates those variables into a
template through the `allow_addresses` variable, which gets expanded
like this:

    <% if $allow_addresses { -%>
    <% $allow_addresses.each |String $address| { -%>
    allow <%= $address %>;
    <% } -%>
    deny all;
    <% } -%>
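The template logic above amounts to: emit one `allow` line per address, followed by a `deny all;`, but only when the list is set. A hedged Python equivalent of that expansion (the real work is done by Puppet's templating, and the output format shown is the nginx-style ACL from the template above):

```python
def render_acl(allow_addresses):
    """Render allow/deny lines from a list of addresses, mirroring
    the EPP template shown above: no addresses means no ACL at all."""
    if not allow_addresses:
        return ""
    lines = [f"allow {addr};" for addr in allow_addresses]
    lines.append("deny all;")
    return "\n".join(lines)

print(render_acl(["192.0.2.10", "2001:db8::1"]))
```

Note the guard on the empty list: without it, a transient empty PuppetDB result would deploy a bare `deny all;` and lock everyone out.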
Note that there is a potential security issue with that approach. The
same way that exported resources trust the exporter, we trust that the
node exported the right fact. So it's in theory possible that a
compromised Puppet node exports an evil IP address in the above
example, granting access to an attacker instead of the proper node. If
that is a concern, consider using LDAP or Hiera lookups instead.

Also note that this will eventually fail when the node goes down:
after a while, resources are expired from the PuppetDB server and the
above query will return an empty list. This seems reasonable: we do
want to eventually revoke access to nodes that go away, but it's still
something to keep in mind.

Keep in mind that the `networking.ip` fact, in the above example,
might be incorrect in the case of a host that's behind NAT. In that
case, you should use LDAP or Hiera lookups instead.

Note that this could also be implemented with a `concat` exported
resource, but it would be much harder: you would need a special case
for when no resource is exported (to avoid adding the `deny`) and to
take into account that other configurations might also be needed in
the file. It would have the same security and expiry issues anyways.
### Puppet query language

Note that there's also a way to do those queries without a Forge
module, through the [Puppet query language](https://puppet.com/docs/puppetdb/5.2/api/query/tutorial-pql.html) and the
`puppetdb_query` function. The problem with that approach is that the
function is not very well documented and the query syntax is somewhat
obtuse. For example, this is what I came up with to do the equivalent
of the `query_nodes` call, above:

    $allow_ipv4 = puppetdb_query(
      ['from', 'facts',
        ['and',
          ['=', 'name', 'networking.ip'],
          ['in', 'certname',
            ['extract', 'certname',
              ['select_resources',
                ['and',
                  ['=', 'type', 'Class'],
                  ['=', 'title', 'roles::gitlab']]]]]]])

It seems like I did something wrong, because that returned an empty
array. I could not figure out how to debug this, and apparently I
needed more functions (like `map` and `filter`) to get what I wanted
(see [this gist](https://gist.github.com/bastelfreak/b9620fa1892ebcc659c442b115db34f9)). I gave up at that point: the `puppetdbquery`
abstraction is much cleaner and more usable.
If you are merely looking for a hostname, however, PQL might be a
little more manageable. For example, this is how the
`roles::onionoo_frontend` class finds its backends to set up the
[IPsec](ipsec) network:

    $query = 'nodes[certname] { resources { type = "Class" and title = "Roles::Onionoo_backend" } }'
    $peer_names = sort(puppetdb_query($query).map |$value| { $value["certname"] })
    $peer_names.each |$peer_name| {
        $network_tag = [$::fqdn, $peer_name].sort().join('::')
        ipsec::network { "ipsec::${network_tag}":
            peer_networks => $base::public_addresses
        }
    }
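Note the `sort()` before the `join('::')`: it guarantees both peers compute the same resource name regardless of which side evaluates the manifest. A quick Python model of that naming scheme (the hostnames are illustrative):

```python
def network_tag(host_a, host_b):
    """Build the IPsec tag the same way the manifest does: sort the
    two FQDNs first so the tag is symmetric between the peers."""
    return "::".join(sorted([host_a, host_b]))

a = network_tag("frontend.example.org", "backend.example.org")
b = network_tag("backend.example.org", "frontend.example.org")
print(a == b)  # True: both peers agree on the tag name
```

Without the sort, each peer would declare a differently named `ipsec::network` resource for the same tunnel.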
### LDAP lookups

Our Puppet server is hooked up to the LDAP server and has information
about the hosts defined there. Information about the node running the
manifest is available in the global `$nodeinfo` variable, but there is
also an `$allnodeinfo` parameter with information about every host
known in LDAP.

A simple example of how to use the `$nodeinfo` variable is how the
`base::public_address` and `base::public_address6` parameters -- which
represent the IPv4 and IPv6 public addresses of a node -- are
initialized in the `base` class:

    class base(
        Stdlib::IP::Address $public_address = filter_ipv4(getfromhash($nodeinfo, 'ldap', 'ipHostNumber'))[0],
        Optional[Stdlib::IP::Address] $public_address6 = filter_ipv6(getfromhash($nodeinfo, 'ldap', 'ipHostNumber'))[0],
    ) {
        $public_addresses = [ $public_address, $public_address6 ].filter |$addr| { $addr != undef }
    }

This loads the `ipHostNumber` field from the `$nodeinfo` variable, and
uses the `filter_ipv4` or `filter_ipv6` functions to extract the IPv4
or IPv6 addresses respectively.

A good example of the `$allnodeinfo` parameter is how the
`roles::onionoo_frontend` class finds the IP addresses of its
backends. After having loaded the host list from PuppetDB, it then
uses the parameter to extract the IP address:

    $backends = $peer_names.map |$name| {
        [
            $name,
            $allnodeinfo[$name]['ipHostNumber'].filter |$a| { $a =~ Stdlib::IP::Address::V4 }[0]
        ] }.convert_to(Hash)

Such a lookup is considered more secure than going through PuppetDB as
LDAP is a trusted data source. It is also our source of truth for this
data, at the time of writing.
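The `filter_ipv4`/`filter_ipv6` functions simply partition a mixed address list by family. As an illustration of that logic (not the actual Puppet functions), in Python:

```python
import ipaddress

def filter_ipv4(addresses):
    """Keep only the IPv4 addresses from a mixed list, like the
    filter_ipv4 function used in the base class above."""
    return [a for a in addresses
            if isinstance(ipaddress.ip_address(a), ipaddress.IPv4Address)]

def filter_ipv6(addresses):
    """Keep only the IPv6 addresses from a mixed list."""
    return [a for a in addresses
            if isinstance(ipaddress.ip_address(a), ipaddress.IPv6Address)]

addrs = ["198.51.100.5", "2001:db8::5"]
print(filter_ipv4(addrs))  # ['198.51.100.5']
print(filter_ipv6(addrs))  # ['2001:db8::5']
```

Taking element `[0]` of the IPv6 result, as the manifest does, yields `undef` (here, an empty list access) when the host has no IPv6 address, which is why `$public_address6` is `Optional`.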
### Hiera lookups

For more security-sensitive data, we should use a trusted data source
to extract information about hosts. We do this through Hiera lookups,
with the [lookup](https://puppet.com/docs/puppet/latest/function.html#lookup) function. A good example is how we populate the
SSH public keys on all hosts, for the admin user. In the
`profile::ssh` class, we do the following:

    $keys = lookup('profile::admins::keys', Data, 'hash')

This will look up the `profile::admins::keys` field in Hiera, which is
a trusted source because it is under the control of the Puppet git
repository. This refers to the following data structure in
`hiera/common.yaml`:

    profile::admins::keys:
      anarcat:
        type: "ssh-rsa"
        pubkey: "AAAAB3[...]"

The key point with Hiera is that it's a "hierarchical" data structure,
so each host can have its own override. So in theory, the above keys
could be overridden per host. Similarly, the IP address information for
each host could be stored in Hiera instead of LDAP. But in practice,
we do not currently do this and the per-host information is limited.
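That "hierarchical" behavior can be pictured as a list of data layers searched in priority order, most specific first. A simplified Python sketch of such a lookup (Hiera's real hierarchy is configured in its own configuration file, not shown here):

```python
def hiera_lookup(key, layers):
    """Search data layers in priority order (e.g. per-host file
    first, then common.yaml) and return the first value found."""
    for layer in layers:
        if key in layer:
            return layer[key]
    raise KeyError(key)

common = {"profile::admins::keys": {"anarcat": {"type": "ssh-rsa"}}}
per_host = {}  # a host-specific file could override the same key
print(hiera_lookup("profile::admins::keys", [per_host, common]))
```

With an empty per-host layer the common value wins; a host-specific file defining the same key would shadow it entirely.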
## Revoking and generating a new certificate for a host

Revocation procedure problems were discussed in [33587][] and [33446][].

[33587]: https://bugs.torproject.org/33587
[33446]: https://gitlab.torproject.org/legacy/trac/-/issues/33446#note_2349434

 1. Clean the certificate on the master:

        puppet cert clean host.torproject.org

 2. Clean the certificate on the client:

        find /var/lib/puppet/ssl -name host.torproject.org.pem -delete

 3. Then run the bootstrap script on the client from
    `tsa-misc/installer/puppet-bootstrap-client` and get a new
    checksum

 4. Run `tpa-puppet-sign-client` on the master and pass the checksum

 5. Run `puppet agent -t` to have Puppet running on the client again.
## Pager playbook