diff --git a/howto/puppet.md b/howto/puppet.md index 0c943b3bf0834e5e272781212c3a4363285b710c..21eb0cb447a59a26d7efd3b162f54c3a77a68069 100644 --- a/howto/puppet.md +++ b/howto/puppet.md @@ -708,6 +708,59 @@ neded more functions (like `map` and `filter`) to get what I wanted (see [this gist](https://gist.github.com/bastelfreak/b9620fa1892ebcc659c442b115db34f9)). I gave up at that point: the `puppetdbquery` abstraction is much cleaner and usable. +If you are merely looking for a hostname, however, PQL might be a +little more manageable. For example, this is how the +`roles::onionoo_frontend` class finds its backends to set up the +[IPsec](ipsec) network: + + $query = 'nodes[certname] { resources { type = "Class" and title = "Roles::Onionoo_backend" } }' + $peer_names = sort(puppetdb_query($query).map |$value| { $value["certname"] }) + $peer_names.each |$peer_name| { + $network_tag = [$::fqdn, $peer_name].sort().join('::') + ipsec::network { "ipsec::${network_tag}": + peer_networks => $base::public_addresses + } + } + +### LDAP lookups + +Our Puppet server is hooked up to the LDAP server and has information +about the hosts defined there. Information about the node running the +manifest is available in the global `$nodeinfo` variable, but there is +also a `$allnodeinfo` parameter with information about every host +known in LDAP. 
+ +A simple example of how to use the `$nodeinfo` variable is how the +`base::public_address` and `base::public_address6` parameters -- which +represent the IPv4 and IPv6 public addresses of a node -- are +initialized in the `base` class: + + class base( + Stdlib::IP::Address $public_address = filter_ipv4(getfromhash($nodeinfo, 'ldap', 'ipHostNumber'))[0], + Optional[Stdlib::IP::Address] $public_address6 = filter_ipv6(getfromhash($nodeinfo, 'ldap', 'ipHostNumber'))[0], + ) { + $public_addresses = [ $public_address, $public_address6 ].filter |$addr| { $addr != undef } + } + +This loads the `ipHostNumber` field from the `$nodeinfo` variable, and +uses the `filter_ipv4` or `filter_ipv6` functions to extract the IPv4 +or IPv6 addresses respectively. + +A good example of the `$allnodeinfo` parameter is how the +`roles::onionoo_frontend` class finds the IP addresses of its +backends. After having loaded the host list from PuppetDB, it then uses +the parameter to extract their IPv4 addresses: + + $backends = $peer_names.map |$name| { + [ + $name, + $allnodeinfo[$name]['ipHostNumber'].filter |$a| { $a =~ Stdlib::IP::Address::V4 }[0] + ] }.convert_to(Hash) + +Such a lookup is considered more secure than going through PuppetDB as +LDAP is a trusted data source. It is also our source of truth for this +data, at the time of writing. + ### Hiera lookups For more security-sensitive data, we should use a trusted data source @@ -733,10 +786,6 @@ could be overriden per host. Similarly, the IP address information for each host could be stored in Hiera instead of LDAP. But in practice, we do not currently do this and the per-host information is limited. -### LDAP lookups - -TODO. - ## Revoking and generating a new certificate for a host Revocation procedures problems were discussed in [33587][] and [33446][]. @@ -842,13 +891,22 @@ backing the PuppetDB server as well. 
 It's *possible* this step *could* be skipped in an emergency, because most of the information in PuppetDB is a cache of exported resources, reports and facts. But it could also break hosts and make converging the infrastructure -impossible, as there might be dependency loops in exported resources -(for example, the Puppet server needs access to the LDAP server, and -that is configured in Puppet). +impossible, as there might be dependency loops in exported resources. + +In particular, the Puppet server needs access to the LDAP server, and +that is configured in Puppet. So if the Puppet server needs to be +rebuilt from scratch, it will need to be manually allowed access to +the LDAP server to compile its manifest. So it is strongly encouraged to restore the PuppetDB server database as well in case of disaster. +This also applies in case of an IP address change of the Puppet +server, in which case access to the LDAP server needs to be manually +granted before the configuration can run and converge. This is a known +bootstrapping issue with the Puppet server and is further discussed in +the [design section](#LDAP-integration). + # Reference This documents generally how things are setup. @@ -1072,10 +1130,51 @@ Puppet itself, currently as part of the `torproject_org` module. ### LDAP integration -TODO: document how Puppet talks with LDAP (and vice-versa?). Note that -this is from a design perspective (ie. firewalls, access controls, -passwords, etc), not from a "user" perspective (ie. how to actually do -it in the Puppet code). +The Puppet server is configured to talk with LDAP through a few custom +functions defined in +`modules/puppetmaster/lib/puppet/parser/functions`. The main plumbing +function is called `ldapinfo()` and connects to the LDAP server +through `db.torproject.org` over TLS on port 636. It takes a hostname +as an argument and will load all hosts matching that pattern under the +`ou=hosts,dc=torproject,dc=org` subtree. 
If the specified hostname is +the `*` wildcard, the result will be a hash of `host => hash` entries; +otherwise only the `hash` describing the provided host will be +returned. + +The `nodeinfo()` function uses that function to populate the global +`$nodeinfo` hash, or, more specifically, the +`$nodeinfo['ldap']` component. It also loads the `$nodeinfo['hoster']` +value from the `whohosts()` function. That function, in turn, tries to +match the IP address of the host against the "hosters" defined in the +`hoster.yaml` file. + +The `allnodeinfo()` function performs a task similar to `nodeinfo()`, +except that it loads *all* nodes from LDAP into a single hash. It +does *not* include the "hoster" and is therefore equivalent to calling +`nodeinfo()` on each host and extracting only the `ldap` member hash +(although it is not implemented that way). + +Puppet does not require any special credentials to access the LDAP +server. It accesses the LDAP database anonymously, although there is a +firewall rule (defined in Puppet) that grants it access to the LDAP +server. + +There is a bootstrapping problem there: if one were to rebuild the +Puppet server, it would actually fail to compile its catalog because +it would not be able to connect to the LDAP server to fetch host +information, unless the LDAP server has been manually configured to let +the Puppet server through. + +NOTE: much (if not all?) of this is being moved into Hiera, in +particular the YAML files. See [issue 30020](https://trac.torproject.org/projects/tor/ticket/30020) for details. Moving +the host information into Hiera would resolve the bootstrapping +issues, but would, in turn, require some more work to resolve questions +like how users get granted access to individual hosts, which is +currently managed by `ud-ldap`. We cannot, therefore, simply move host +information from LDAP into Hiera without creating a duplicate source +of truth, unless we rebuild or tweak the user distribution +system. 
See also the [LDAP design document](ldap#Design) for more information +about how LDAP works. ### External data sources