title: DNS procedures
How to
Most operations on DNS happen in the admin/dns/domains repository (git@git-rw.torproject.org:admin/dns/domains). That repository contains the master copy of the zone files, stored as (mostly) standard BIND zone files (RFC 1034), but notably without a SOA record.
Tor's DNS support is fully authenticated with DNSSEC, both toward the outside world and internally, where all TPO hosts use DNSSEC in their resolvers.
Editing a zone
Zone records can be added or modified by editing the zone in the domains git repository and pushing the change. DNSSEC records are managed automatically by manage-dnssec-keys, from the dns/dns-helpers git repository, through a cron job running as the dnsadm user on the master DNS server (currently nevii).
Serial numbers are managed automatically by the git repository hooks.
Adding a zone
To add a new zone to our infrastructure, follow this procedure:
- add the zone in the dns/domains.git repository
- add the zone in the modules/bind/templates/named.conf.torproject-zones.erb Puppet template for DNS secondaries to pick up the zone
- also add IP address ranges (if it's a reverse DNS zone file) to modules/torproject_org/misc/hoster.yaml in the tor-puppet.git repository
- run Puppet on the DNS servers:
    cumin 'C:roles::dns_primary or C:bind::secondary' 'puppet agent -t'
- add the zone to modules/postfix/files/virtual, unless it is a reverse zonefile
- add the zone to Nagios: copy an existing DNS SOA sync block and adapt it
- add the zone to the external DNS secondaries (currently Netnod)
- make sure the zone is delegated by the root servers somehow. For normal zones, this involves adding our nameservers in the registrar's configuration. For reverse DNS, this involves asking our upstreams to delegate the zone to our DNS servers.
Note that this is a somewhat rarer procedure: this happens only when a completely new domain name (e.g. torproject.net) or IP address space (so reverse DNS, e.g. 38.229.82.0/24 AKA 82.229.38.in-addr.arpa) is added to our infrastructure.
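To illustrate how an IP range maps to its reverse zone name, here is a minimal Python sketch. The function name reverse_zone is hypothetical and not part of any TPO tooling. It only handles IPv4 networks aligned on octet boundaries (/8, /16, /24). Other prefix lengths and IPv6 are out of scope for this sketch.

```python
import ipaddress


def reverse_zone(cidr: str) -> str:
    """Return the in-addr.arpa zone name for an octet-aligned IPv4 network.

    Hypothetical helper for illustration only.
    """
    net = ipaddress.ip_network(cidr, strict=True)
    if net.version != 4 or net.prefixlen not in (8, 16, 24):
        raise ValueError(f"unsupported network for this sketch: {cidr}")
    # Keep only the octets covered by the prefix, then reverse their order.
    octets = str(net.network_address).split(".")[: net.prefixlen // 8]
    return ".".join(reversed(octets)) + ".in-addr.arpa"
```

With the example above, reverse_zone("38.229.82.0/24") returns 82.229.38.in-addr.arpa.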
Removing a zone
- git grep the domain in the tor-nagios git repository
- remove the zone in the dns/domains.git repository
- on nevii, remove the generated zonefiles and keys:
    cd /srv/dns.torproject.org/var/keys/
    mv generated/torproject.fr* OLD-generated/
    mv keys/torproject.fr OLD-KEYS/
- remove the zone from the secondaries (Netnod and our own servers). This means visiting the Netnod web interface for that side, and Puppet (modules/bind/templates/named.conf.torproject-zones.erb) for our own
- the domains will probably be listed in other locations, grep Puppet for Apache virtual hosts and email aliases
- the domains will also probably exist in the letsencrypt-domains repository
DS records expiry and renewal
A special case is the rotation of the DNSKEY / DS records. Those rotate about once every two years, and require manual operation on the registrar (currently https://joker.com).
A Nagios hook is in /srv/dns.torproject.org/bin/dsa-check-and-extend-DS, and basically wraps manage-dnssec-keys with some Nagios status codes. It will warn when the key is about to expire and extend it before it expires (while still flagging a critical warning in Nagios).
To fix this error, you need to visit joker.com and authenticate with the password in hosts-extra-info in tor-passwords, along with the 2FA dance. Then:
- click on the gear next to the domain affected
- edit the DNSSEC section
- click "more" to add a record
The new key should already be present on the DNS master (currently nevii) in:
/srv/dns.torproject.org/var/keys/$DOMAIN/dsset
It is in the following format (from RFC 4034):
domain IN DS keytag algo type digest
For example:
torproject.net. IN DS 53722 8 2 6d3d2be639594ffe34d4c5b9214fe5ddf81b8ee1c8505f5ec1a800dc4a809a91; Pub: 2019-05-25 17:40:08; Act: 2019-05-25 17:40:08; Inact: 2021-09-11 17:40:08; Del: 2021-09-11 17:40:08; Rev: 2021-08-12 17:40:08
With the above, you would have the following in Joker:
- alg: 8
- digest: 6d3d2be639594ffe34d4c5b9214fe5ddf81b8ee1c8505f5ec1a800dc4a809a91
- type: 2
- keytag: 53722
And click "save".
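The mapping from a dsset line to the registrar form fields can be sketched in Python as follows. This is only an illustration: the function name ds_to_joker_fields is hypothetical, and the sketch assumes the line layout shown in the example above (owner, IN, DS, keytag, algorithm, digest type, digest, then an optional "; Pub: ..." comment). A dsset line that also carries a TTL would need extra handling.

```python
from typing import Dict


def ds_to_joker_fields(line: str) -> Dict[str, str]:
    """Split one dsset line into the four values entered in the web form.

    Hypothetical helper; assumes "owner IN DS keytag alg type digest".
    """
    # Drop the trailing "; Pub: ...; Act: ..." timing comment, if any.
    record = line.split(";", 1)[0].split()
    if len(record) < 7 or record[1].upper() != "IN" or record[2].upper() != "DS":
        raise ValueError(f"not a DS record in the expected layout: {line!r}")
    keytag, alg, digest_type = record[3:6]
    # The digest may be split into several whitespace-separated chunks.
    digest = "".join(record[6:])
    return {"keytag": keytag, "alg": alg, "type": digest_type, "digest": digest}
```

Feeding it the torproject.net line above yields keytag 53722, alg 8, type 2 and the same digest as listed for Joker.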
After a little while, you should be able to check whether the new DS record works on DNSviz.net. For example, the DNSviz.net view of torproject.net should be sane.
The changes will take a while (~10 hours?) to trickle out into all caches, so it might take a while for the Nagios check to return green.
Eventually, Nagios will complain about the old keys, and we can remove them. Make sure to remove the old key, not the new key. Be careful because the web interface might sort the keys in an unexpected way. Check the keytag and compare with the expiration specified in the dsset file.
Note: this procedure could be automated by talking with the registrar's API, for example Joker.com's DMAPI domain modification API (see also those docs). There are also proposals at the IETF to allow delegation from the parent zone to allow the child zone to perform those updates on its own.
Further, Puppet ships trust anchors for some of our zones to our Unbound resolvers. If you updated the DS for one of those, update the corresponding file in tsa-puppet/modules/unbound/files. Existing machines don't need that (since we do slow, RFC 5011-style rolling of KSKs), but new instances will be sad if we ship them obsolete trust anchors.
Special case: RFC1918 zones
The above is for public zones, for which we have Nagios checks that warn us about impending doom. But we also sign zones for reverse IP lookups, specifically 30.172.in-addr.arpa. Normally, recursive nameservers pick up new trust anchors for that zone automatically, thanks to RFC 5011.
But if a new host gets provisioned, it needs to get bootstrapped somehow. This is done by Puppet, but those records are maintained by hand and will get out of date. This implies that after a while, you will start seeing messages like this for hosts that were installed after the expiration date:
16:52:39 <nsa> tor-nagios: [submit-01] unbound trust anchors is WARNING: Warning: no valid trust anchors found for 30.172.in-addr.arpa.
The solution is to go on the primary nameserver (currently nevii) and pick the non-revoked DSSET line from this file:
/srv/dns.torproject.org/var/keys/30.172.in-addr.arpa/dsset
... and inject it in Puppet, in:
tor-puppet/modules/unbound/files/30.172.in-addr.arpa.key
Then new hosts will get the right key and bootstrap properly. Old hosts can get the new key by removing the file by hand on the server and re-running Puppet:
rm /var/lib/unbound/30.172.in-addr.arpa.key ; puppet agent -t
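Picking the non-revoked line can be sketched in Python as below. This rests on an assumption that is not confirmed by the tooling: a line is treated as non-revoked when its "Rev:" timestamp (in the comment format shown in the torproject.net example earlier) is absent or still in the future. The function name non_revoked_lines is hypothetical, and timezones are ignored.

```python
import re
from datetime import datetime
from typing import List

# Matches the "; Rev: YYYY-MM-DD HH:MM:SS" part of a dsset comment.
REV_RE = re.compile(r";\s*Rev:\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def non_revoked_lines(dsset_text: str, now: datetime) -> List[str]:
    """Return dsset lines whose Rev timestamp is missing or later than now.

    Hypothetical helper; "non-revoked" is interpreted as described above.
    """
    keep = []
    for line in dsset_text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and pure comment lines
        match = REV_RE.search(line)
        if match is None:
            keep.append(line)  # no revocation date recorded at all
            continue
        revoked_at = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if revoked_at > now:
            keep.append(line)
    return keep
```

If more than one line comes back, compare the keytags by hand before injecting anything in Puppet.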
Pager playbook
DNS - zones signed properly is CRITICAL
When adding a new reverse DNS zone, it's possible you get this warning from Nagios:
13:31:35 <nsa> tor-nagios: [global] DNS - zones signed properly is CRITICAL: CRITICAL: 82.229.38.in-addr.arpa
16:30:36 <nsa> tor-nagios: [global] DNS - key coverage is CRITICAL: CRITICAL: 82.229.38.in-addr.arpa
That might be because Nagios thinks this zone should be signed (while it isn't and cannot be). The fix is to add this line to the zonefile:
; ds-in-parent = no
And push the change. Nagios should notice and stop caring about the zone.
In general, this Nagios check provides a good idea of the DNSSEC chain of a zone:
$ /usr/lib/nagios/plugins/dsa-check-dnssec-delegation overview 82.229.38.in-addr.arpa
zone DNSKEY DS@parent DLV dnssec@parent
--------------------------- -------------------- --------------- --- ----------
82.229.38.in-addr.arpa no(229.38.in-addr.arpa), no(38.in-addr.arpa), yes(in-addr.arpa), yes(arpa), yes(.)
Notice how the 38.in-addr.arpa zone is not signed? The 82.229.38.in-addr.arpa zone below it can therefore not be signed with DNSSEC.
DNS - delegation and signature expiry is WARNING
If you get a warning like this:
13:30:15 <nsa> tor-nagios: [global] DNS - delegation and signature expiry is WARNING: WARN: 1: 82.229.38.in-addr.arpa: OK: 12: unsigned: 0
It might be that the zone is not delegated by upstream. To confirm, run this command on the Nagios server:
$ /usr/lib/nagios/plugins/dsa-check-zone-rrsig-expiration 82.229.38.in-addr.arpa
ZONE WARNING: No RRSIGs found; (0.66s) |time=0.664444s;;;0.000000
On the primary DNS server, you should be able to confirm the zone is signed:
dig @nevii -b 127.0.0.1 82.229.38.in-addr.arpa +dnssec
Check the next DNS server up (use dig -t NS to find it) and see if the zone is delegated:
dig @ns1.cymru.com 82.229.38.in-addr.arpa +dnssec
If it's not delegated, it's because you forgot step 8 (delegation) in the zone addition procedure. Ask your upstream or registrar to delegate the zone and run the checks again.
Discussion
Design
This needs to be documented better. weasel made a blog post describing parts of the infrastructure on Debian.org, and that is partly relevant to TPO as well.
TODO: adapt this document to the service template.
Automation
Debian has a set of scripts to automate talking to some providers like Netnod. A YAML file has metadata about the configuration, and pushing changes is as simple as:
publish tor-dnsnode.yaml
That config file would look something like:
---
endpoint: https://dnsnodeapi.netnod.se/apiv3/
base_zone:
  endcustomer: "TorProject"
  masters:
    # nevii.torproject.org
    - ip: "49.12.57.130"
      tsig: "netnod-torproject-20180831."
    - ip: "2a01:4f8:fff0:4f:266:37ff:fee9:5df8"
      tsig: "netnod-torproject-20180831."
  product: "probono-premium-anycast"
This is not currently in use at TPO: changes are made manually through the web interface.