This issue has been waiting for information two weeks or more. It needs attention. Please take care of this before the end of 2024-06-26. ~"Needs Information" tickets will be moved to the Icebox after that point.

(Any ticket left in Needs Review, Needs Information, Next, or Doing without activity for 14 days gets such notifications. Make a comment describing the current state of this ticket and remove the Stale label to fix this.)

To make the bot ignore this ticket, add the bot-ignore label.
@weasel i'm having trouble issuing a new cert from letsencrypt-domains.git here:
root@nevii:~# /srv/letsencrypt.torproject.org/bin/dehydrated-wrap --cron
[...]
Processing bridges-email.torproject.org
 + Signing domains...
 + Generating private key...
 + Generating signing request...
 + Requesting new certificate order from CA...
 + Received 1 authorizations URLs from the CA
 + Handling authorization for bridges-email.torproject.org
 + Found valid authorization for bridges-email.torproject.org
 + 0 pending challenge(s)
 + Requesting certificate...
ERROR: Problem connecting to server (post for ; curl returned with 3)
(the 0 pending challenge(s) is because i did try to issue the cert before and it properly populated DNS, but failed similarly.)
any idea what that might be about?
or are we ready to switch to puppet-only issuance now, what's our status here?
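For reference on the "curl returned with 3" part of that error: exit code 3 is documented in the curl(1) man page as "URL malformed", which fits the empty URL visible in "post for ;". That dehydrated ended up calling curl without a URL is only an inference from the message. A minimal lookup sketch (the helper name `explain_curl_exit` is made up for illustration):

```python
# A few curl exit codes, as documented in the curl(1) man page.
CURL_EXIT_CODES = {
    3: "URL malformed",
    6: "could not resolve host",
    7: "failed to connect to host",
    28: "operation timed out",
    35: "SSL connect error",
}


def explain_curl_exit(code: int) -> str:
    """Return a short description for a curl exit code (hypothetical helper)."""
    return CURL_EXIT_CODES.get(code, f"unknown curl exit code {code}")


# The code reported by dehydrated in the error above.
print(explain_curl_exit(3))
```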
full, first run:
$ git … push -v origin refs/heads/master\:refs/heads/master
Pushing to nevii.torproject.org:/srv/letsencrypt.torproject.org/repositories/letsencrypt-domains
Writing objects: 100% (3/3), 523 bytes | 523.00 KiB/s, done.
Total 3 (delta 2), reused 0 (delta 0), pack-reused 0
remote: in post-receive hook
remote: # INFO: Using main config file /srv/letsencrypt.torproject.org/etc/dehydrated-config
remote: Processing torproject.org
remote: + Checking domain name(s) of existing cert... unchanged.
remote: + Checking expire date of existing cert...
remote: + Valid till Aug 22 00:53:51 2024 GMT (Longer than 30 days). Skipping renew!
remote: Processing www.torproject.org
remote: + Checking domain name(s) of existing cert... unchanged.
remote: + Checking expire date of existing cert...
remote: + Valid till Aug 7 00:13:49 2024 GMT (Longer than 30 days). Skipping renew!
remote: Processing anonticket.torproject.org
remote: + Checking domain name(s) of existing cert... unchanged.
remote: + Checking expire date of existing cert...
remote: + Valid till Jul 29 01:01:27 2024 GMT (Longer than 30 days). Skipping renew!
remote: Processing archive.torproject.org
remote: + Checking domain name(s) of existing cert... unchanged.
remote: + Checking expire date of existing cert...
remote: + Valid till Jul 30 00:49:07 2024 GMT (Longer than 30 days). Skipping renew!
remote: Processing arti.torproject.org
remote: + Checking domain name(s) of existing cert... unchanged.
remote: + Checking expire date of existing cert...
remote: + Valid till Jul 17 00:29:54 2024 GMT (Longer than 30 days). Skipping renew!
remote: Processing atlas.torproject.org
remote: + Checking domain name(s) of existing cert... unchanged.
remote: + Checking expire date of existing cert...
remote: + Valid till Jul 30 00:49:50 2024 GMT (Longer than 30 days). Skipping renew!
remote: Processing aus1.torproject.org
remote: + Checking domain name(s) of existing cert... unchanged.
remote: + Checking expire date of existing cert...
remote: + Valid till Aug 1 00:28:51 2024 GMT (Longer than 30 days). Skipping renew!
remote: Processing aus2.torproject.org
remote: + Checking domain name(s) of existing cert... unchanged.
remote: + Checking expire date of existing cert...
remote: + Valid till Aug 1 00:29:11 2024 GMT (Longer than 30 days). Skipping renew!
remote: Processing blog.torproject.org
remote: + Checking domain name(s) of existing cert... unchanged.
remote: + Checking expire date of existing cert...
remote: + Valid till Aug 9 00:57:34 2024 GMT (Longer than 30 days). Skipping renew!
remote: Processing bridges.torproject.org
remote: + Checking domain name(s) of existing cert... unchanged.
remote: + Checking expire date of existing cert...
remote: + Valid till Aug 26 00:18:47 2024 GMT (Longer than 30 days). Skipping renew!
remote: Processing bridges-test.torproject.org
remote: + Checking domain name(s) of existing cert... unchanged.
remote: + Checking expire date of existing cert...
remote: + Valid till Aug 7 00:14:00 2024 GMT (Longer than 30 days). Skipping renew!
remote: Processing bridges-email.torproject.org
remote: + Creating new directory /srv/letsencrypt.torproject.org/etc/certs/bridges-email.torproject.org ...
remote: + Signing domains...
remote: + Generating private key...
remote: + Generating signing request...
remote: + Requesting new certificate order from CA...
remote: + Received 1 authorizations URLs from the CA
remote: + Handling authorization for bridges-email.torproject.org
remote: + 1 pending challenge(s)
remote: + Deploying challenge tokens...
remote: Adding challenge '_acme-challenge.bridges-email.torproject.org. 60 IN TXT "Xe_dA6xV3qGnBbnRiIwpEZ9Jo_5zkhxW8fDWZI6JH6M"' for bridges-email.torproject.org.
remote: 2024-06-12 19:24:03 /srv/dns.torproject.org/bin/update: ***** start of script *****
remote: 2024-06-12 19:24:03 /srv/dns.torproject.org/bin/update: pre flock
remote: 2024-06-12 19:24:03 /srv/dns.torproject.org/bin/update: pre update-keys
remote: 2024-06-12 19:24:04 /srv/dns.torproject.org/bin/update: pre build-services
remote: 2024-06-12 19:24:04 /srv/dns.torproject.org/bin/update: pre for loop
remote: 2024-06-12 19:24:04 /srv/dns.torproject.org/bin/update: pre write_zonefile for 0.0.0.0.2.0.0.6.7.0.0.0.0.2.6.2.ip6.arpa
remote: 2024-06-12 19:24:04 /srv/dns.torproject.org/bin/update: pre write_zonefile for 30.172.in-addr.arpa
remote: 2024-06-12 19:24:04 /srv/dns.torproject.org/bin/update: pre write_zonefile for 99.8.204.in-addr.arpa
remote: 2024-06-12 19:24:04 /srv/dns.torproject.org/bin/update: pre write_zonefile for onion-router.net
remote: 2024-06-12 19:24:04 /srv/dns.torproject.org/bin/update: pre write_zonefile for rev
remote: 2024-06-12 19:24:04 /srv/dns.torproject.org/bin/update: pre write_zonefile for torproject.com
remote: 2024-06-12 19:24:04 /srv/dns.torproject.org/bin/update: pre write_zonefile for torproject.net
remote: 2024-06-12 19:24:04 /srv/dns.torproject.org/bin/update: pre write_zonefile for torproject.org
remote: 2024-06-12 19:24:04 /srv/dns.torproject.org/bin/update: pre dns-update
remote: 2024-06-12 19:24:05 /srv/dns.torproject.org/bin/update: done!
remote: 2024-06-12 19:24:05 /srv/dns.torproject.org/bin/update: ***** end of script *****
remote: Waiting for master to update torproject.org (for _acme-challenge.bridges-email.torproject.org) from 2024061201. Currently at 2024061202..
remote: Waiting for secondaries to update to match master at 2024061202..
remote: Waiting for secondaries to update to match master at 2024061202..
remote: Waiting for secondaries to update to match master at 2024061202..
remote: Waiting for secondaries to update to match master at 2024061202..
remote: Waiting for secondaries to update to match master at 2024061202..
remote: Waiting for secondaries to update to match master at 2024061202..
remote: Waiting for secondaries to update to match master at 2024061202..
remote: Waiting for secondaries to update to match master at 2024061202..
remote: SOA nevii.torproject.org. hostmaster.torproject.org. 2024061202 10800 3600 1814400 3601 from server 49.12.57.135 in 0 ms.
remote: SOA nevii.torproject.org. hostmaster.torproject.org. 2024061201 10800 3600 1814400 3601 from server 194.58.198.32 in 27 ms.
remote: SOA nevii.torproject.org. hostmaster.torproject.org. 2024061202 10800 3600 1814400 3601 from server 89.47.185.6 in 31 ms.
remote: SOA nevii.torproject.org. hostmaster.torproject.org. 2024061202 10800 3600 1814400 3601 from server 204.8.99.145 in 131 ms.
remote: SOA nevii.torproject.org. hostmaster.torproject.org. 2024061202 10800 3600 1814400 3601 from server 204.8.99.145 in 131 ms.
remote: Waiting for secondaries to update to match master at 2024061202..
remote: Waiting for secondaries to update to match master at 2024061202..
remote: Waiting for secondaries to update to match master at 2024061202..
remote: Waiting for secondaries to update to match master at 2024061202..
remote: Waiting for secondaries to update to match master at 2024061202..
remote: + Responding to challenge for bridges-email.torproject.org authorization...
remote: + Challenge is valid!
remote: + Cleaning challenge tokens...
remote: + Requesting certificate...
remote: ERROR: Problem connecting to server (post for ; curl returned with 3)
To nevii.torproject.org:/srv/letsencrypt.torproject.org/repositories/letsencrypt-domains
e6762f5..ab7e53f master -> master
updating local tracking ref 'refs/remotes/origin/master'
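The "Waiting for secondaries" loop in the log above appears to compare the SOA serial reported by each nameserver with the master's serial. The sketch below only parses log lines like the ones shown; it is not the actual le-hook code. The function name `lagging_servers` is hypothetical, and the plain integer comparison is a simplification of DNS serial arithmetic.

```python
import re

# Matches SOA lines in the format printed in the log above.
SOA_LINE = re.compile(
    r"SOA \S+ \S+ (?P<serial>\d+) \d+ \d+ \d+ \d+ "
    r"from server (?P<server>\S+) in \d+ ms"
)


def lagging_servers(log_text: str, master_serial: int) -> list[str]:
    """Return servers whose SOA serial is still below the master's (hypothetical helper)."""
    lagging = []
    for match in SOA_LINE.finditer(log_text):
        server = match.group("server")
        # Simplification: plain integer comparison, no serial wrap-around handling.
        if int(match.group("serial")) < master_serial and server not in lagging:
            lagging.append(server)
    return lagging


# SOA lines copied from the log above.
log = """\
SOA nevii.torproject.org. hostmaster.torproject.org. 2024061202 10800 3600 1814400 3601 from server 49.12.57.135 in 0 ms.
SOA nevii.torproject.org. hostmaster.torproject.org. 2024061201 10800 3600 1814400 3601 from server 194.58.198.32 in 27 ms.
SOA nevii.torproject.org. hostmaster.torproject.org. 2024061202 10800 3600 1814400 3601 from server 89.47.185.6 in 31 ms.
SOA nevii.torproject.org. hostmaster.torproject.org. 2024061202 10800 3600 1814400 3601 from server 204.8.99.145 in 131 ms.
"""

print(lagging_servers(log, 2024061202))
```

With the lines from this run, the only server still behind is the one reporting serial 2024061201.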
un-commenting -s in a curl call in dehydrated (!?) led me to understand that we passed some rate limit:
++ /srv/letsencrypt.torproject.org/bin/le-hook request_failure 429 '{ "type": "urn:ietf:params:acme:error:rateLimited", "detail": "Error creating new order :: too many certificates (5) already issued for this exact set of domains in the last 168 hours: bridges-email.torproject.org, retry after 2024-06-14T04:23:08Z: see https://letsencrypt.org/docs/duplicate-certificate-limit/", "status": 429}' post 'HTTP/2 429
so i guess we're stuck for this domain for two days, not sure wtf happened here.
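The "retry after" time can be pulled out of that error body mechanically. A minimal sketch, assuming the body is the ACME problem document shown in the trace above and that the timestamp keeps the same format in the "detail" string. The function name `retry_after` is made up for illustration.

```python
import json
import re
from datetime import datetime, timezone
from typing import Optional

RATE_LIMITED = "urn:ietf:params:acme:error:rateLimited"


def retry_after(body: str) -> Optional[datetime]:
    """Extract the 'retry after' time from a rateLimited problem document, if any."""
    problem = json.loads(body)
    if problem.get("type") != RATE_LIMITED:
        return None
    # Assumption: the timestamp appears in "detail" as in the error shown above.
    match = re.search(
        r"retry after (\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})Z",
        problem.get("detail", ""),
    )
    if match is None:
        return None
    parsed = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


# Error body copied from the le-hook trace above.
body = (
    '{ "type": "urn:ietf:params:acme:error:rateLimited", '
    '"detail": "Error creating new order :: too many certificates (5) already '
    "issued for this exact set of domains in the last 168 hours: "
    "bridges-email.torproject.org, retry after 2024-06-14T04:23:08Z: see "
    'https://letsencrypt.org/docs/duplicate-certificate-limit/", "status": 429}'
)

print(retry_after(body))
```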
right, so i reverted that and the forum-01 (which is another cert that was broken) went through, so that was probably not related to the issue at hand here...
still: could we get a better idea of where we're at with this project here? are there any puppet-issued certs actually in use, and should we start using puppet to issue certs?
as an aside, i must say i'm a little worried about the implementation here... i notice we still use the old le-hook mechanics from nevii and the whole rsync from nevii approach, i was hoping to have a cleaner, more puppetized design...
i'm also concerned about the error messages dehydrated was giving me on failures, and the ten certs it generated (and then failed to pull!) from LE during #41610... it seems we have pretty bad failure modes there, and debugging this basically involves tracing a bash script...
> still: could we get a better idea of where we're at with this project
> here? are there any puppet-issued certs actually in use, and should we
> start using puppet to issue certs?
I don't think there are any actually in use, but we totally should start
doing things here. I might be able to spend some time on that tomorrow.
> as an aside, i must say i'm a little worried about the implementation
> here... i notice we still use the old le-hook mechanics from nevii
> and the whole rsync from nevii approach, i was hoping to have a
> cleaner, more puppetized design...
We use the le-hook to rebuild the zones and wait for them to propagate.
We do not use the rsync from nevii approach here at all; certs ship via
puppetdb.
I modified our ssl::service to actually use and install the certificate we get via puppet's dehydrated::certificate if the dehydrated parameter to ssl::service is true.
I added dehydrated => true to our ssl::certificate resource for nagios.torproject.org.
Overall things went smoothly but I have identified the following issues we may want to address:
dehydrated::certificates creates RSA 2048 bit keys currently. We might want to switch to RSA 4096, P-256, or P-384. Opinions?
On the first puppet run with dehydrated set to true, we do not have the new certificate yet. Therefore, puppet did not put a certificate into /etc/ssl/torproject/certs/nagios.torproject.org.crt, and given that /etc/ssl/torproject/certs is a directory we clean up automatically (cf. our ssl class), the existing certificate was removed; apache was reloaded as the ssl::service for nagios has a notify to apache. Not a big deal in this case as apache handles certs+keys being gone while it's already running gracefully, but maybe not ideal in other cases? Any ideas on how to handle this? Should we care?
We do not correctly roll over the TLSA records for port 443 the way we do things right now. I.e. we switch the DNS record pretty much at the same time as we generate the new key and switch to certificates. If anyone actually used the DNS DANE stuff we would need to properly pre-announce the new key's fingerprint and only switch after a TTL. (or stop publishing beforehand.) I think this is a non-issue as I'm not aware of anyone really using DANE for https traffic. Does anyone disagree here?
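The pre-announce rollover described in the last point can be sketched as follows, under several assumptions: records use TLSA parameters "3 0 1" (DANE-EE, full certificate, SHA-256) so that only the standard library is needed, whereas pinning the key itself would use selector 1; the TTL is made up; and the certificate bytes are placeholders. The function names are hypothetical, and nothing here reflects how the zone is actually generated.

```python
import hashlib


def tlsa_record(name: str, port: int, ttl: int, cert_der: bytes) -> str:
    """Build a '3 0 1' TLSA record line from a DER-encoded certificate."""
    digest = hashlib.sha256(cert_der).hexdigest()
    return f"_{port}._tcp.{name}. {ttl} IN TLSA 3 0 1 {digest}"


def rollover_phases(name: str, port: int, ttl: int, old_der: bytes, new_der: bytes):
    """Return (description, records to publish) for each step of a rollover."""
    old = tlsa_record(name, port, ttl, old_der)
    new = tlsa_record(name, port, ttl, new_der)
    return [
        ("pre-announce: publish both records, keep serving the old cert", [old, new]),
        (f"wait at least {ttl}s so cached record sets expire", [old, new]),
        ("switch the server to the new cert, then drop the old record", [new]),
    ]


# Placeholder bytes stand in for real DER certificates; 3600 is an assumed TTL.
for step, records in rollover_phases(
    "nagios.torproject.org", 443, 3600, b"old-cert-der", b"new-cert-der"
):
    print(step)
    for record in records:
        print("   ", record)
```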
> I added dehydrated => true to our ssl::certificate resource for nagios.torproject.org.
> Overall things went smoothly but I have identified the following issues we may want to address:
> dehydrated::certificates creates RSA 2048 bit keys currently. We might want to switch to RSA 4096, P-256, or P-384. Opinions?
I'd go with RSA 4096 bits, not sure what compatibility is expected for
ECC out there, do we have numbers on that?
Maybe add that on top of the #32351 pile and keep the current
(letsencrypt-domains.git) settings for now, which is RSA 4k?
> On the first puppet run with dehydrated set to true, we do not have the new certificate yet. Therefore, puppet did not put a certificate into /etc/ssl/torproject/certs/nagios.torproject.org.crt, and given that /etc/ssl/torproject/certs is a directory we clean up automatically (cf. our ssl class), the existing certificate was removed; apache was reloaded as the ssl::service for nagios has a notify to apache. Not a big deal in this case as apache handles certs+keys being gone while it's already running gracefully, but maybe not ideal in other cases? Any ideas on how to handle this? Should we care?
This is a typical TLS bootstrap problem we've been having in the current
setup as well. Typically, the scenario is "add the vhost to puppet, aand
oops I forgot to do the letsencrypt-domains.git bit, now the catalog is
broken". But normally, we do get a clean, one-shot catalog run if we do
things right, right now.
So it's kind of unfortunate to have a "fail on first run" scenario.
Is this only on the transition from letsencrypt-domains.git to puppet?
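One possible way to avoid the "fail on first run" gap, not something settled in this thread, is to keep the existing certificate in place until the newly issued one is actually available. A minimal sketch of that decision only; the function name and the paths in the usage note are hypothetical, and the real logic would live in the Puppet ssl classes, not in Python.

```python
from pathlib import Path
from typing import Optional


def certificate_to_install(issued: Path, existing: Path) -> Optional[Path]:
    """Pick the certificate to deploy without ever leaving the service bare."""
    # Prefer the freshly issued certificate once it exists and is non-empty.
    if issued.is_file() and issued.stat().st_size > 0:
        return issued
    # Otherwise keep whatever is already deployed instead of purging it.
    if existing.is_file():
        return existing
    # Nothing available yet: the caller has to skip the TLS vhost for this run.
    return None
```

On a first run, `certificate_to_install(Path("issued/nagios.crt"), Path("current/nagios.crt"))` would return the current file if only that one exists, and the issued one on later runs.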
> We do not correctly roll over the TLSA records for port 443 the way we do things right now. I.e. we switch the DNS record pretty much at the same time as we generate the new key and switch to certificates. If anyone actually used the DNS DANE stuff we would need to properly pre-announce the new key's fingerprint and only switch after a TTL. (or stop publishing beforehand.) I think this is a non-issue as I'm not aware of anyone really using DANE for https traffic. Does anyone disagree here?
Agreed, I doubt DANE + HTTPS is a thing at all.
...
On 2024-06-14 09:23:19, weasel wrote:
--
Antoine Beaupré
torproject.org system administration