I spent some time this afternoon with cda, doing Tor handshakes from inside Iran. The handshake completed, but the TCP connection was then cut when the SSL cert had a lifetime of 365 days.
When I changed the 365 to 65 in or.h, on the bridge, the TCP connection survived.
(But that wasn't sufficient, since for some reason the directory request wasn't getting through, or the response wasn't getting through.)
In any case, we should take steps to randomize our SSL link cert lifetime.
This is the follow-on ticket to #4014 (which we knew we'd need to do one day, and this is the day).
(I picked 0.2.4 as the target milestone, rather than 0.2.3, since this fix doesn't fully solve the problem. Once we have sufficient fixes, we can re-assess.)
Incidentally, somebody with new enough crypto libs on both sides should check to see how our new ECC link handshake (added in 0.2.4.8-alpha) fares here. I believe I don't have new enough crypto on my side, and I don't know whether cda does.
I did a little spot-checking to see whether it was more usual for certificates to have notBefore/notAfter times an exact duration apart, or to have them at more or less random intervals.
I checked 4 or 5 well-known websites and found that their certificates in the wild are all over the map. More research could be warranted.
where the notBefore time is 121017000000Z (that is, 2012-10-17 00:00:00 GMT) and the notAfter time is 131018235959Z (that is, 2013-10-18 23:59:59 GMT).
Those are the intervals I got for amazon. I found other stuff too. We could stand to do a little more spot checking before we settle on 'random' IMO. Nearly nobody has a 1-day lifetime AFAICT.
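For anyone repeating the spot check, here is a minimal sketch of turning the two UTCTime strings quoted above into a lifetime. The helper name `parse_utctime` is made up for illustration, and Python's `%y` uses a different two-digit-year pivot than X.509 does, so this is only meant for recent certificates like the one above.

```python
from datetime import datetime, timezone

def parse_utctime(s):
    """Parse an ASN.1 UTCTime string of the form YYMMDDHHMMSSZ.

    Caveat: strptime's %y maps 69-99 to 19xx, while X.509 puts the
    pivot at 50, so years 50-68 would be read differently here.
    """
    return datetime.strptime(s, "%y%m%d%H%M%SZ").replace(tzinfo=timezone.utc)

# The two values quoted in this ticket for amazon.
not_before = parse_utctime("121017000000Z")
not_after = parse_utctime("131018235959Z")

lifetime = not_after - not_before
print(lifetime.days, "days plus", lifetime.seconds, "seconds")
```

That comes out to one second short of 367 days, so even this "one-year" cert is not an exact 365-day interval.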
Is it guaranteed that we'll get a new link certificate at least daily? If not, the "one day" minimum lifetime is too short.
The patch looks okay otherwise, but it needs a patch to tor.1.txt to accompany it.
Cleared the torrc changes from this afternoon and confirmed Tor v0.2.2.35 would not progress past:
{{{
Mar 10 04:11:23.195 [notice] Bootstrapped 10%: Finishing handshake with directory server.
}}}
Added your bridge and would stall at:
{{{
Mar 10 04:15:15.180 [notice] Bootstrapped 50%: Loading relay descriptors.
Mar 10 04:15:22.186 [notice] new bridge descriptor 'bridge2' (fresh)
Mar 10 04:15:22.186 [notice] I learned some more directory information, but not enough to build a circuit: We have no network-status consensus.
Mar 10 04:16:36.103 [notice] I learned some more directory information, but not enough to build a circuit: We have no network-status consensus.
Mar 10 04:16:36.698 [notice] I learned some more directory information, but not enough to build a circuit: We have only 0/3169 usable descriptors.
}}}
Updated to v0.2.4.10-alpha, was able to open a circuit.
Removed bridge, cleared /var/lib/tor; could not progress past 10%.
One year seems to be the most popular lifetime. Simply dropping such certificates would cause substantial collateral damage, so there is probably something else that we are missing so far.
Out of all 1,533,359 certificates, 497,650 (~32%) have a lifetime which does not end in 0:00:00 or 23:59:59. A couple thousand are close to these values, but most of the 32% are all over the place. These could be called "random lifetimes".
The above is just an unbiased view of all certificates. We should also consider web sites that are well-known and important to Iran, as nickm did above.
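To make the "random lifetime" classification above concrete, here is a rough sketch of one way to label a validity period by the sub-day remainder of its duration. This is an assumption about how such a count could be done, not the script that produced the numbers above; `lifetime_kind` and its labels are made up.

```python
from datetime import datetime

DAY = 86400  # seconds

def lifetime_kind(not_before, not_after):
    """Label a validity period by what its duration ends in."""
    remainder = int((not_after - not_before).total_seconds()) % DAY
    if remainder == 0:
        return "ends in 0:00:00"
    if remainder == DAY - 1:
        return "ends in 23:59:59"
    return "random"

# The amazon cert quoted earlier in this ticket.
amazon = lifetime_kind(datetime(2012, 10, 17, 0, 0, 0),
                       datetime(2013, 10, 18, 23, 59, 59))

# Share of certificates whose lifetime falls in the "random" bucket,
# using the counts reported above.
random_share = 497_650 / 1_533_359
print(amazon, round(random_share * 100))
```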
phw -- while you're at it, what's the distribution on time of day at which certs start and end? I bet that a large number start or end at the start or end of a day, and a large number start or end at the start or end of an hour.
would be wise, and sufficient to get rid of my "gosh, your cert was born within the past 2 hours" worry. It's sort of a hack though -- it makes your cert valid for 1 to 365 days in the future, and 0 to that-previous-number days in the past.
I think that would combine nicely with your previous patch: determine the lifetime, then determine how far into the lifetime we are. But see also phw's and my comments above.
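A minimal sketch of the "lifetime first, then position within it" idea, in Python rather than Tor's C and not taken from the patch. The function name, the 365-day cap as a constant, and the choice to always leave some validity remaining are all assumptions made for illustration.

```python
import random

DAY = 86400              # seconds
MAX_LIFETIME_DAYS = 365  # upper bound discussed in this thread

def pick_validity(now, rng=random):
    """Return (not_before, not_after) in epoch seconds.

    Step 1: pick the total lifetime, a whole number of days.
    Step 2: pick how far into that lifetime we already are.
    """
    lifetime = rng.randint(1, MAX_LIFETIME_DAYS) * DAY
    # randrange excludes the upper bound, so not_after is always
    # strictly in the future (an assumption, not from the patch).
    elapsed = rng.randrange(lifetime)
    not_before = now - elapsed
    return not_before, not_before + lifetime

nb, na = pick_validity(1362888683, random.Random(0))
print(nb, na, (na - nb) // DAY)
```

Note that this version does nothing about time of day, so the start and end times land at arbitrary seconds.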
> phw -- while you're at it, what's the distribution on time of day at which certs start and end? I bet that a large number start or end at the start or end of a day, and a large number start or end at the start or end of an hour.
When ignoring year/month/day, 42% of all certificates start at 00:00:00 and end at 23:59:59.
42% of all certificates end at x:59:59. There's only a negligible number of end times other than x:59:59. Pretty much the same applies to the start times, except that the time is x:00:00, of course.
So it looks like your first guess is true. Starting and ending around midnight is very popular. Your second guess does not seem to be true, though. The number of certificates starting or ending around the start/end of an hour (+/- 1 second) other than midnight is < 0.6%.
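For reference, a sketch of how the time-of-day tally above could be reproduced. It is not the script used for these numbers: `boundary_kind` and `tally` are hypothetical names, it matches boundaries exactly instead of allowing +/- 1 second, and the last two sample values are made up.

```python
from collections import Counter
from datetime import datetime

DAY_EDGES = {(0, 0, 0), (23, 59, 59)}
HOUR_EDGES = {(0, 0), (59, 59)}

def boundary_kind(t):
    """Classify a notBefore/notAfter time by its time of day."""
    if (t.hour, t.minute, t.second) in DAY_EDGES:
        return "day"
    if (t.minute, t.second) in HOUR_EDGES:
        return "hour"   # on an hour boundary other than midnight
    return "other"

def tally(times):
    return Counter(boundary_kind(t) for t in times)

sample = [
    datetime(2012, 10, 17, 0, 0, 0),     # amazon notBefore from this ticket
    datetime(2013, 10, 18, 23, 59, 59),  # amazon notAfter from this ticket
    datetime(2013, 3, 10, 14, 0, 0),     # made-up value
    datetime(2013, 3, 10, 4, 11, 23),    # made-up value
]
counts = tally(sample)
print(counts)
```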
Ok, I added a man page entry, made us start part-way through the lifetime period, clipped defaults to start and end on day boundaries, and flipped a coin to decide if we end at 23:59:59 or at midnight.
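Roughly, the steps described above could look like the following. This is a Python sketch under my own assumptions, not the C patch itself: the function name, the parameter names, and the exact order of the steps are invented for illustration.

```python
import random

DAY = 86400  # seconds

def pick_clipped_validity(now, max_days=365, rng=random):
    """Return (not_before, not_after) in epoch seconds, on day boundaries."""
    lifetime_days = rng.randint(1, max_days)
    # Start part-way through the lifetime: whole days already used up.
    days_elapsed = rng.randrange(lifetime_days)
    # Clip to the most recent 00:00:00 UTC.
    today = now - now % DAY
    not_before = today - days_elapsed * DAY
    not_after = not_before + lifetime_days * DAY
    # Coin flip: end at midnight, or one second earlier at 23:59:59.
    if rng.random() < 0.5:
        not_after -= 1
    return not_before, not_after

nb, na = pick_clipped_validity(1362888683, rng=random.Random(0))
print(nb % DAY, na % DAY)
```

With this ordering, notBefore is always at 00:00:00 UTC and notAfter is at either 00:00:00 or 23:59:59 UTC, at least one day after the clipped start.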
(It seems we're screwed either way here, if the new firewall strategy is to look for a collection of properties. By sticking to the day boundary we're blending in better but still reducing our entropy. By not sticking to the day boundary we blend in worse, but at first glance we're harder to fingerprint. The trouble is that the new fingerprint should be "X, Y, and also doesn't use a day boundary". This is a good example of why playing the "look like ssl" arms race is unwinnable.)
In addition to the order of operations issue here, are we sure it actually produces the result we expect? Did you look at these certificates and make sure they come out right? I guess they must, since leap-seconds aren't included in time_t if I understand correctly.
Other than that, it looks okay to me. If it tests out okay, I say we at least merge into 0.2.4.
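On the leap-second question: POSIX time counts every day as exactly 86400 seconds, so rounding an epoch value down to a multiple of 86400 should land on 00:00:00 UTC. A quick way to check that, sketched in Python with made-up helper names (`clip_to_day`, `utc_hms`), is:

```python
import time

DAY = 86400  # seconds in a POSIX day; leap seconds are not counted

def clip_to_day(t):
    """Round an epoch timestamp down to the start of its UTC day."""
    return t - t % DAY

def utc_hms(t):
    tm = time.gmtime(t)
    return (tm.tm_hour, tm.tm_min, tm.tm_sec)

now = int(time.time())
start = clip_to_day(now)
assert utc_hms(start) == (0, 0, 0)
assert utc_hms(start + DAY - 1) == (23, 59, 59)
```

This only checks the arithmetic. It is no substitute for dumping the certificates that the patched Tor actually generates and reading their notBefore/notAfter fields.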