Raw import from Trac using Trac markup language. Authored by Alexander Hansen Færøy.
[Nick is writing this document as a summary of Tor's TLS annoyances, for the consumption of people who are working on the TLS protocol and would like to help Tor. It's a draft; I don't promise that it's accurate.]
For a few reasons, TLS as it stands is a difficult protocol for Tor to use.
== REASON 1: IT IS HARD TO LOOK LIKE OTHER TLS INSTANCES FROM THE POV OF PASSIVE ATTACKS ==
The first is that TLS exists in a stupendously vast number of
externally visible profiles. If you're running a protocol that an
external censor would like to detect and block, it is not enough to
simply use your TLS library in a sensible way. Instead, you need to
pick a popular application that the censor doesn't want to block, and
your clients need to emulate its clients, and your servers need to
emulate its servers.
I'll go through the history of the stupid ways that Tor has tried to
use TLS over the years. This will read like a comedy of errors;
please don't judge our missteps too harshly.
=== Stage 1: (Tor versions 0.0.1 through 0.2.0.19) ===
What we would really
like to do, if we didn't need to worry about censorship, is to have a
profile where the client declared a list of ciphersuites using strong
ciphers and some kind of ephemeral key agreement mechanism. The server
should always send a two-certificate chain, with a longer-term server
identity key signing a shorter-term link key. The link key should
change on the order of hours to days. The client should either send
no certificates, or a similar two-certificate chain, depending on
whether it itself is a Tor server or not. We know that our data is
all encrypted, and therefore uncompressible, so we don't want to use
compression. We have no need for vhost support.
In our X.509 certificates, we really don't need to use any fields
besides the public key, the serial number (for correctness), and the
validAfter/validUntil fields.
In our earliest versions, we just set the cipher list to what we
wanted to use, and disabled the extensions we weren't using. We did
ephemeral key agreement using the 1024-bit DH group specified in
RFC 2409, section 6.2.
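For concreteness, the group in question is the well-known 1024-bit MODP prime from RFC 2409, section 6.2, with generator 2. A toy key agreement over that group (illustrative only; real implementations used OpenSSL's DH code) looks like this:

```python
import secrets

# The 1024-bit MODP prime from RFC 2409, section 6.2 ("Second Oakley
# Group"), with generator g = 2.  This is the group Tor originally used
# for ephemeral key agreement.
P_HEX = (
    "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E088A67CC74"
    "020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B302B0A6DF25F1437"
    "4FE1356D6D51C245E485B576625E7EC6F44C42E9A637ED6B0BFF5CB6F406B7ED"
    "EE386BFB5A899FA5AE9F24117C4B1FE649286651ECE65381FFFFFFFFFFFFFFFF"
)
P = int(P_HEX, 16)
G = 2

def dh_keypair():
    """Generate an ephemeral private exponent and the matching public value."""
    priv = secrets.randbelow(P - 2) + 1
    return priv, pow(G, priv, P)

def dh_shared(priv, peer_pub):
    """Compute the shared secret from our private key and the peer's public value."""
    return pow(peer_pub, priv, P)
```

Both sides derive the same secret, since (g^a)^b = (g^b)^a mod p. The point for fingerprinting is that these exact group parameters were visible on the wire.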
We made some concessions to avoiding fingerprinting during this phase. For example, we removed some fixed strings from our certificates' DNs.
=== Stage 2: (Tor 0.2.0.20 through 0.2.3.6) ===
Our particular use of two-certificate chains, our unusual cipher list,
and our funny-looking certs made Tor pretty easy to profile.
So we switched to an approach where we would begin by sending a list
of ciphers hacked to match the list sent by Firefox, a server would
reply with a single certificate, and then the client would immediately
renegotiate. Upon the renegotiation, the server and client would pick
the ciphersuites they really wanted to use, and exchange their real
certificates.
To avoid the most trivial certificate-based attacks, we began
generating bogus domain names (e.g., www.asjdioyzmohodf.net) and
sticking them in the commonName part of the certificates.
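The hostname generation amounts to picking a random lowercase label and wrapping it in a plausible-looking domain. This sketch captures the idea; the exact label lengths and TLD choices here are illustrative, not Tor's precise algorithm:

```python
import secrets
import string

def bogus_hostname():
    """Generate a random, plausible-looking hostname of the form
    www.<random>.<tld>, e.g. www.asjdioyzmohodf.net.  Label lengths
    and the TLD list are illustrative choices, not Tor's exact ones."""
    length = 8 + secrets.randbelow(12)  # a random label of 8..19 letters
    label = "".join(secrets.choice(string.ascii_lowercase) for _ in range(length))
    tld = secrets.choice(["com", "net", "org"])
    return "www.%s.%s" % (label, tld)
```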
Because we still needed to support old clients, we had the server
detect whether the client was sending a new-style cipher list, and if
so, reply using a single cert and expect renegotiation. If the client
sent the old cipher list, we'd have to send a two-certificate chain and not
expect renegotiation.
In retrospect, this approach was ill-conceived for a few reasons.
Having the server send different certificates depending on the
contents of the ClientHello was hard to implement. (To match Firefox,
we also needed to feign support for the bogus nonstandard 0xfeff
SSL3_RSA_FIPS_WITH_3DES_EDE_CBC_SHA ciphersuite, which was hard to do.)
Furthermore, it is trivial to detect a renegotiation by looking at
record headers [*], and our renegotiations were apparently unusual
enough to provoke interest.
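To see why this detection is trivial: TLS record headers (one byte of content type, two of version, two of length) are sent in the clear, so in TLS 1.2 and earlier a passive observer can spot a handshake-type record (content type 22) appearing after application data (content type 23). A minimal detector along those lines:

```python
import struct

# TLS record content types (TLS 1.0 through 1.2); headers are cleartext.
HANDSHAKE = 22
APPLICATION_DATA = 23

def parse_records(stream: bytes):
    """Yield (content_type, version, length) for each TLS record header
    in a raw byte stream."""
    offset = 0
    while offset + 5 <= len(stream):
        ctype, version, length = struct.unpack_from("!BHH", stream, offset)
        yield ctype, version, length
        offset += 5 + length

def looks_like_renegotiation(stream: bytes) -> bool:
    """A handshake-type record after application data indicates a
    renegotiation, even though the renegotiation itself is encrypted."""
    seen_appdata = False
    for ctype, _version, _length in parse_records(stream):
        if ctype == APPLICATION_DATA:
            seen_appdata = True
        elif ctype == HANDSHAKE and seen_appdata:
            return True
    return False
```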
When Marsh Ray's attack against renegotiation came out in 2009, and everybody who could possibly turn
off renegotiation did so, our use of renegotiation stood out even
more, especially since we had to keep doing it even when built with
versions of OpenSSL that didn't support RFC 5746. [**]
In 2011, Iran blocked Tor based on our choice of Diffie-Hellman
parameters. In version 0.2.2.22-alpha, we switched to copy the fixed
DH parameters from Apache's mod_ssl, since we figured those were
probably in more popular use.
[*] At this point, we hadn't actually learned very much about TLS
internals: we were treating TLS as an idealized black-box encrypted
transport. Obviously, this was a mistake on our part. We honestly
hadn't imagined that the protocol would send record types and sizes in
the clear, since that's so obviously an unnecessary information leak.
When we found out about the renegotiation issues, when the TLS
renegotiation MITM bug hit in 2009, and when the BEAST attack hit, we
realized we needed to know a lot more about TLS internals to actually
use it effectively.
[**] Doing so is safe for us:
https://lists.torproject.org/pipermail/tor-talk/2009-November/015864.html
Protocol details at
https://gitweb.torproject.org/torspec.git/tree/proposals/130-v2-conn-protocol.txt
=== Stage 3: (Tor 0.2.3.6 to present) ===
When we started getting detected and blocked based on our use of
renegotiation, we switched to an improved handshake, where the outer
TLS handshake is meant to look as simple as possible (one cert of
arbitrary type, no renegotiation, any ciphersuite with ephemeral keys
is allowed). To get the authentication we need, we do a completely
ugly hack and authenticate _within_ the Tor protocol, by exchanging a
signature of a MAC of some of the TLS connection parameters, and
passing the certs we actually want. This approach allows us to have
our original TLS handshake contain (nearly) anything we want it to,
and still get the authentication we would actually prefer.
This duplicates TLS functionality, and we wouldn't need to be doing it
if we didn't need to blend in with other TLS handshakes.
Protocol details at
https://gitweb.torproject.org/torspec.git/tree/proposals/176-revising-handshake.txt
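The shape of that in-protocol authentication can be sketched as follows. The field names here are illustrative, and an HMAC stands in for the digest-then-sign step (the real protocol, per proposal 176, signs a digest of the handshake material with the party's identity key); the point is that the proof is bound to the parameters of this particular TLS connection:

```python
import hashlib
import hmac
import os

def link_auth_digest(client_random, server_random, server_link_cert, key):
    """Sketch of authenticating inside the tunneled protocol: MAC the
    TLS connection parameters so the proof cannot be replayed on any
    other connection.  Field names are illustrative; in the real
    protocol the digest is signed with a long-term identity key."""
    transcript = b"link-auth" + client_random + server_random + server_link_cert
    return hmac.new(key, transcript, hashlib.sha256).digest()
```

The verifier recomputes the digest over its own view of the connection and compares in constant time; any mismatch in the TLS parameters makes the proof fail.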
=== Stage 4: ===
There are more features that are getting used to distinguish Tor from
other TLS traffic. They include:
 * Our cipher list matches that of a somewhat out-of-date Firefox.
* We don't send the same extensions in the same order as NSS does.
* Although our protocol provides support for arbitrary certificates, the software doesn't yet allow you to drop in (say) a CA-signed certificate to present on the wire.
We're hoping to address these in 0.2.4.x.
== REASON 2: IT IS HARD TO RESIST ACTIVE PROBING ==
Most censorious attackers are passive, and seem to be using modified
commercial software to detect and block Tor connections. On the other
hand, the most sophisticated attacker (i.e., the Great Firewall) is
using a hybrid passive/active approach where they notice suspicious
TLS connections and then launch connections themselves to see whether
there's a Tor node on the other side.
To resist this, we want to move to a circumstance where the client
needs to know some secret or other kind of authenticator in order to
complete the handshake ... or to even confirm that it isn't talking to
a regular webserver. The presentation of this information can't be
visible to a passive observer.
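One way to frame the requirement: the client demonstrates knowledge of a per-bridge secret via a keyed MAC over a nonce, and a server that doesn't see a valid proof behaves exactly like an ordinary webserver. This is a sketch of the property we want, not any deployed Tor mechanism; the names and framing are illustrative:

```python
import hashlib
import hmac
import os

def client_proof(bridge_secret: bytes, nonce: bytes) -> bytes:
    """Client proves knowledge of a per-bridge secret without revealing
    it; without the secret, an active prober cannot produce this value,
    and a passive observer sees only an unpredictable blob."""
    return hmac.new(bridge_secret, b"probe-resist" + nonce, hashlib.sha256).digest()

def server_accepts(bridge_secret: bytes, nonce: bytes, proof: bytes) -> bool:
    """On failure, a probe-resistant server would fall back to acting
    like a regular webserver rather than revealing itself."""
    expected = client_proof(bridge_secret, nonce)
    return hmac.compare_digest(expected, proof)
```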
== SOME OBVIOUS AND NOT-SO-OBVIOUS POINTS: ==
There are, as near as I can tell, two main classes of censorious
adversary: the GFW, and everybody else. Everybody else seems either
to rely on commercial censorware, or on home-brewed censorware that's
no better than the commercial censorware.
The GFW maintainers seem to genuinely want to block Tor whenever it
gets too easy to use to circumvent the firewall, or whenever there's a
big politically significant event coming up.
The commercial censorware providers, on the other hand, seem to only
care about blocking Tor when their customers complain, and only to the
minimal extent necessary. In many cases, they have chosen the
simplest-to-change protocol signature that they could possibly have
chosen. This pattern of decisions might mean that their engineers
disapprove of blocking Tor. More likely, it means that they are
keeping their customers on an upgrade treadmill, to ensure that no
single version of the software will block next year's version of Tor.
Even likelier than that, their engineers are just being lazy, and
adopting the simplest solution they can find so they can close the
"Tor works" bug and get on to their next task.
We have had remarkable luck adding a trivial superencipherment layer
to the TLS stream. Even though this layer could be trivially removed
by any adversary who knows what to look for, and has an AES
implementation, it's still working okay.
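The superencipherment layer is conceptually just XORing the TLS byte stream with a keystream, which hides record headers from naive matching while being trivially reversible by anyone with the key. Since Python's standard library has no AES, this sketch derives the keystream from SHA-256 in counter mode as a stand-in; the real layer would use AES:

```python
import hashlib

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Superencipher `data` by XORing it with a keystream.  SHA-256 in
    counter mode stands in for AES-CTR here (the stdlib has no AES).
    The operation is symmetric: applying it twice recovers the input."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(b ^ k for b, k in zip(data, stream))
```

After wrapping, the cleartext TLS record headers no longer appear on the wire, which defeats signature matching but not an adversary willing to strip the layer.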
In Tor, every client starts out (or could be made to start out)
knowing the public key of the server it wants to connect to. This
might be good for something.
We know we're in an arms race. If we get rid of all the protocol
signatures expressible in terms of looking for bytes on the
connection, we'll still have to care about TCP packet boundaries and
timing attacks.
In the long run, to avoid certain performance problems, we are
considering moving server<->server communications to a UDP-based
protocol, and using TLS-over-TCP only for client<->server
communications, and in cases where it's necessary for clients. Any
such change is, however, a number of years out.