Gaba · 09d55581
--- a/projects/Tor/TLSHistory.md
+++ b/projects/Tor/TLSHistory.md
+[Nick is writing this document as a summary of Tor's TLS annoyances, for the consumption of people who are working on the TLS protocol and would like to help Tor. It's a draft; I don't promise that it's accurate.]
+For a few reasons, TLS as it stands is a difficult protocol for Tor to use.
+## REASON 1: IT IS HARD TO LOOK LIKE OTHER TLS INSTANCES FROM THE POV OF PASSIVE ATTACKS
+The first is that TLS exists in a stupendously vast number of
+externally visible profiles.  If you're running a protocol that an
+external censor would like to detect and block, it is not enough to
+simply use your TLS library in a sensible way.  Instead, you need to
+pick a popular application that the censor doesn't want to block, and
+your clients need to emulate its clients, and your servers need to
+emulate its servers.
+I'll go through the history of the stupid ways that Tor has tried to
+use TLS over the years.  This will read like a comedy of errors;
+please don't judge our missteps too harshly.
+### Stage 1: (Tor versions 0.0.1 through 0.2.0.19)
+What we would really
+like to do, if we didn't need to worry about censorship, is to have a
+profile where the client declared a list of ciphersuites using strong
+ciphers and some kind of ephemeral key agreement mechanism. The server
+should always send a two-certificate chain, with a longer-term server
+identity key signing a shorter-term link key.  The link key should
+change on the order of hours to days.  The client should either send
+no certificates, or a similar two-certificate chain, depending on
+whether it itself is a Tor server or not.  We know that our data is
+all encrypted, and therefore uncompressible, so we don't want to use
+compression.  We have no need for vhost support.
+In our X.509 certificates, we really don't need to use any fields
+besides the public key, the serial number (for correctness), and the
+validAfter/validUntil fields.
+In our earliest versions, we just set the cipher list to what we
+wanted to use, and disabled the extensions we weren't using.  We did
+ephemeral key agreement using the 1024-bit DH group specified in
+RFC 2409, section 6.2.
+We made some concessions to avoiding fingerprinting during this phase. For example, we removed some fixed strings from our certificates' DNs.
+### Stage 2: (Tor 0.2.0.20 through 0.2.3.6)
+Our particular use of two-certificate chains, our unusual cipher list,
+and our funny-looking certs made Tor pretty easy to profile.
+So we switched to an approach where we would begin by sending a list
+of ciphers hacked to match the list sent by Firefox, a server would
+reply with a single certificate, and then the client would immediately
+renegotiate.  Upon the renegotiation, the server and client would pick
+the ciphersuites they really wanted to use, and exchange their real
+certificates.
+To avoid the most trivial certificate-based attacks, we began
+generating bogus domain names (e.g., www.asjdioyzmohodf.net) and
+sticking them in the commonName part of the certificates.
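A minimal sketch of generating a bogus hostname of the kind described above. The function name, the length range, and the fixed "www."/".net" parts are assumptions made for illustration; they are not taken from the Tor source.

```python
import secrets
import string


def random_hostname(min_len=8, max_len=20, tld="net"):
    """Return a hostname shaped like 'www.<random lowercase letters>.<tld>'.

    The length bounds and TLD are illustrative assumptions.
    """
    # Pick a label length uniformly in [min_len, max_len].
    length = min_len + secrets.randbelow(max_len - min_len + 1)
    label = "".join(secrets.choice(string.ascii_lowercase) for _ in range(length))
    return f"www.{label}.{tld}"


if __name__ == "__main__":
    print(random_hostname())
```

Names like this only defeat the most trivial matching on fixed strings; a uniformly random label is itself a pattern that a censor could learn to recognize.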
+Because we still needed to support old clients, we had the server
+detect whether the client was sending a new-style cipher list, and if
+so, reply using a single cert and expect renegotiation.  If the client
+sent the old cipher list, we'd have to send a two-certificate chain and not
+expect renegotiation.
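The server-side decision can be sketched roughly as follows. The names here are hypothetical, the ciphersuite IDs are placeholders rather than the real old-client list, and the rule "any suite outside the old list means a new-style client" is an assumption about how the detection might work.

```python
from enum import Enum


class HandshakeMode(Enum):
    TWO_CERT_CHAIN = 1      # old clients: send identity + link certs, no renegotiation
    SINGLE_CERT_RENEG = 2   # new clients: send one cert, then expect renegotiation


# Placeholder ciphersuite IDs standing in for the fixed list old clients sent.
OLD_CLIENT_CIPHERS = frozenset({0x0039, 0x0033, 0x0016, 0x0013})


def choose_mode(client_ciphers):
    """Pick a handshake mode from the ciphersuite IDs in the ClientHello."""
    if all(suite in OLD_CLIENT_CIPHERS for suite in client_ciphers):
        return HandshakeMode.TWO_CERT_CHAIN
    return HandshakeMode.SINGLE_CERT_RENEG


if __name__ == "__main__":
    print(choose_mode([0x0039, 0x0033]))   # only old-list suites
    print(choose_mode([0x0039, 0xC00A]))   # contains a suite outside the old list
```

The awkward part in practice is that the certificate choice has to be made after the ClientHello has been parsed but before the server's certificates go out.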
+In retrospect, this approach was ill-conceived for a few reasons.
+Having the server send different certificates depending on the
+contents of the ClientHello was hard to implement.  (To match Firefox,
+we also needed to feign support for the bogus nonstandard 0xfeff
+SSL3_RSA_FIPS_WITH_3DES_EDE_CBC_SHA, which was hard to do.)
+Furthermore, it is trivial to detect a renegotiation by looking at
+record headers [*], and our renegotiations were apparently unusual
+enough to provoke interest.
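To illustrate how little work this takes, here is a rough sketch of a passive check that reads only the cleartext 5-byte record headers (content type, version, length) from one direction of a connection. The heuristic is an assumption for illustration: after the first ChangeCipherSpec, one handshake record (the encrypted Finished) is expected, so any further handshake-type records suggest a second handshake. It is not a description of what any real censor does.

```python
import struct

CHANGE_CIPHER_SPEC = 20  # TLS record content types
HANDSHAKE = 22


def parse_record_headers(stream: bytes):
    """Yield (content_type, length) for each complete TLS record in `stream`."""
    offset = 0
    while offset + 5 <= len(stream):
        content_type, _version, length = struct.unpack_from("!BHH", stream, offset)
        if offset + 5 + length > len(stream):
            break  # truncated record; stop parsing
        yield content_type, length
        offset += 5 + length


def looks_like_renegotiation(stream: bytes) -> bool:
    """Heuristic: more than one handshake record after the first ChangeCipherSpec."""
    seen_ccs = False
    handshake_after_ccs = 0
    for content_type, _length in parse_record_headers(stream):
        if content_type == CHANGE_CIPHER_SPEC:
            seen_ccs = True
        elif content_type == HANDSHAKE and seen_ccs:
            handshake_after_ccs += 1
    return handshake_after_ccs > 1
```

Nothing here needs any key material: the record type and length are visible to anyone on the path, even though the handshake messages of the renegotiation are themselves encrypted.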
+When Marsh Ray's attack against renegotiation came out in 2009, and everybody who could possibly turn
+off renegotiation did so, our use of renegotiation stood out even
+more, especially since we had to keep doing it even when built with
+versions of OpenSSL that didn't support RFC 5746.  [**]
+In 2011, Iran blocked Tor based on our choice of Diffie-Hellman
+parameters.  In version 0.2.2.22-alpha, we switched to copying the fixed
+DH parameters from Apache's mod_ssl, since we figured those were
+probably in more popular use.
+[*] At this point, we hadn't actually learned very much about TLS
+internals: we were treating TLS as an idealized black-box encrypted
+transport.  Obviously, this was a mistake on our part. We honestly
+hadn't imagined that the protocol would send record types and sizes in
+the clear, since that's so obviously an unnecessary information leak.
+When we found out about the renegotiation issues, when the TLS
+renegotiation MITM bug hit in 2009, and when the BEAST attack hit, we
+realized we needed to know a lot more about TLS internals to actually
+use it effectively.
+[**] Doing so is safe for us:
+https://lists.torproject.org/pipermail/tor-talk/2009-November/015864.html
+Protocol details at
+https://gitweb.torproject.org/torspec.git/tree/proposals/130-v2-conn-protocol.txt
+### Stage 3: (Tor 0.2.3.6 to present)
+When we started getting detected and blocked based on our use of
+renegotiation, we switched to an improved handshake, where the outer
+TLS handshake is meant to look as simple as possible (one cert of
+arbitrary type, no renegotiation, any ciphersuite with ephemeral keys
+is allowed).  To get the authentication we need, we do a completely
+ugly hack and authenticate _within_ the Tor protocol, by exchanging a
+signature of a MAC of some of the TLS connection parameters, and
+passing the certs we actually want.  This approach allows us to have
+our original TLS handshake contain (nearly) anything we want it to,
+and still get the authentication we would actually prefer.
+This duplicates TLS functionality, and we wouldn't need to be doing it
+if we didn't need to blend in with other TLS handshakes.
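The general shape of "sign a MAC of some TLS connection parameters" can be sketched as below. This is not the real wire format: the label string, the choice of HMAC-SHA256, the use of a TLS session secret as the MAC key, and the particular parameters covered are all assumptions for illustration, and `sign`/`verify` are caller-supplied stand-ins for whatever identity-key signature scheme is in use.

```python
import hashlib
import hmac

# Hypothetical label; not the string used by the real protocol.
LABEL = b"example in-protocol authentication"


def connection_mac(session_secret: bytes, client_random: bytes, server_random: bytes) -> bytes:
    """MAC that binds the authentication to this particular TLS connection."""
    message = client_random + server_random + LABEL
    return hmac.new(session_secret, message, hashlib.sha256).digest()


def make_authenticator(sign, session_secret, client_random, server_random) -> bytes:
    """Sign the connection MAC; `sign` maps bytes to a signature."""
    return sign(connection_mac(session_secret, client_random, server_random))


def check_authenticator(verify, signature, session_secret, client_random, server_random) -> bool:
    """Recompute the MAC locally and check the peer's signature over it.

    `verify(message, signature)` returns True if the signature is valid.
    """
    expected = connection_mac(session_secret, client_random, server_random)
    return verify(expected, signature)
```

Because the MAC covers values specific to the TLS session, a signature captured on one connection should not verify on another, which is what lets the outer handshake carry an arbitrary certificate.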
+Protocol details at
+https://gitweb.torproject.org/torspec.git/tree/proposals/176-revising-handshake.txt
+### Stage 4:
+There are more features that are getting used to distinguish Tor from
+other TLS traffic.  They include:
+ * Our cipher list matches that of a somewhat out-of-date Firefox.
+ * We don't send the same extensions in the same order as NSS does.
+ * Although our protocol provides support for arbitrary certificates, the software doesn't yet allow you to drop in (say) a CA-signed certificate to present on the wire.
+We're hoping to address these in 0.2.4.x.
+## REASON 2: IT IS HARD TO RESIST ACTIVE PROBING
+Most censorious attackers are passive, and seem to be using modified
+commercial software to detect and block Tor connections.  On the other
+hand, the most sophisticated attacker (i.e., the Great Firewall) is
+using a hybrid passive/active approach where they notice suspicious
+TLS connections and then launch connections themselves to see whether
+there's a Tor node on the other side.
+To resist this, we want to move to a circumstance where the client
+needs to know some secret or other kind of authenticator in order to
+complete the handshake ... or to even confirm that it isn't talking to
+a regular webserver.  The presentation of this information can't be
+visible to a passive observer.
+## SOME OBVIOUS AND NOT-SO-OBVIOUS POINTS:
+There are, as near as I can tell, two main classes of censorious
+adversary: the GFW, and everybody else.  Everybody else seems either
+to rely on commercial censorware, or on home-brewed censorware that's
+no better than the commercial censorware.
+The GFW maintainers seem to genuinely want to block Tor whenever it
+gets too easy to use to circumvent the firewall, or whenever there's a
+big politically significant event coming up.
+The commercial censorware providers, on the other hand, seem to only
+care about blocking Tor when their customers complain, and only to the
+minimal extent necessary.  In many cases, they have chosen the
+simplest-to-change protocol signature that they could possibly have
+chosen.  This pattern of decisions might mean that their engineers
+disapprove of blocking Tor.  More likely, it means that they are
+keeping their customers on an upgrade treadmill, to ensure that no
+single version of the software will block next year's version of Tor.
+Even likelier than that, their engineers are just being lazy, and
+adopting the simplest solution they can find so they can close the
+"Tor works" bug and get on to their next task.
+We have had remarkable luck adding a trivial superencipherment layer
+to the TLS stream.  Even though this layer could be trivially removed
+by any adversary who knows what to look for, and has an AES
+implementation, it's still working okay.
+In Tor, every client starts out (or could be made to start out)
+knowing the public key of the server it wants to connect to.  This
+might be good for something.
+We know we're in an arms race.  If we get rid of all the protocol
+signatures expressible in terms of looking for bytes on the
+connection, we'll still have to care about TCP packet boundaries and
+timing attacks.
+In the long run, to avoid certain performance problems, we are
+considering moving server<->server communications to a UDP-based
+protocol, and using TLS-over-TCP only for client<->server
+communications, and in cases where it's necessary for client .  Any
+such change is, however, a number of years out.