Make an HTTP requestor Firefox extension for meek-client

added component::circumvention/meek meek owner::dcf parent::10935 priority::medium resolution::fixed status::closed type::project labels

Here is some early prototype code that simply makes a request when you start the browser.

git clone -b extension https://git.torproject.org/pluggable-transports/meek.git

gitweb is here: https://gitweb.torproject.org/pluggable-transports/meek.git/shortlog/refs/heads/extension

It (d966f770d89ff850efa319d6f9d1f4327e00a89c) works in Iceweasel 24.3, but not in Tor Browser from the 3.5.2.1 bundle. More details at https://lists.torproject.org/pipermail/tor-dev/2014-March/006441.html.

Trac:
Status: new to assigned
Owner: asn to dcf

Trac:
Cc: N/A to gk

Trac:
Cc: gk to gk, mcs, brade

We'll need to make sure that user cookies for www.google.com are not sent over the non-Tor helper. Are there other leaks to consider besides cookies? I found these articles:

https://developer.mozilla.org/en/docs/Creating_Sandboxed_HTTP_Connections
https://bugzilla.mozilla.org/show_bug.cgi?id=313414

According to the Mozilla bug, we can set an "http-on-modify-request" listener or similar in order to ensure at runtime that no cookies are being sent.

In this post I reported that I had a prototype browser extension that worked in Iceweasel but not in Tor Browser. Mark discovered that the connection was throwing NS_ERROR_UNKNOWN_PROXY_HOST (0x804B002A). Mike traced the cause to this patch that is specific to Tor Browser:

https://gitweb.torproject.org/tor-browser.git/commit/?id=5069a3ee8fa51546a8ad582e6004be66bc9748aa

Specifically, here in nsDNSService::AsyncResolve is where the error is returned. If I comment out the error return, the extension works in Tor Browser just like in Iceweasel. That is, it does DNS and HTTPS requests for www.google.com outside of the proxy, just as intended.

The 5069a3ee Tor Browser patch has a reason for existing, though, so we shouldn't simply undo it. It's meant to guard against unexpected DNS leaks in Firefox and extensions. I've thought of two potential ways to deal with the situation:

Make a special API or key that allows DNS lookups by a "direct" type proxy, while still prohibiting them for all other callers. Maybe the key is mere use of the "direct" type; maybe it's a magic string in the host field, or something like that.
Run a second copy of Firefox solely for making meek HTTP requests. The second browser would have network.proxy.socks_remote_dns=false, which setting is enough to disable the Tor Browser patch that breaks name lookups.

My current rough design for the extension is this. Open an nsIServerSocket on a well-known port. Connections to the port will send a JSON blob describing an HTTP request to make, looking like

{
  "method": "POST",
  "url": "https://www.google.com/",
  "header": {
    "Host": "meek-reflect.appspot.com",
    "X-Session-ID": "12345"
  },
  "body": "base64data"
}

The extension parses the JSON and makes the request using nsIHttpChannel. The extension reads the response and sends it out as JSON on the original connection it received on the well-known port, like

{
  "status": 200,
  "header": { ... },
  "body": "base64data"
}
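On the meek-client side, the two blobs above could be modeled roughly as follows. This is only a sketch under assumptions: the type and function names (helperRequest, helperResponse, encodeRequest, decodeResponse) are hypothetical, and only the JSON keys come from the design. It relies on encoding/json encoding a []byte field as a base64 string, which matches the "base64data" bodies.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// helperRequest mirrors the request blob. The Go names are hypothetical;
// only the JSON keys are taken from the design sketch.
type helperRequest struct {
	Method string            `json:"method"`
	URL    string            `json:"url"`
	Header map[string]string `json:"header"`
	Body   []byte            `json:"body"` // encoding/json emits []byte as base64
}

// helperResponse mirrors the response blob.
type helperResponse struct {
	Status int               `json:"status"`
	Header map[string]string `json:"header"`
	Body   []byte            `json:"body"`
}

// encodeRequest serializes a request for sending to the extension.
func encodeRequest(req *helperRequest) ([]byte, error) {
	return json.Marshal(req)
}

// decodeResponse parses the blob that the extension sends back.
func decodeResponse(data []byte) (*helperResponse, error) {
	var resp helperResponse
	if err := json.Unmarshal(data, &resp); err != nil {
		return nil, err
	}
	return &resp, nil
}

func main() {
	req := &helperRequest{
		Method: "POST",
		URL:    "https://www.google.com/",
		Header: map[string]string{
			"Host":         "meek-reflect.appspot.com",
			"X-Session-ID": "12345",
		},
		Body: []byte("payload"),
	}
	data, err := encodeRequest(req)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s\n", data)

	resp, err := decodeResponse([]byte(`{"status": 200, "header": {}, "body": "cGF5bG9hZA=="}`))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%d %s\n", resp.Status, resp.Body)
}
```

Note that strict JSON does not allow a trailing comma after the last member, so the extension's JSON.parse would reject a blob that ends a member list with one.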

Rationale for design decisions:

Why a listening socket for communication between meek-client and the extension, rather than some other IPC? I don't know enough IPC to know of another way that will work.
Why a well-known port instead of an ephemeral port? I don't know how the extension can listen on an ephemeral port and then tell meek-client what port to connect to. Maybe it would be easier to do if we ran a second browser and the second browser was a child process of the meek-client process.
Why JSON instead of an HTTP proxy (meek-client is already able to talk to an HTTP proxy)? The little reason is that being an HTTP proxy would require us to parse HTTP in the extension. I don't know if there's a standard library for that and I don't want another implementation of HTTP. JSON.parse and JSON.stringify work fine for data serialization (json.Marshal and json.Unmarshal in golang). The big reason is that HTTPS through an HTTP proxy uses the CONNECT method which merely tunnels the client's own TLS handshake.
Why nsIHttpChannel and not nsISocketTransportService.createTransport? I tested nsISocketTransportService, and its TLS handshake doesn't have the next_protocol_negotiation=http/1.1 extension that nsIHttpChannel and normal browser requests have.
Why use a separate meek-client program to talk to the extension, rather than tor talking to the extension directly (with the extension implementing the PT protocol)? meek-client already exists and works fine apart from its TLS signature. I don't know extension programming so well, and would prefer to have less code in the extension than more. It seems more difficult to run a browser as a managed transport. I would like to treat the browser-like HTTP service as a mostly independent layer. For example we may want to try plugging in a different browser than Firefox. (Firefox is the most compelling option only because it's already included in the browser bundle.)

Trac:
Don-t-prohibit-name-lookups-with-socks_remote_dns-tr.patch

Patch to re-enable extra-proxy DNS lookups in Tor Browser.

Replying to dcf:

The 5069a3ee Tor Browser patch has a reason for existing, though, so we shouldn't simply undo it. It's meant to guard against unexpected DNS leaks in Firefox and extensions. I've thought of two potential ways to deal with the situation:

Make a special API or key that allows DNS lookups by a "direct" type proxy, while still prohibiting them for all other callers. Maybe the key is mere use of the "direct" type; maybe it's a magic string in the host field, or something like that.

Run a second copy of Firefox solely for making meek HTTP requests. The second browser would have network.proxy.socks_remote_dns=false, which setting is enough to disable the Tor Browser patch that breaks name lookups.

There is a third option on the horizon for bundles shipping a Tor Browser based on ESR 31: Mozilla fixed the WebSocket DNS leak (https://bugzilla.mozilla.org/show_bug.cgi?id=751465) which caused the defense-in-depth AND there will probably be a way to write tests that detect DNS leaks (https://bugzilla.mozilla.org/show_bug.cgi?id=971153). Thus, we could think about dropping the current patch that prevents your original idea from working while not throwing the defense-in-depth we currently have away for nothing.

A potential problem: Tor Browser's TLS ClientHello differs slightly from Firefox's. Tor Browser doesn't send the "SessionTicket TLS" extension. I think it's on account of #4099 (closed).

This is what Iceweasel 24.3.0 sends that Tor Browser 3.5.2.1 doesn't:

             Extension: ec_point_formats
                 Type: ec_point_formats (0x000b)
                 Length: 2
                 EC point formats Length: 1
                 Elliptic curves point formats (1)
                     EC point format: uncompressed (0)
+            Extension: SessionTicket TLS
+                Type: SessionTicket TLS (0x0023)
+                Length: 0
+                Data (0 bytes)
             Extension: next_protocol_negotiation
                 Type: next_protocol_negotiation (0x3374)
                 Length: 0

This only affects us if we run the extension in Tor Browser itself (but that was the plan).

I have some code for you to try out. The whole pipeline is working, more or less. (I'm typing this comment through browser-camouflaged meek.) At this point, you have to run the extension in a separate Firefox because of comment:6.

git clone -b extension https://git.torproject.org/pluggable-transports/meek.git
cd meek/meek-client
export GOPATH=~/go
go get
go build

In your separate Firefox's extensions directory, create a file called meek-http-helper@bamsoftware.com whose contents are the path of the directory containing the extension (plus a trailing slash). For me, it is:

/home/david/meek/firefox/

Start the separate Firefox. You might have to activate the extension in the Add-ons menu.
Create a torrc file with the following contents (you can edit the torrc that's in the meek-client directory):

UseBridges 1
Bridge meek 0.0.2.0:1
ClientTransportPlugin meek exec ./meek-client --url=https://meek-reflect.appspot.com/ --front=www.google.com --helper 127.0.0.1:7000 --log meek-client.log

`--helper` is the new special option here. Port 7000 is where the extension is listening.

tor -f torrc

The comment at the top of firefox/components/main.js explains what's going on.

// This is an extension that allows external programs to make HTTP requests
// using the browser's networking libraries.
//
// The extension opens a TCP socket listening on localhost (port 7000). When it
// receives a connection, it reads a 4-byte big-endian length field, then tries
// to read that many bytes of data. The data is UTF-8–encoded JSON, having the
// format
//  {
//      "method": "POST",
//      "url": "https://www.google.com/",
//      "header": {
//          "Host": "meek-reflect.appspot.com",
//          "X-Session-Id": "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"}
//      }
//  }
// The extension makes the request as commanded. It returns the response to the
// client as a JSON blob, preceded by a 4-byte length as before. If successful,
// the response looks like
//  {
//      "status": 200,
//      "body": "...base64..."
//  }
// If there is a network error, the "error" key will be defined. A 404 response
// or similar from the target web server is not considered such an error.
//  {
//      "error": "NS_ERROR_UNKNOWN_HOST"
//  }
// The extension closes the connection after each transaction, and the client
// must reconnect to do another request.
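The framing described in that comment (a 4-byte big-endian length followed by that many bytes of JSON) could look roughly like this from the client's side. It is a sketch, not the code in helper.go: the function names are hypothetical, and the 1000000-byte cap is an assumed failsafe value rather than something the comment specifies.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
	"log"
)

// maxFrameLength is an assumed failsafe so that a bogus length field cannot
// force a huge allocation. The value is illustrative.
const maxFrameLength = 1000000

var errFrameTooLong = errors.New("frame exceeds maxFrameLength")

// writeFrame writes a 4-byte big-endian length followed by the payload.
func writeFrame(w io.Writer, payload []byte) error {
	if len(payload) > maxFrameLength {
		return errFrameTooLong
	}
	if err := binary.Write(w, binary.BigEndian, uint32(len(payload))); err != nil {
		return err
	}
	_, err := w.Write(payload)
	return err
}

// readFrame reads the length field, then exactly that many payload bytes.
func readFrame(r io.Reader) ([]byte, error) {
	var length uint32
	if err := binary.Read(r, binary.BigEndian, &length); err != nil {
		return nil, err
	}
	if length > maxFrameLength {
		return nil, errFrameTooLong
	}
	payload := make([]byte, length)
	if _, err := io.ReadFull(r, payload); err != nil {
		return nil, err
	}
	return payload, nil
}

// encodeFrame and decodeFrame are in-memory conveniences around the above.
func encodeFrame(payload []byte) ([]byte, error) {
	var buf bytes.Buffer
	if err := writeFrame(&buf, payload); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func decodeFrame(framed []byte) ([]byte, error) {
	return readFrame(bytes.NewReader(framed))
}

func main() {
	framed, err := encodeFrame([]byte(`{"method":"POST","url":"https://www.google.com/"}`))
	if err != nil {
		log.Fatal(err)
	}
	payload, err := decodeFrame(framed)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%d framed bytes, payload %s\n", len(framed), payload)
}
```

Because the extension closes the connection after each transaction, a real client would dial, call writeFrame, call readFrame, and close for every request.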

My plan next is to try to make a bundle that uses Don-t-prohibit-name-lookups-with-socks_remote_dns-tr.patch, and see if all these steps can be automated.

Comments on the code and design continue to be welcome. One thing to do is to make the listening port configurable as a pref instead of being hardcoded.

Replying to dcf:

A potential problem: Tor Browser's TLS ClientHello differs slightly from Firefox's. Tor Browser doesn't send the "SessionTicket TLS" extension. I think it's on account of #4099 (closed).

Setting the pref security.enable_tls_session_tickets=true is enough to make the "SessionTicket TLS" extension appear. I'm not sure what other side effects it may have. (See comment:6:ticket:4099 ff.)

Still holding out hope that we can avoid shipping a second browser binary, my thinking now is that we should run the meek HTTP helper in a separate instance of Tor Browser, under a separate profile. The separate profile will have at least the following configuration changes:

network.proxy.socks_remote_dns=false
security.enable_tls_session_tickets=true

The second instance should be headless so there's no chance of a user interacting with it directly. (Perhaps the HTTP helper itself could enforce headlessness.)

A separate process and profile means that we can be freer in changing settings that might compromise the security of Tor Browser. The separate profile will have its own history and cookies. The second profile won't be used for browsing: it is strictly an HTTPS driver under the control of a pluggable transport.

Here are bundles that use an extension within Tor Browser to make HTTP requests. The extension listens on localhost port 7000. meek-client connects to the extension on that port and commands it what HTTP requests to make.

These bundles have a TLS signature that is almost like Firefox's, only missing the "SessionTicket TLS" extension as noted in comment:9. As explained in comment:11, my next planned iteration is to do bundles where the extension runs in a separate browser instance with a separate profile. That will take care of the "SessionTicket TLS" and socks_remote_dns issues, and will make it possible not to run the extension when you're not using meek.

I had some trouble with these bundles on OS X initially. Bootstrapping would die after about 5 minutes at 50% (so it's not #9229 (moved)) with an "EOF" error in the meek-client log. But after investigating it for a while, I couldn't reproduce it.

#11393 (moved) is like this ticket, but for Chrome.

Trac:
Component: Pluggable transport to meek

Here are bundles that use an extension in a separate instance of Firefox. The second instance sets network.proxy.socks_remote_dns=false so that no patch for DNS lookups is needed in Tor Browser, and sets security.enable_tls_session_tickets=true in order to send the session ticket TLS extension. This version has the extension listen on an ephemeral port, which is written to the browser's stdout and read by the transport plugin.

The TLS signature of this bundle matches Firefox's, in what I have tested so far. A diff between client hellos is just

             Length: 165
             Version: TLS 1.0 (0x0301)
             Random
-                gmt_unix_time: Jul 12, 2089 08:23:06.000000000 PDT
-                random_bytes: f0b149a04ac4a554c5bda57030b17342cc1c0ab59c925cc8...
+                gmt_unix_time: Oct 23, 2081 13:09:42.000000000 PDT
+                random_bytes: 1608e4e50bbc5fb188ab87211ce29f35622d117a4829ebb2...
             Session ID Length: 0
             Cipher Suites Length: 70
             Cipher Suites (35 suites)

When you start the browser, it's immediately going to open a dialog box. The dialog is actually the sub-instance of Firefox running the meek-http-helper extension. Don't close the dialog or it will shut down the extension. The modal dialog prevents a browser window from being shown, and the extension kills the whole program when the dialog is closed. We need to find a way to accomplish the same thing without showing a visible dialog. For now it's kind of nice in that it makes it easy to see if the sub-instance of Firefox is being killed properly, etc.

There's a known bug, which is that subprocesses don't get cleaned up on Windows. In particular, meek-client and the second Firefox keep running when you close the main browser. I think it's because of #9330 (moved)--the program that starts meek-client and Firefox gets killed by ProcessTerminate without being able to notify its children. I have an idea for dealing with that that I'll try in the next round of bundles.

3.5.2.1-meek-5 fixes the problem of child processes not being cleaned up on Windows (see comment:5:ticket:10047).

I merged the extension branch to master. Now we need to find out how to run the second browser instance without a dialog (and [ticket:11429 without opening a second dock icon] if possible).

Replying to dcf:

Now we need to find out how to run the second browser instance without a dialog (and [ticket:11429 without opening a second dock icon] if possible).

Ximin points out the -chrome command-line option of Firefox that lets you define what UI is shown at first.

Replying to dcf:

Now we need to find out how to run the second browser instance without a dialog.

This idea of pumping the event loop seems to work.

The MDN page says: Warning: Spinning the event loop breaks run to completion semantics of JavaScript (it creates a nested event loop). This can cause code to reenter itself, which can result in broken functionality. This approach should be avoided whenever possible. I don't think it applies to us. We're already breaking extension best practices by hijacking control and never returning, because that's the point.

Trac:
Resolution: N/A to fixed
Status: assigned to closed

https://gitweb.torproject.org/user/dcf/tor-browser-bundle.git/blob/refs/heads/meek:/Bundle-Data/PTConfigs/meek-http-helper-user.js looks good. I am just wondering why you use the awkward dump() and do not hard-code a port on localhost to use...

Review of https://gitweb.torproject.org/pluggable-transports/meek.git/blob/HEAD:/firefox/component/main.js part 1

 4 // The extension opens a TCP socket listening on localhost (port 7000).

See previous comment: you seem to advocate hard-coding a listener port, no?

 50 Components.interfaces.nsIServerSocketListener,

Nit: there is no "," needed here as this is the last interface listed.

 56 return

Nit: a ";" is missing at the end.

 124 this.requestreader = null;

Nit: should be "this.requestReader = null;" see line 131 as well.

 130 readRequest: function(callback) {

Seems you can omit the |callback|, no? You are not using it in the readRequest method. Or did you plan to pass it to the RequestReader constructor?

I have to think a bit about your usage of nsISocketTransport and whether you are affected by the things we ran into in #9531 (moved). See the comments there (e.g. regarding OPEN_BLOCKING|OPEN_UNBUFFERED)

Replying to gk:

https://gitweb.torproject.org/user/dcf/tor-browser-bundle.git/blob/refs/heads/meek:/Bundle-Data/PTConfigs/meek-http-helper-user.js looks good. I am just wondering why you use the awkward dump() and do not hard-code a port on localhost to use...

We talked about this on IRC today. The ephemeral port is to avoid local port conflicts. The dump call alerts the controller program, meek-client-torbrowser, what port to connect on, and also signals when the port is open and listening and safe to connect to.

The dump call that writes the port number to stdout seems like an unusual use of the function, at least, so it's a candidate for replacement if we think of something better.

The way the extension reports its listening port is similar to how the pluggable transports protocol works. tor tells its transports to listen on 127.0.0.1:0, and the transports report what their actual listening port is to stdout.
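The same listen-on-port-0-then-report pattern, written in Go for illustration, might look like the sketch below. The function name listenEphemeral and the format of the announcement line are assumptions; they are not the pluggable transports protocol's actual output or the extension's actual dump string.

```go
package main

import (
	"fmt"
	"log"
	"net"
)

// listenEphemeral asks the kernel for any free loopback port by binding to
// port 0, then returns the listener along with the port actually assigned.
func listenEphemeral() (net.Listener, int, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, 0, err
	}
	port := ln.Addr().(*net.TCPAddr).Port
	return ln, port, nil
}

func main() {
	ln, port, err := listenEphemeral()
	if err != nil {
		log.Fatal(err)
	}
	defer ln.Close()

	// The announcement is printed only after Listen has succeeded, so a
	// parent reading stdout learns both the port number and that it is
	// already safe to connect. The line format here is hypothetical.
	fmt.Printf("listen 127.0.0.1:%d\n", port)
}
```

A controller that started this process would read its stdout line by line until the announcement appears, then connect to the reported port.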

Replying to gk:

Review of https://gitweb.torproject.org/pluggable-transports/meek.git/blob/HEAD:/firefox/component/main.js part 1

 4 // The extension opens a TCP socket listening on localhost (port 7000).

See previous comment: you seem to advocate hard-coding a listener port, no?

 50 Components.interfaces.nsIServerSocketListener,

Nit: there is no "," needed here as this is the last interface listed.

 56 return

Nit: a ";" is missing at the end.

 124 this.requestreader = null;

Nit: should be "this.requestReader = null;" see line 131 as well.

 130 readRequest: function(callback) {

Seems you can omit the |callback|, no? You are not using it in the readRequest method. Or did you plan to pass it to the RequestReader constructor?

I think these are all answered as of https://gitweb.torproject.org/pluggable-transports/meek.git/shortlog/eab44b4fbaf3aabc7077a06d92ee02ce61b57932. All but the trailing comma, as that's deliberate.

I have to think a bit about your usage of nsISocketTransport and whether you are affected by the things we ran into in #9531 (moved). See the comments there (e.g. regarding OPEN_BLOCKING|OPEN_UNBUFFERED)

Okay. I don't know what to make of that.

Replying to dcf:

Replying to gk:

https://gitweb.torproject.org/user/dcf/tor-browser-bundle.git/blob/refs/heads/meek:/Bundle-Data/PTConfigs/meek-http-helper-user.js looks good. I am just wondering why you use the awkward dump() and do not hard-code a port on localhost to use...

We talked about this on IRC today. The ephemeral port is to avoid local port conflicts. The dump call alerts the controller program, meek-client-torbrowser, what port to connect on, and also signals when the port is open and listening and safe to connect to.

The dump call that writes the port number to stdout seems like an unusual use of the function, at least, so it's a candidate for replacement if we think of something better.

I think a more natural way would be using nsIEnvironment for passing stuff around. At least it is much more common. Does something speak against this approach?

Replying to gk:

Would that then require the pluggable transport to be a child process of the sub-Firefox, instead of a sibling as it is now?

Replying to dcf:

Uhm... I hadn't thought about that. Probably, yes. But that seems to be a lot of work (if it works at all) for little gain.

Replying to dcf:

Replying to gk:

I have to think a bit about your usage of nsISocketTransport and whether you are affected by the things we ran into in #9531 (moved). See the comments there (e.g. regarding OPEN_BLOCKING|OPEN_UNBUFFERED)

Okay. I don't know what to make of that.

Sorry for the delay. Here comes the second part of the code review for https://gitweb.torproject.org/pluggable-transports/meek.git/blob/HEAD:/firefox/components/main.js. Basically, it looks good to me. Nice work! One nit is the trailing comma after

status: context.responseStatus

but that may be deliberate again (what is the rationale for this, btw?).

Then I was wondering about the length limit especially that of the response. What is the reasoning behind it? Doesn't that prohibit certain use cases like retrieving PDF files larger than 1000000 Bytes?

And, finally, there is no option to get OPEN_BLOCKING and OPEN_UNBUFFERED (see https://mxr.mozilla.org/mozilla-esr24/source/netwerk/base/src/nsSocketTransport2.cpp#1755) and the Necko folks won't implement that.

Replying to gk:

Sorry for the delay. Here comes the second part of the code review for https://gitweb.torproject.org/pluggable-transports/meek.git/blob/HEAD:/firefox/components/main.js. Basically, it looks good to me. Nice work! One nit is the trailing comma after

status: context.responseStatus

but that may be deliberate again (what is the rationale for this, btw?).

Thank you for the review.

The trailing comma is so that there's only a single line in the diff when you add or remove something from the literal. The trailing comma is part of the ArrayLiteral and ObjectLiteral syntax. It's a part of most other languages too: the K&R C book says "A list may end with a comma, a nicety for neat formatting." As I recall, there's a problem with trailing commas in Internet Explorer, but this file is for Firefox only so I don't worry about it.

Then I was wondering about the length limit especially that of the response. What is the reasoning behind it? Doesn't that prohibit certain use cases like retrieving PDF files larger than 1000000 Bytes?

That's a good question. The limit is per-request, and it's only a failsafe in case limits are not enforced in other parts of the code. The meek-client and meek-server programs actually use a smaller limit of 65536 bytes per request, but I tried to design the programs so they don't trust one another. If a malicious meek-client or meek-server sends an endless stream of data, the browser extension should fail fast with an error rather than buffer forever. Similarly meek-client tries to protect itself from a malicious browser extension (see maxHelperResponseLength in helper.go).

A Tor stream is multiplexed across multiple requests and responses. The meek-client and meek-server programs reconstruct the HTTP into continuous TCP sessions at both ends. So it doesn't matter too much how big each individual payload is (though there is a tradeoff between bandwidth and latency). I've downloaded files of hundreds of megabytes.
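To make the per-request limit concrete: a long stream is simply cut into payloads no larger than the limit, one per HTTP request, and reassembled in order at the other end. The sketch below only illustrates that splitting step. The function name splitPayloads is hypothetical and this is not how meek-client is actually structured; the 65536 figure is the per-request limit mentioned above.

```go
package main

import "fmt"

// maxPayloadLength is the per-request limit mentioned for meek-client and
// meek-server.
const maxPayloadLength = 65536

// splitPayloads cuts data into consecutive chunks of at most max bytes.
// Concatenating the chunks in order reproduces the original stream.
func splitPayloads(data []byte, max int) [][]byte {
	if max <= 0 {
		return nil
	}
	var chunks [][]byte
	for len(data) > 0 {
		n := len(data)
		if n > max {
			n = max
		}
		chunks = append(chunks, data[:n])
		data = data[n:]
	}
	return chunks
}

func main() {
	stream := make([]byte, 150000)
	for i, chunk := range splitPayloads(stream, maxPayloadLength) {
		fmt.Printf("request %d carries %d bytes\n", i, len(chunk))
	}
}
```

Smaller payloads mean more round trips per megabyte, while larger payloads mean more buffering before anything is delivered, which is the bandwidth and latency tradeoff noted above.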

And, finally, there is no option to get OPEN_BLOCKING and OPEN_UNBUFFERED (see https://mxr.mozilla.org/mozilla-esr24/source/netwerk/base/src/nsSocketTransport2.cpp#1755) and the Necko folks won't implement that.

So if I understand this correctly, OPEN_BLOCKING|OPEN_UNBUFFERED is the same as just OPEN_BLOCKING? Should I change it to just OPEN_BLOCKING?

Replying to dcf:

Replying to gk:

And, finally, there is no option to get OPEN_BLOCKING and OPEN_UNBUFFERED (see https://mxr.mozilla.org/mozilla-esr24/source/netwerk/base/src/nsSocketTransport2.cpp#1755) and the Necko folks won't implement that.

So if I understand this correctly, OPEN_BLOCKING|OPEN_UNBUFFERED is the same as just OPEN_BLOCKING?

Yes, I think so.

Should I change it to just OPEN_BLOCKING?

Well, it would be less confusing/misleading at least.

closed

mentioned in issue #11393 (moved)

mentioned in issue #13160 (moved)

moved to tpo/anti-censorship/pluggable-transports/meek#11183 (closed)

mentioned in issue tpo/anti-censorship/pluggable-transports/meek#11393 (closed)
