(Quoting text written by David Fifield for this ticket description.)
Find out what current DPI capabilities are with respect to WebSocket, at least through product literature.
Find out what existing, popular, WebSocket applications are used (chat, video, games?) that will be collateral damage to block. Write a short report on 1) how common they are, and 2) what their traffic looks like.
Implement a transport with an obfs2 stream transported over WebSocket.
We can imagine a new "obfs2-in-websocket" transport, but it might be a better design to allow chaining of proxies that don't necessarily have to know about one another. So you might have something like this on the client:
{{{
ServerTransportPlugin websocket proxy 127.0.0.1:9901
ServerTransportPlugin obfs2 exec /usr/local/bin/obfsproxy --managed
# And then some new configuration to say that things received on
# port 9901 need to be forwarded to the local obfsproxy port.
# Port 9901 won't be able to be used for plain websocket
# connections, and I guess this will have to be reflected in the
# descriptor somewhere.
}}}
See what other obfuscation possibilities exist. I don't think that TLS-wrapped WebSockets work for us (http://archives.seul.org/or/talk/Oct-2012/msg00190.html), but I haven't thought about it exhaustively. Replacing WebSocket with HTTP requests (the flash proxy POSTs bodies to both the client and the relay, and receives response bodies) would likely work, and would allow fuller control of the payloads (whereas with WebSocket we cannot escape the WebSocket framing). We gave up on using Flash, but Flash sockets allow us to control exactly what goes on the wire, except for an initial cross-domain request.
After discussions in the dev meeting, consensus is:
a) First, investigate the option of simply using the code of obfs2 in flashproxy. This involves librarifying obfs2 in some way, and/or implementing flashproxy within the pyobfsproxy framework. This should be the easiest way of doing this ticket.
b) Then, investigate SOCKS proxy chaining as initially suggested in this ticket. This involves implementing legacy/trac#8402 (moved), speccing out a way to do the chaining (Tor should pass the chain to the PT proxy somehow), speccing out a way to collect statistics on chained transports, and implementing the whole thing. This is more painful. There are also other problems, like distinguishing possible transport combinations from impossible ones (e.g. obfs2|websocket, which is doable and useful, from websocket|obfs2, which doesn't make much sense).
c) Finally, we should start researching a multilayer pluggable transport theory, in the same way that regular network protocols are implemented. This means that content-shaping should be on the bottom (obfs2, obfs3, packet-length shapers, etc.), appearance-shaping should be above (like an HTTP transport, a base64 transport, etc.) and a transport layer should be on top (like flashproxy, skypemorph, webrtc, regular socket, etc.). This is more research-y, but if done correctly it might allow greater reuse of transports, and will cause an explosion on the number of pluggable transports we have.
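To make the layering idea concrete, here is a toy Python sketch (not any real transport code) in which each layer is an encode/decode pair that composes with the one above it. The XOR "obfuscation" and the base64 "appearance" layer are stand-ins chosen purely for illustration:

```python
import base64

# Content-shaping layer (bottom): stand-in "obfuscator" using XOR.
def obfs_encode(data, key=0x5A):
    return bytes(b ^ key for b in data)

def obfs_decode(data, key=0x5A):
    return bytes(b ^ key for b in data)  # XOR is its own inverse

# Appearance-shaping layer (middle): make the bytes look like base64 text.
def b64_encode(data):
    return base64.b64encode(data)

def b64_decode(data):
    return base64.b64decode(data)

# The transport layer (top: flashproxy, a plain socket, ...) would carry
# whatever send() produces, and hand recv() whatever arrives.
def send(payload):
    return b64_encode(obfs_encode(payload))

def recv(wire):
    return obfs_decode(b64_decode(wire))
```

A real framework would also have to negotiate which layers each side supports, but the composition shape is the same.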
> After discussions in the dev meeting, consensus is:
> a) First, investigate the option of simply using the code of obfs2 in flashproxy. This involves librarifying obfs2 in some way, and/or implementing flashproxy within the pyobfsproxy framework. This should be the easiest way of doing this ticket.
I started looking into integrating flashproxy within obfsproxy. Unfortunately, the obfsproxy architecture is quite conservative (pure client/server model: the client opens a SOCKS server, the client listens upstream, the server listens downstream, and the focus is on traffic obfuscation), whereas flashproxy is much more avant-garde (both client and server listen on the downstream side, and the focus is on IP address obfuscation rather than traffic obfuscation, etc.).
This means that integrating flashproxy in obfsproxy will require a big refactoring, which will not necessarily be helpful in the future when integrating other PTs. For example, looking at legacy/trac#7153 (moved) it seems that the suggested changes are hand-tailored for specific PTs, without having big reusable value.
When I communicated my thoughts to David, we brainstormed on something between ideas b) and c) of comment:2. That is, maybe we shouldn't try to integrate both PTs into one, but we should find a nice way to use them both at the same time. For example, from the PoV of obfsproxy, flashproxy should just be a magic socket that transports the bytes to the correct location. And from the PoV of flashproxy, obfsproxy should just be a magic obfuscator that encodes/decodes bytes. That is, we are moving to a more layered approach, which I think is a good way to think of and develop network protocols.
Some things to think about:
a) How should we pass bytes between obfsproxy and flashproxy? Should it be SOCKS chaining, or just pipe bytes from one program to another?
b) Who should manage these two programs? In the original post of this ticket, it was suggested that Tor would manage those two programs. Another approach would be to have obfsproxy spawn and manage flashproxy. Without thinking much about it, I think the latter approach is also worth considering, since managing subprocesses in Python is probably easier than in C, and furthermore Tor would never need to learn about this (obfsproxy would just expose a new transport name (e.g. obfsflashproxy) that would do the obfsproxy+flashproxy combination internally.)
> When I communicated my thoughts to David, we brainstormed on something between ideas b) and c) of comment:2. That is, maybe we shouldn't try to integrate both PTs into one, but we should find a nice way to use them both at the same time. For example, from the PoV of obfsproxy, flashproxy should just be a magic socket that transports the bytes to the correct location. And from the PoV of flashproxy, obfsproxy should just be a magic obfuscator that encodes/decodes bytes. That is, we are moving to a more layered approach, which I think is a good way to think of and develop network protocols.
> Some things to think about:
> a) How should we pass bytes between obfsproxy and flashproxy? Should it be SOCKS chaining, or just pipe bytes from one program to another?
SOCKS chaining is probably where we need to go in the future. Perhaps easier is to strip out the SOCKS handler from flashproxy-client and configure obfsproxy to connect directly to flashproxy-client's LOCAL port. (See the flashproxy-client man page; LOCAL normally receives SOCKS from Tor; REMOTE receives WebSocket from the Internet.)
Then configure Tor to connect to the client transport at 127.0.0.1:22222 and make a SOCKS request for 127.0.0.1:11111.
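In torrc terms, that client-side wiring might look something like the following fragment. This is a hedged sketch, not tested configuration: the port numbers are the example values above, and the transport name is illustrative.

{{{
# Tor speaks SOCKS4 to obfsproxy's listener on 22222, naming
# flashproxy-client's LOCAL port 11111 as the "bridge" to reach.
ClientTransportPlugin obfs2 socks4 127.0.0.1:22222
Bridge obfs2 127.0.0.1:11111
}}}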
> b) Who should manage these two programs? In the original post of this ticket, it was suggested that Tor would manage those two programs. Another approach would be to have obfsproxy spawn and manage flashproxy. Without thinking much about it, I think the latter approach is also worth considering, since managing subprocesses in Python is probably easier than in C, and furthermore Tor would never need to learn about this (obfsproxy would just expose a new transport name (e.g. obfsflashproxy) that would do the obfsproxy+flashproxy combination internally.)
I think we want a third program (say obfs-flash-proxy) that controls both flashproxy and obfsproxy parts. Tor will call obfs-flash-proxy as if that program were implementing the entire transport itself. obfs-flash-proxy will fork the other two programs and connect them however necessary.
As for how to provide command-line options to the flashproxy and obfsproxy parts, well, hmm, I dunno yet.
Also, for the server transport, we have no SOCKS or anything, and so have to do something different. Maybe ExtORPort chaining? Or give up on the extended ORPort and just do ORPort chaining of server transports, which will be really easy.
This is obviously not fully general, but in the interest of meeting deadlines perhaps it shouldn't be. It is simple enough that it will clearly work, and it gives us something to compare a more general framework against in the future. "Do the simplest thing that could possibly work" etc.
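As a toy illustration of the wrapper idea (not the real obfs-flash-proxy), the sketch below forks two sub-processes and pipes one's output into the other's input. The sub-commands are stand-ins invoked via the Python interpreter so the sketch is self-contained; real sub-proxies would be wired together over local TCP sockets instead.

```python
import subprocess
import sys

def run_chain(data):
    # First "proxy": uppercase the stream (stand-in for obfuscation).
    p1 = subprocess.Popen(
        [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
    # Second "proxy": reverse it (stand-in for transport framing),
    # reading directly from the first proxy's stdout.
    p2 = subprocess.Popen(
        [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read()[::-1])"],
        stdin=p1.stdout, stdout=subprocess.PIPE, text=True)
    p1.stdin.write(data)
    p1.stdin.close()
    p1.stdout.close()  # let p1's output flow only to p2
    out, _ = p2.communicate()
    return out
```

The wrapper is the only thing Tor has to know about; the chain's internal plumbing stays hidden behind it.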
We've been having some problems with the server side, since |obfsproxy| is no longer aware of the client's IP address, and can't report it using the Extended ORPort.
This means that the super-proxy is probably going to do the Extended ORPort handshake, since it's the entity that knows the IP address of the client (it also knows the name of the combined pluggable transport used).
"How does |super_2| match up the connections it receives from |obfsproxy| with the connections received in |super_1|?"
We have a few options. Unfortunately, all of them seem ugly.
a) Make both |flashproxy| and |obfsproxy| Extended ORPort listeners, and chain up Extended ORPort connections.
b) Implement yet-another-metadata protocol for this use case. So for example, |super_1| prepends the client IP to the traffic flow (with a delimiter) and both |flashproxy| and |obfsproxy| pass it on to |super_2|, who does the Extended ORPort connection with the info.
c) Have |super_2| bind to different local ports to denote different clients. So for example, when |super_1| receives connection X, it asks the transport proxies to forward this connection to a specific port number that |super_2| binds to. This way |super_2| knows that data received on that port is from connection X. Different connections use different ports.
d) Put client IPs in a queue inside |super| and when |super_2| connects to the Extended ORPort, pop the queue and report that IP. Of course, this doesn't guarantee that client IPs will match with the correct data :) Tor doesn't care about this currently, but it might in the future.
The above solutions are hacky and/or cheap and/or evil.
What other solutions exist?
> We have a few options. Unfortunately, all of them seem ugly.
Just to keep things exciting: remember that we may eventually want to implement rate limiting in pluggable transports -- that is, to share Tor's overall bandwidth buckets between Tor and the network-facing transports. So this isn't just about statistics reporting.
I am new here, so I am still trying to form a mental model to reason about this system. Here's what I have so far; let me know if it's correct and/or useful.
For preciseness, I am going to state, explicitly, the IN-interface and OUT-interface of each component. (This will also help with any general plugin-composition framework later on.) For example, this is what a typical normal Tor channel, without transport plugins, looks like:
where A=X means a channel A using protocol X, and X(U) means something that wraps U. A could be e.g. an inet socket, or a file descriptor, or some combination of these. To chain stuff correctly, we must ensure the OUT-interface of one component matches the IN-interface of the next component. (Technically we should write Z=SOCKS(U), D=U for the first and last components, but since we're not using Z, D anywhere I'll just leave them out.)
With a general transport plugin, the diagram from the client to the server is:
For obfs-flash-proxy, then, an incomplete "first attempt" using naive layering:
{{{
:SOCKS(U)|tor-c|A=SOCKS(U): ->
:A=SOCKS(U)|obfs-c|B=obfs(U): ->
:B=OU|super|C=SOCKS(OU): -> # where OU == obfs(U)
:C=SOCKS(OU)|flash-c|D=flash(OU): ->
:D=flash(OU)|flash-s|P=SOCKS(OU),Q=control(D): ->
??
:R=obfs(U)|obfs-s|S=SOCKS(U),T=control(R): ->
:S=SOCKS(U),T=control(R)|tor-s|Tor(U):
}}}
So we have two problems here:
1. How to implement ?? - we need a component that reads input from an OUT-interface of {SOCKS(X),control(D)}, and writes output to an IN-interface of {X}.
2. As written, using this naive layering architecture, we have no way of connecting the control(D) channel to anything. What we really want to do is turn the T=control(R) into a T=control(D), since the former is an implementation detail and the latter is the user-originated data that we need to actually control.
I'll need some more time working out the precise details, but a quick read gives me the impression that the super_1/super_2 solution proposed above solves these two problems by re-architecting the OU that passes through the |flash-s| component into something like OU+D.
Does this make sense, am I correct, and/or did I go overboard with the notation? :p
So using my notation, the super_1/super_2 architecture becomes:
{{{
:SOCKS(U)|tor-c|A=SOCKS(U): ->
:A=SOCKS(U)|obfs-c|B=obfs(U): ->
:B=OU|super|C=SOCKS(OU): -> # where OU == obfs(U)
:C=SOCKS(OU)|flash-c|D=flash(OU): ->
internet
:D=flash(OU)|super_1|E=flash(D+OU): ->
:E=flash(D+OU)|flash-s|P=SOCKS(D+OU),Q=control(E): ->
:P=SOCKS(D+OU),Q=control(E)|super_2|R=D+OU: ->
:R=D+obfs(U)|modified obfs-s|S=SOCKS(U),T=control(D): -> # typo corrected from previous post
:S=SOCKS(U),T=control(D)|tor-s|Tor(U):
}}}
The nice thing about this is that we have a proof (I think :p) that |obfs-s| does need to be modified for the obfs-flash-proxy combo (at least if we go down this layering route of trying to re-use |obfs-s| and |flash-s|), since there is no other way of passing D through to the underlying |tor-s|.
I should be less hasty claiming "proofs"; there are like a million ways of doing this. My diagram above was actually different from all the solutions in asn's previous post; here are the corresponding diagrams for each. Importantly, we always need a shim between flashproxy and obfsproxy, simply because their input/output interfaces are incompatible (one is SOCKS, the other isn't) - I've called it super_m.
"a) Make both |flashproxy| and |obfsproxy| Extended ORPort listeners,"
(Actually, |flash-s| doesn't need to be a listener, in this case.)
"b) Implement yet-another-metadata protocol for this use case."
{{{
:D=flash(OU)|super_1|E=flash(ODU): -> # where ODU == obfs(D+U), D+U being the "custom protocol". Note, this is slightly different from the one I drew above, but is similar in intent.
:E=flash(ODU)|flash-s|P=SOCKS(ODU),Q=control(E): ->
:P=SOCKS(ODU),Q|super_m|R=ODU: ->
:R=obfs(D+U)|obfs-s|S=SOCKS(D+U),T=control(R): ->
:S=SOCKS(D+U),T|super_2|S'=SOCKS(U),T'=control(D): ->
:S'=SOCKS(U),T'=control(D)|tor-s|Tor(U):
}}}
"c) Have |super_2| bind to different local ports to denote different clients."
This is just a variation of (b) where super_1 can somehow (outside of the diagram) dynamically change the port that |obfs-s| connects to.
"d) Put client IPs in a queue inside |super| and when |super_2| connects to the Extended ORPort pop the queue and report that IP."
Sounds like incorrect behaviour to me...?
Solution (a) seems the cleanest and most re-usable, because of the way the general transport plugin architecture works. If we are to have arbitrary composable plugins, the easiest way is to require them to be an Extended ORPort listener, and pass through data appropriately. This could be added to a library to make things easier for developers.
I agree that (a) seems the best of those listed. But it means that every proxy must accept an extended ORPort form of itself, which is a bit annoying. It's not so different from (b), which also requires that every proxy support a new protocol: the only difference here is that the protocol is one we already designed.
(c) is probably easiest, but needs a bunch of ports if the clients are slow to connect. Also, wouldn't you need to make the client use separate target ports for each incoming connection, to tell them apart? I don't believe there's a way to do that with the current protocol without launching a bunch of clients.
(b) seems like it might be a good idea, but it will be the hardest. It might be something to think about in the future, but if you're going to do it, do it in some way that avoids using sockets needlessly -- maybe by allowing the plugins to do this kind of communication over a preconfigured pipe, or an in-memory API, or something. But that's harder, and could wait IMO.
I'm also between (a) and (b). (c) seems a bit hacky and might be painful to reproduce bugs, debug, etc.
For both (a) and (b) we will probably have to add another mode (let's call it super_server for now) to our transports, so that their server-side listener acts like an Extended ORPort (or another metadata protocol). We will also have to change the pt-spec so that we can suggest that mode to our transports using the managed-proxy configuration protocol.
I suspect that a light variant of (b) might be easier to implement than (a):
|internet| -> |super_1| -> |flashproxy| -> |obfsproxy| -> |super_2| -> |Extended ORPort|
For example, for every connection that super_1 receives, it prepends <ip>:<super-transport>:DELIMITERDELIMITERDELIMITERDELIMITER to the traffic flow and passes it on to flashproxy. Then flashproxy and obfsproxy, when in super_server mode, are instructed to ignore the first chunk of data of the connection (up until DELIMITERDELIMITERDELIMITERDELIMITER). When super_2 receives that first chunk, it parses it, gets the relevant information and removes it from the traffic flow. It then does an Extended ORPort client connection to Tor.
The above idea is not entirely elegant but it might be easier to implement than (a).
(a) is more future-proof but might be messy when the Extended ORPort authentication part comes into the game. We will have to find a place to store the Extended ORPort auth cookies, and also find a name format that will allow the other transport proxies to find the correct cookie for each Extended ORPort.
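The light variant of (b) above can be sketched in a few lines of Python. The delimiter value and the "<ip>:<super-transport>:" header layout are just the ones suggested in this comment, IPv4-only for simplicity; nothing here is a settled spec.

```python
DELIM = b"DELIMITER" * 4

def add_metadata(client_ip, transport, payload):
    # super_1: prepend "<ip>:<super-transport>:" plus the delimiter to the
    # first chunk; flashproxy/obfsproxy in super_server mode pass it through.
    return ("%s:%s:" % (client_ip, transport)).encode("ascii") + DELIM + payload

def strip_metadata(first_chunk):
    # super_2: parse and remove the metadata before doing the Extended
    # ORPort client connection to Tor.
    head, sep, body = first_chunk.partition(DELIM)
    if not sep:
        raise ValueError("no metadata delimiter in first chunk")
    client_ip, transport, _ = head.decode("ascii").split(":", 2)
    return client_ip, transport, body
```

A real version would also have to cope with the delimiter straddling a read boundary, and with hostile payloads that contain the delimiter bytes.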
Summary from IRC today: we decided to try to implement (d) first because it's the simplest, even though it's only correct when certain assumptions hold, about how Tor actually uses the metadata/headers from the ORPort protocol.
Additionally, I have another variant which I was trying to communicate on IRC earlier, but I'm not sure if I did a good job of it. I don't think it's much more complex than (d), and is fully correct without relying on the above assumptions. It mixes ideas from (a) and (c), using different ports to distinguish between clients, in order to avoid both using a new custom protocol (as in (b)), or requiring obfsproxy to be modified to accept ORPort (as in (a)).
To recap, for (c) we need to somehow "tell" obfsproxy to send its output to different ports. One way to achieve this is to start a new instance of obfsproxy for each client connection, telling it to listen on a different port. To avoid having to start a new instance of flashproxy too, we move the |super_1| component after it. So the diagram looks like this:
1. super_1 reads ORPort from flashproxy, stores all headers (COMMANDs, using the terminology of spec 196) and associates this info with some random port P.
2. super_1 tells super_2 to listen on port P (can do this in-process).
3. super_1 opens a new instance of obfsproxy, with its ORPort set to P.
4. super_1 sends BODY to obfsproxy.
5. obfsproxy processes this and sends ORPort output to super_2, on port P.
6. super_2 reads ORPort from obfsproxy on port P, looks up the headers associated with P and sends these to Tor.
7. super_2 skips the headers from the incoming channel (since they are from obfsproxy and not flashproxy), then forwards BODYLEN/BODY directly on to Tor.
In summary, super_1 receives extended ORPort protocol, and outputs raw data; super_2 receives and outputs extended ORPort protocol.
Advantages:
super_1/super_2 do not need to know what flashproxy/obfsproxy do - (except, super_1 needs to know how to start up obfsproxy, but any super-proxy must know this info)
flashproxy/obfsproxy do not need to be changed
Disadvantages:
a new obfsproxy process is created for each client connection.
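A minimal in-process sketch of the port bookkeeping in the steps above. The class name and the probe-for-a-free-port trick are illustrative assumptions; real code would keep the listener open and hand it to super_2, since closing and re-binding is racy.

```python
import socket

class HeaderRegistry:
    """Associate Extended-ORPort headers with a per-connection local port."""

    def __init__(self):
        self._by_port = {}

    def reserve_port(self, headers):
        # super_1: grab a free local port for this client connection.
        # (Probing and closing like this is racy; see caveat above.)
        s = socket.socket()
        s.bind(("127.0.0.1", 0))
        port = s.getsockname()[1]
        s.close()
        self._by_port[port] = headers
        return port

    def claim(self, port):
        # super_2: recover (and forget) the headers for the connection
        # that obfsproxy made back to this port.
        return self._by_port.pop(port)
```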
I wrote a simple implementation of obfs3-in-flashproxy. It's just two shell scripts tying the various programs together. It is even simpler than [comment:8 option (d)] above--it doesn't handle ExtORPort. But it works and bootstraps--I was using IRC through it today.
The scripts are less than 100 lines together, which I hope shows that at least the basic mechanics of chaining transports aren't that difficult. I call this implementation "Mk. I," with the idea that Mk. II will be (d) and Mk. III will be one that correctly handles the ordering of streams that connect back to the super-proxy.
The server component obfs-flash-server is running at tor1.bamsoftware.com:9500 (173.255.221.44:9500). To run the client part, use the included torrc:
One thing we didn't think about was the flash proxy facilitator. The reason you have to run your own flash proxy in the instructions above is that the facilitator doesn't know that it needs to give an obfs3-in-websocket relay to proxies that are serving certain clients. Obviously connecting to the plain websocket relay that the facilitator uses now won't work. We need to think of a way to allow a client to say that it wants to connect to a different kind of relay. For example, client registration messages, which currently only contain the client IP and port, could also contain the nested protocols the client wants to use. The facilitator would check to see if it knows of any relays having that nesting, and reply to proxies with an appropriate relay. I have some previous thinking along these lines in comment:2:ticket:5578, comment:5:ticket:7944.
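One possible shape for such an extended registration message, sketched in Python. The "transports" field name and its "|"-joined encoding are hypothetical; today's registrations carry only the client IP and port.

```python
import urllib.parse

def make_registration(client_addr, transport_chain):
    # client_addr is what registrations carry today; "transports" is the
    # hypothetical new field listing the nested protocols, outermost last.
    return urllib.parse.urlencode({
        "client": client_addr,
        "transports": "|".join(transport_chain),
    })
```

The facilitator would then match the "transports" value against the chains its known relays support before answering a proxy poll.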
The next easiest step, I think, is to do a Python reimplementation of the client side. It will do proper parsing (and error reporting) of PT stdout output, and remove the need for an external socat program.
Nice! I like your Mk. I incremental development idea.
infinity0, your idea is reasonable too, but I fear that spawning an obfsproxy for every new client might prove too kludgy in the future. Also, using different TCP ports for signalling seems hard to debug (and might be racy too, if not designed correctly).
I personally think that developing d), to have a system that works, and then planning to move to a) or b) in the future might be a reasonable plan.
> One thing we didn't think about was the flash proxy facilitator. The reason you have to run your own flash proxy in the instructions above is that the facilitator doesn't know that it needs to give an obfs3-in-websocket relay to proxies that are serving certain clients. Obviously connecting to the plain websocket relay that the facilitator uses now won't work. We need to think of a way to allow a client to say that it wants to connect to a different kind of relay. For example, client registration messages, which currently only contain the client IP and port, could also contain the nested protocols the client wants to use. The facilitator would check to see if it knows of any relays having that nesting, and reply to proxies with an appropriate relay. I have some previous thinking along these lines in comment:2:ticket:5578, comment:5:ticket:7944.
I made a Python version of the server at https://www.bamsoftware.com/git/obfs-flash.git, directory mki.v. This is now what's running at tor1.bamsoftware.com:9500 (173.255.221.44:9500). The instructions from comment:17 will still work.
This isn't quite Mk. II because it's not handling ExtORPort yet. But it is a bit nicer than the shell script proof of concept.
Could we have a Python implementation of the client? The nice thing is that the client won't need any further iterations: a Mk. I.V client will also work for Mk. II and Mk. III in the plan we have developed. The server and ExtORPort are the only tricky parts.
I will do the Python client when I have time if nobody beats me to it.
I like what's happening in legacy/trac#9376 (moved), but let's not let completely correct subprocess handling hold us back from having running code. Better to have a version with dirty SIGINT handling running now, and improve it in a later iteration.
Very good, thanks. I merged your changes into my repo.
These are my recommendations as to the next development steps for the client:
Make it use pyptlib. This will replace all the TOR_PT_MANAGED_TRANSPORT_VER and TOR_PT_CLIENT_TRANSPORTS stuff. Unfortunately it won't replace the CMETHOD read loop; you still have to do that by hand.
Implement the socat SOCKS part in Python. The program needs to implement a TCP listener and a SOCKS4 (don't do SOCKS5) client. This might be implemented as a Python thread running an accept loop, with new TCP connections being split off as Python threads.
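For reference, the SOCKS4 CONNECT request that such a client would have to emit is simple to construct. A real client must also read and validate the 8-byte reply; error handling is omitted here.

```python
import socket
import struct

def socks4_connect_request(dest_ip, dest_port, userid=b""):
    # VN=4, CD=1 (CONNECT), then DSTPORT, DSTIP, USERID, NUL terminator.
    return (struct.pack(">BBH", 4, 1, dest_port)
            + socket.inet_aton(dest_ip)
            + userid + b"\x00")
```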
Regarding https://github.com/infinity0/obfs-flash/commit/f490c7ee0cdf6ac8527c26bbe742e7304e9dc8b7, I think we may have to clear the environment in general in the subprocesses, or at least make sure to clear variables like TOR_PT_EXTENDED_SERVER_PORT. We are forcing the subproxies into non-ExtORPort mode because we don't want them connecting to the ExtORPort that is known to the super-proxy. In obfs-flash-server mki.v, I put PATH in the environment manually and it was enough to get path searching to work.
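A sketch of that environment scrubbing. The blanket TOR_PT_ prefix filter is an assumption; the comment above only names TOR_PT_EXTENDED_SERVER_PORT explicitly, but all the managed-proxy variables are candidates for removal for the same reason.

```python
import os

def subproxy_env(parent_env=None):
    """Environment for a spawned sub-proxy: keep PATH etc., drop TOR_PT_*."""
    env = dict(os.environ if parent_env is None else parent_env)
    # Strip the managed-proxy variables so the sub-proxy doesn't try to
    # connect to the ExtORPort that only the super-proxy should use.
    return {k: v for k, v in env.items() if not k.startswith("TOR_PT_")}
```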
Update from my side: I have updated the client to use pyptlib and twisted and txsocksx. There are a few more things to tidy up, such as a better CLI, and async error handling using Deferreds; otherwise it is more-or-less production-complete.
I could also factor out the CMETHOD read loop into the pyptlib library, as a twisted Protocol, since other plugins might want to use it.
Looking at the logic of obfs-flash.git, it seems that if we want a bundle that contains websocket, obfs2|websocket and obfs3|websocket, then we will have to spawn 3 flashproxy-clients, and each one of them will do its own registration.
That's fine, but each of those flashproxy-clients should be aware of the transport chain it supports, so that it can submit a proper client registration to the facilitator.
How should we do this? Should we add a CLI switch to flashproxy-client that looks like this: --transport=websocket|obfs3? Should we also be able to specify a default port for that transport so that we can ask people to forward a specific TCP port (like we currently do with DEFAULT_REMOTE_PORT)?
> Update from my side: I have updated the client to use pyptlib and twisted and txsocksx. There are a few more things to tidy up, such as a better CLI, and async error handling using Deferreds; otherwise it is more-or-less production-complete.
Thanks, I have merged your changes. For what it's worth, it's not running for me, using pyptlib 39d2a4fe from 17 August.
{{{
obfs-flash/mkii$ ./obfs-flash-client
Traceback (most recent call last):
  File "./obfs-flash-client", line 8, in <module>
    from pyptlib.client import ClientTransportPlugin
ImportError: cannot import name ClientTransportPlugin
}}}
> I could also factor out the CMETHOD read loop into the pyptlib library, as a twisted Protocol, since other plugins might want to use it.
My preference is not to move it to pyptlib (assume YAGNI). If it turns out later to be a good idea to move the functionality into a library, let's let it have been prototyped in a shipping application first.
> Looking at the logic of obfs-flash.git, it seems that if we want a bundle that contains websocket, obfs2|websocket and obfs3|websocket, then we will have to spawn 3 flashproxy-clients, and each one of them will do its own registration.
> That's fine, but each of those flashproxy-clients should be aware of the transport chain it supports, so that it can submit a proper client registration to the facilitator.
> How should we do this? Should we add a CLI switch to flashproxy-client that looks like this: --transport=websocket|obfs3? Should we also be able to specify a default port for that transport so that we can ask people to forward a specific TCP port (like we currently do with DEFAULT_REMOTE_PORT)?
Yeah, I like the --transport option. Although in this case I think it would be --transport=obfs3|websocket (opposite order to match earlier discussion). The default transport would be websocket. You have to add the option to all the registration helpers too (flashproxy-reg-email etc.). See build_register_command in flashproxy-client.
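An argparse sketch of the proposed flag. The "websocket" default matches the suggestion above; everything else about the flag (help text, validation) is an assumption.

```python
import argparse

parser = argparse.ArgumentParser(prog="flashproxy-client")
parser.add_argument("--transport", default="websocket",
                    help='transport chain, e.g. "obfs3|websocket"')

# Splitting on "|" recovers the chain, outermost transport last.
args = parser.parse_args(["--transport", "obfs3|websocket"])
chain = args.transport.split("|")
```

The same parsed value would be forwarded verbatim to the registration helpers via build_register_command.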
Knowledge of default ports should be built into the super-proxy, not flashproxy-client, I think. What I am imagining is a configuration file that shows what commands to run for the sub-proxies. flashproxy-client takes its remote port from the command line like this. The super-proxy will inform flashproxy-client of its remote port, and then flashproxy-client forwards that port to the registration helpers.
If legacy/trac#5426 (closed) were done, then the super-proxy could do registration, rather than flashproxy-client doing it.
My preference is not to move it to pyptlib (assume YAGNI). If it turns out later to be a good idea to move the functionality into a library, let's let it have been prototyped in a shipping application first.
Sure, I'll hold this off for now.
I should be able to finalise the client side by next week and make a start on legacy/trac#9349 (closed).
{{{
export GOPATH=$(pwd)
go get # fetches the pt library
make
}}}
This version is pretty much a port of the Python Mk. I.V. The next step is to have it open its own listener and connect to Tor using the ExtORPort. This is now what's running at tor1.bamsoftware.com:9500 (173.255.221.44:9500).
I pushed a rough draft of Mk. II that does extended ORPort. The program opens an external and an internal listener in addition to starting the two subproxies. The external listener listens on the bindaddr and forwards data to the proxy chain. The proxy chain is configured to finally connect to the internal listener. The internal listener forwards data to the (extended) ORPort. A FIFO queue (implemented as a Go channel) stores the external connecting IP addresses so they can be given to the extended ORPort by the internal listener.
It seems to do the extended USERADDR thing right, at least when testing locally:
{{{
Aug 29 19:45:09.000 [debug] connection_ext_or_process_inbuf(): Got Extended ORPort data.
Aug 29 19:45:09.000 [debug] connection_ext_or_handle_cmd_useraddr(): Received USERADDR. We rewrite our address from '[scrubbed]:35967' to '[scrubbed]:38715'.
Aug 29 19:45:09.000 [debug] connection_ext_or_process_inbuf(): Got Extended ORPort data.
Aug 29 19:45:09.000 [debug] connection_ext_or_process_inbuf(): Got Extended ORPort data.
Aug 29 19:45:09.000 [debug] connection_ext_or_process_inbuf(): Received DONE.
}}}
I still want to polish and refactor the code.
What I am most concerned about is the scenario where something connects to the external listener (and an address is queued), but something goes wrong in the proxy chain (for example, the WebSocket handshake is bad) and there's no following connection to the internal listener. We will have an orphan address in the queue that will be associated with the next connection that survives to the internal listener. In the current implementation, I suppose this will eventually result in deadlock, because the size of the FIFO is limited. Perhaps we can age and expire queue entries.
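One possible shape for the age-and-expire idea, sketched in Python for brevity even though Mk. II is in Go. The TTL value and class name are arbitrary assumptions.

```python
import collections
import time

class ExpiringQueue:
    """FIFO of queued client addresses whose entries lapse after a TTL."""

    def __init__(self, ttl=30.0):
        self.ttl = ttl
        self._q = collections.deque()  # (enqueue_time, addr) pairs

    def put(self, addr, now=None):
        self._q.append((time.time() if now is None else now, addr))

    def get(self, now=None):
        # Skip over addresses that waited too long for their internal
        # connection (the orphan case described above).
        now = time.time() if now is None else now
        while self._q:
            t, addr = self._q.popleft()
            if now - t <= self.ttl:
                return addr
        return None
```

Expiry only bounds how long an orphan can poison the queue; it does not by itself guarantee that an address matches the right data.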
> I pushed a rough draft of Mk. II that does extended ORPort.
Cool, the nested transports appear in bridge-stats:
{{{
bridge-ip-transports =8,obfs3_flash=8,websocket=168
}}}
I've added a basic CLI, and also the nifty feature of auto-selecting the local listen port for flashproxy so the user doesn't have to specify it. Available on my github repo.
At the moment, one still needs to manually pick a port for the 'Bridge obfs3_flash' line in torrc, which in theory should not be necessary since it's a local port that's only used locally, but I don't know of a practical way around this. (I don't know Tor well enough.)
I would say that the client side is finished, for now at least (since there are other things to do). It's ready to be tested in The Real World.
Maybe it shouldn't bother me so much, but it's a bit inconvenient to have this ticket wait for a pyptlib API change. I haven't tried the code yet because it doesn't work with the pyptlib in Debian. Unless the new pyptlib API is very close to being merged, is there a chance we could have the client use the old API? For it to get testing, I think depending on an unmerged API change is a real impediment.
It's close to being merged, at least most likely before we're going to distribute this, so I'd rather not do extra backporting work that will be eventually-pointless. I'll re-evaluate in a few weeks if it's still not merged.
If you don't want to bother with my git repo, I've made a .deb of the new pyptlib here:
It's backwards-compatible, so it shouldn't break anything else on your system. (There is no deb source package, since we haven't yet distributed a new pyptlib release from which to make a ".orig" tarball.)
> I've added a basic CLI, and also the nifty feature of auto-selecting the local listen port for flashproxy so the user doesn't have to specify it. Available on my github repo.
Thanks, I merged it.
> At the moment, one still needs to manually pick a port for the 'Bridge obfs3_flash' line in torrc, which in theory should not be necessary since it's a local port that's only used locally, but I don't know of a practical way around this. (I don't know Tor well enough.)
I think the way to do this is to put an extra SOCKS listener in front of obfsproxy, that eats Tor's SOCKS request and rewrites it into one pointing at the local SOCKS shim that points to flashproxy-client. But leaving in a hardcoded port number seems fine for now.
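Since flashproxy-client's shim speaks SOCKS4a, the rewriting step of such a listener could look roughly like the sketch below: parse Tor's CONNECT request and re-emit it pointing at a local port. This is a hypothetical illustration only (`rewrite_socks4a_request` is not part of obfs-flash, and real code would also need to relay the SOCKS reply and the subsequent data):

```python
import struct

def rewrite_socks4a_request(request, new_host, new_port):
    """Rewrite the destination of a SOCKS4a CONNECT request so it points
    at new_host:new_port, preserving the userid field.

    SOCKS4a layout: VN(1), CD(1), DSTPORT(2), DSTIP(4),
    NUL-terminated userid, then (when DSTIP is 0.0.0.x with x != 0)
    a NUL-terminated hostname.
    """
    vn, cd, dstport = struct.unpack(">BBH", request[:4])
    if vn != 4 or cd != 1:
        raise ValueError("not a SOCKS4 CONNECT request")
    # Keep the userid; discard the original destination hostname.
    userid, _, _rest = request[8:].partition(b"\x00")
    return (struct.pack(">BBH", 4, 1, new_port)
            + b"\x00\x00\x00\x01"    # DSTIP 0.0.0.1: hostname follows
            + userid + b"\x00"
            + new_host.encode("ascii") + b"\x00")
```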
> What I am most concerned about is the scenario when something connects to the external listener (and an address is queued), but something goes wrong in the proxy chain (for example the WebSocket handshake is bad) and there's no following connection to the internal listener. We will have an orphan address in the queue that will be associated with the next connection that survives to the internal listener. In the current implementation, I suppose this will eventually result in deadlock, because the size of the FIFO is limited. Perhaps we can age and expire queue entries.
I did tests today and it is indeed trivial to wedge the server just by making 10 non-WebSocket connections to it. Here is what you see in the case of a normal obfs3|websocket connection:
{{{
2013-09-10 18:46:30 external connection from [scrubbed].
2013-09-10 18:46:30 handleExternalConnection: now 1 conns buffered.
2013-09-10 18:46:31 internal connection from 127.0.0.1:35389.
2013-09-10 18:46:31 connecting to ORPort using remote addr [scrubbed].
2013-09-10 18:46:31 handleInternalConnection: now 0 conns buffered.
}}}
Notice how an external connection leads to an internal connection that reduces the number of conns buffered. Compare this to what happens if you just connect with netcat 12 times:
{{{
2013-09-13 15:49:39 external connection from [scrubbed].
2013-09-13 15:49:39 handleExternalConnection: now 1 conns buffered.
2013-09-13 15:49:53 external connection from [scrubbed].
2013-09-13 15:49:53 handleExternalConnection: now 2 conns buffered.
2013-09-13 15:52:00 external connection from [scrubbed].
2013-09-13 15:52:00 handleExternalConnection: now 3 conns buffered.
2013-09-13 15:52:03 external connection from [scrubbed].
2013-09-13 15:52:03 handleExternalConnection: now 4 conns buffered.
2013-09-13 15:52:04 external connection from [scrubbed].
2013-09-13 15:52:04 handleExternalConnection: now 5 conns buffered.
2013-09-13 15:52:06 external connection from [scrubbed].
2013-09-13 15:52:06 handleExternalConnection: now 6 conns buffered.
2013-09-13 15:52:08 external connection from [scrubbed].
2013-09-13 15:52:08 handleExternalConnection: now 7 conns buffered.
2013-09-13 15:52:10 external connection from [scrubbed].
2013-09-13 15:52:10 handleExternalConnection: now 8 conns buffered.
2013-09-13 15:52:12 external connection from [scrubbed].
2013-09-13 15:52:12 handleExternalConnection: now 9 conns buffered.
2013-09-13 15:52:14 external connection from [scrubbed].
2013-09-13 15:52:14 handleExternalConnection: now 10 conns buffered.
2013-09-13 15:52:18 external connection from [scrubbed].
2013-09-13 15:52:25 external connection from [scrubbed].
}}}
What I propose to do about this: Let's push conns on a stack, rather than a queue. When a new external connection comes in, it is put on top of the stack. When a new internal connection comes in, it is assigned the most recently seen address, the one at the top of the stack. Zombie connections that somehow didn't survive the proxy chain to make an internal connection will remain at the bottom of the stack, and never be assigned to any ExtORPort connection (and we can just prune them if they get too old).
Using a stack rather than a queue means that we are virtually certain to invert the order of two near-simultaneous external connections. But we weren't worried about that anyway. It means that we will quickly forget about addresses that connected to our proxy chain but didn't result in an ExtORPort connection.
I believe this will be good enough to prevent incorrect ExtORPort addresses, assuming that parties are honest or naive, and not malicious. That is, it will prevent assigning the address of someone who happened to just port scan the server, except for a narrow race window. (You would have to do your port scan or netcat after a legitimate external connection, but before that connection gets all the way through the proxy chain.) It's still vulnerable to malicious actors, say someone who makes constant non-WebSocket connections in an effort to poison the metrics data. But I'm not sure we can do better until Mk. III, when we do something to always correctly match up ExtORPort information.
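The stack-with-pruning idea can be sketched as a small class. This is a minimal sketch of the proposal, assuming stale entries can simply be dropped when they age out or when the stack is full (the class name and parameters are hypothetical, not the actual server code):

```python
import time

class AddrStack:
    """LIFO stack of external-connection addresses, matched to later
    internal connections.  Zombie entries at the bottom are pruned by
    age instead of wedging the whole pipeline."""

    def __init__(self, maxsize=10, ttl=60.0, clock=time.time):
        self.entries = []  # (timestamp, addr) pairs; top of stack is the end
        self.maxsize = maxsize
        self.ttl = ttl
        self.clock = clock

    def push(self, addr):
        """A new external connection: its address goes on top."""
        self._prune()
        if len(self.entries) >= self.maxsize:
            self.entries.pop(0)  # drop the oldest entry rather than block
        self.entries.append((self.clock(), addr))

    def pop(self):
        """A new internal connection: claim the most recent address."""
        self._prune()
        if not self.entries:
            return None
        return self.entries.pop()[1]

    def _prune(self):
        """Forget entries that never produced an internal connection."""
        now = self.clock()
        self.entries = [(t, a) for (t, a) in self.entries
                        if now - t <= self.ttl]
```

Note the trade-off discussed above: near-simultaneous external connections can have their order inverted, which is acceptable here.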
I implemented the stack idea and am now running it on the relay. See comment:17 for how to see the code and run a client against it.
I did some review and polishing of the server code and I think it's ready for beta-level testing now.
I think we should move the bamsoftware.com repository to git.torproject.org and call it something other than obfs-flash. I created legacy/trac#9743 (moved) for that.
Yesterday, with the help of dcf, I managed to use obfs-flash (using instructions from comment:17) to bootstrap a Tor client (using the facilitator from comment:40:ticket:9349). Some notes from my adventures:
txsocksx is a dependency of obfs-flash. We should mention that in a README or something, and maybe add it to a setup.py.
I was getting:
{{{
Traceback (most recent call last):
  File "./obfs-flash-client", line 270, in <module>
    sys.exit(main(*sys.argv[1:]))
  File "./obfs-flash-client", line 265, in main
    obfs3_flash(reactor, client, opts.obfs_out, opts.fp_remote)
  File "./obfs-flash-client", line 217, in obfs3_flash
    ["websocket"], [fp_client, '127.0.0.1:%s' % fp_local, fp_remote])
  File "./obfs-flash-client", line 189, in pt_launch_child
    StandardIO(sub_protocol, stdin=sub_proc.stdout.fileno(), reactor=reactor)
TypeError: __init__() got an unexpected keyword argument 'reactor'
}}}
Not sure why. I just removed the , reactor=reactor part.
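The TypeError suggests an older Twisted whose StandardIO constructor does not yet accept a reactor keyword. Rather than deleting the argument, a try/except fallback would keep both Twisted versions working. A sketch of the pattern with stand-in constructors (`construct_with_reactor_fallback` and the stand-ins are hypothetical, not obfs-flash code):

```python
def construct_with_reactor_fallback(ctor, *args, **kwargs):
    """Call ctor with all keyword arguments; if it rejects them with a
    TypeError (e.g. an older Twisted StandardIO without reactor=),
    retry without the 'reactor' keyword.  Caveat: this can also mask a
    TypeError raised for unrelated reasons, so it is only a sketch."""
    try:
        return ctor(*args, **kwargs)
    except TypeError:
        kwargs.pop("reactor", None)
        return ctor(*args, **kwargs)

# Stand-ins mimicking the old and new StandardIO signatures:
def old_standard_io(protocol, stdin=0):
    return ("old", protocol, stdin)

def new_standard_io(protocol, stdin=0, reactor=None):
    return ("new", protocol, stdin, reactor)
```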
If the reactor was never started and then obfs-flash-client fails, you get errors like this:
{{{
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/usr/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "/usr/local/lib/python2.7/dist-packages/pyptlib-0.0.5-py2.7.egg/pyptlib/util/subproc.py", line 184, in killall
    cleanup()
  File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 577, in stop
    "Can't stop reactor that isn't running.")
ReactorNotRunning: Can't stop reactor that isn't running.
Error in sys.exitfunc:
Traceback (most recent call last):
  File "/usr/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "/usr/local/lib/python2.7/dist-packages/pyptlib-0.0.5-py2.7.egg/pyptlib/util/subproc.py", line 184, in killall
    cleanup()
  File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 577, in stop
    "Can't stop reactor that isn't running.")
twisted.internet.error.ReactorNotRunning: Can't stop reactor that isn't running.
}}}
I added a commit allowing obfs-flash-client to pass --transport and arbitrary arguments to flashproxy-client. You can customise the torrc based on the example given, and obfs-flash-client will now do the registration automatically!
I've found a problem with the current set-up. I'm not sure why it's happening, but it's something special about our plugin composition.
If I disconnect my browser proxy and reconnect it (by reloading the page), with flashproxy-client Tor is able to recover and I can continue browsing. However, with obfs-flash-client, Tor is unable to do this and I get a bunch of error messages like this:
{{{
Oct 11 19:32:59.000 [notice] We tried for 15 seconds to connect to '[scrubbed]' using exit X1 at A1. Retrying on a new circuit. [x4]
Oct 11 19:33:14.000 [notice] We tried for 15 seconds to connect to '[scrubbed]' using exit X2 at A2. Retrying on a new circuit. [x4]
Oct 11 19:33:29.000 [notice] We tried for 15 seconds to connect to '[scrubbed]' using exit X3 at A3. Retrying on a new circuit. [x4]
}}}
Would either of you know why this is happening? To reproduce:
After talking with Ximin, I changed the transport method names used by the client and server.
Client "obfs3_flash" changes to "obfs3_flashproxy".
Server "obfs3_flash" changes to "obfs3_websocket".
This is to match what we do without the obfs layer, with "flashproxy" on the client and "websocket" on the server. It also reads well: it looks like we just pasted together existing transport names with "_".
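A hypothetical torrc sketch of how the renamed methods might appear (the binary paths and the local port are placeholder assumptions, not tested configuration):

```
## Client side: the composed transport is now called "obfs3_flashproxy".
ClientTransportPlugin obfs3_flashproxy exec /usr/local/bin/obfs-flash-client
## 1234 is a placeholder for the manually chosen local port.
Bridge obfs3_flashproxy 127.0.0.1:1234

## Server side: the composed transport is now called "obfs3_websocket".
ServerTransportPlugin obfs3_websocket exec /usr/local/bin/obfs-flash-server
```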
Never mind, I can't close this because child tickets are still open. I'll just note that this is complete except for the specific issues in the child tickets.