(Quoting text written by David Fifield for this ticket description.)
Find out what current DPI capabilities are with respect to WebSocket, at least through product literature.
Find out what existing, popular, WebSocket applications are used (chat, video, games?) that will be collateral damage to block. Write a short report on 1) how common they are, and 2) what their traffic looks like.
Implement a transport with an obfs2 stream transported over WebSocket.
We can imagine a new "obfs2-in-websocket" transport, but it might be a better design to allow chaining of proxies that don't necessarily have to know about one another. So you might have something like this on the client:
{{{
ServerTransportPlugin websocket proxy 127.0.0.1:9901
ServerTransportPlugin obfs2 exec /usr/local/bin/obfsproxy --managed
# And then some new configuration to say that things received on
# port 9901 need to be forwarded to the local obfsproxy port.
# Port 9901 won't be able to be used for plain websocket
# connections, and I guess this will have to be reflected in the
# descriptor somewhere.
}}}
See what other obfuscation possibilities exist. I don't think that TLS-wrapped WebSockets work for us (http://archives.seul.org/or/talk/Oct-2012/msg00190.html), but I haven't thought about it exhaustively. Replacing WebSocket with HTTP requests (the flash proxy POSTs bodies to both the client and the relay, and receives response bodies) would likely work, and would allow fuller control of the payloads (whereas with WebSocket we cannot escape the WebSocket framing). We gave up on using Flash, but Flash sockets allow us to control exactly what goes on the wire, except for an initial cross-domain request.
After discussions in the dev meeting, consensus is:
a) First, investigate the option of simply using the code of obfs2 in flashproxy. This involves librarifying obfs2 in some way, and/or implementing flashproxy within the pyobfsproxy framework. This should be the easiest way of doing this ticket.
b) Then, investigate SOCKS proxy chaining as initially suggested in this ticket. This involves implementing legacy/trac#8402 (moved), speccing out a way to do the chaining (Tor should pass the chain to the PT proxy somehow), speccing out a way to collect statistics on chained transports, and implementing the whole thing. This is more painful. There are also other problems, like distinguishing possible transport combinations from impossible ones (e.g. obfs2|websocket, which is doable and useful, from websocket|obfs2, which doesn't make much sense).
c) Finally, we should start researching a multilayer pluggable transport theory, in the same way that regular network protocols are implemented. This means that content-shaping should be on the bottom (obfs2, obfs3, packet-length shapers, etc.), appearance-shaping should be above (like an HTTP transport, a base64 transport, etc.) and a transport layer should be on top (like flashproxy, skypemorph, webrtc, regular socket, etc.). This is more research-y, but if done correctly it might allow greater reuse of transports, and will cause an explosion on the number of pluggable transports we have.
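To make the layering idea concrete, here is a toy Python sketch (not any real transport code) in which each layer is an encode/decode pair that composes with the one above it. The XOR "obfuscation" and the base64 "appearance" layer are stand-ins chosen purely for illustration:

```python
import base64

# Content-shaping layer (bottom): stand-in "obfuscator" using XOR.
def obfs_encode(data, key=0x5A):
    return bytes(b ^ key for b in data)

def obfs_decode(data, key=0x5A):
    return bytes(b ^ key for b in data)  # XOR is its own inverse

# Appearance-shaping layer (middle): make the bytes look like base64 text.
def b64_encode(data):
    return base64.b64encode(data)

def b64_decode(data):
    return base64.b64decode(data)

# The transport layer (top: flashproxy, a plain socket, ...) would carry
# whatever send() produces, and hand recv() whatever arrives.
def send(payload):
    return b64_encode(obfs_encode(payload))

def recv(wire):
    return obfs_decode(b64_decode(wire))
```

A real framework would also have to negotiate which layers each side supports, but the composition shape is the same.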
> After discussions in the dev meeting, consensus is:
> a) First, investigate the option of simply using the code of obfs2 in flashproxy. This involves librarifying obfs2 in some way, and/or implementing flashproxy within the pyobfsproxy framework. This should be the easiest way of doing this ticket.
I started looking into integrating flashproxy within obfsproxy. Unfortunately, the obfsproxy architecture is quite conservative (pure client/server model: the client opens a SOCKS server, the client listens upstream, the server listens downstream, and the focus is on traffic obfuscation), whereas flashproxy is much more avant-garde (both client and server listen on the downstream side, and the focus is on IP address obfuscation rather than traffic obfuscation, etc.).
This means that integrating flashproxy in obfsproxy will require a big refactoring, which will not necessarily be helpful in the future when integrating other PTs. For example, looking at legacy/trac#7153 (moved) it seems that the suggested changes are hand-tailored for specific PTs, without having big reusable value.
When I communicated my thoughts to David, we brainstormed on something between ideas b) and c) of comment:2. That is, maybe we shouldn't try to integrate both PTs into one, but we should find a nice way to use them both at the same time. For example, from the PoV of obfsproxy, flashproxy should just be a magic socket that transports the bytes to the correct location. And from the PoV of flashproxy, obfsproxy should just be a magic obfuscator that encodes/decodes bytes. That is, we are moving to a more layered approach, which I think is a good way to think of and develop network protocols.
Some things to think about:
a) How should we pass bytes between obfsproxy and flashproxy? Should it be SOCKS chaining, or just pipe bytes from one program to another?
b) Who should manage these two programs? In the original post of this ticket, it was suggested that Tor would manage those two programs. Another approach would be to have obfsproxy spawn and manage flashproxy. Without thinking much about it, I think the latter approach is also worth considering, since managing subprocesses in Python is probably easier than in C, and furthermore Tor would never need to learn about this (obfsproxy would just expose a new transport name (e.g. obfsflashproxy) that would do the obfsproxy+flashproxy combination internally.)
> When I communicated my thoughts to David, we brainstormed on something between ideas b) and c) of comment:2. That is, maybe we shouldn't try to integrate both PTs into one, but we should find a nice way to use them both at the same time. For example, from the PoV of obfsproxy, flashproxy should just be a magic socket that transports the bytes to the correct location. And from the PoV of flashproxy, obfsproxy should just be a magic obfuscator that encodes/decodes bytes. That is, we are moving to a more layered approach, which I think is a good way to think of and develop network protocols.
> Some things to think about:
> a) How should we pass bytes between obfsproxy and flashproxy? Should it be SOCKS chaining, or just pipe bytes from one program to another?
SOCKS chaining is probably where we need to go in the future. Perhaps easier is to strip out the SOCKS handler from flashproxy-client and configure obfsproxy to connect directly to flashproxy-client's LOCAL port. (See the flashproxy-client man page; LOCAL normally receives SOCKS from Tor; REMOTE receives WebSocket from the Internet.)
Then configure Tor to connect to the client transport at 127.0.0.1:22222 and make a SOCKS request for 127.0.0.1:11111.
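In torrc terms, that client-side wiring might look something like the following fragment. This is a hedged sketch, not tested configuration: the port numbers are the example values above, and the transport name is illustrative.

{{{
# Tor speaks SOCKS4 to obfsproxy's listener on 22222, naming
# flashproxy-client's LOCAL port 11111 as the "bridge" to reach.
ClientTransportPlugin obfs2 socks4 127.0.0.1:22222
Bridge obfs2 127.0.0.1:11111
}}}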
> b) Who should manage these two programs? In the original post of this ticket, it was suggested that Tor would manage those two programs. Another approach would be to have obfsproxy spawn and manage flashproxy. Without thinking much about it, I think the latter approach is also worth considering, since managing subprocesses in Python is probably easier than in C, and furthermore Tor would never need to learn about this (obfsproxy would just expose a new transport name (e.g. obfsflashproxy) that would do the obfsproxy+flashproxy combination internally.)
I think we want a third program (say obfs-flash-proxy) that controls both flashproxy and obfsproxy parts. Tor will call obfs-flash-proxy as if that program were implementing the entire transport itself. obfs-flash-proxy will fork the other two programs and connect them however necessary.
As for how to provide command-line options to the flashproxy and obfsproxy parts, well, hmm, I dunno yet.
Also, for the server transport, we have no SOCKS or anything, and so have to do something different. Maybe ExtORPort chaining? Or give up on the extended ORPort and just do ORPort chaining of server transports, which will be really easy.
This is obviously not fully general, but in the interest of meeting deadlines perhaps it shouldn't be. It is simple enough that it will clearly work, and it gives us something to compare a more general framework against in the future. "Do the simplest thing that could possibly work" etc.
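As a toy illustration of the wrapper idea (not the real obfs-flash-proxy), the sketch below forks two sub-processes and pipes one's output into the other's input. The sub-commands are stand-ins invoked via the Python interpreter so the sketch is self-contained; real sub-proxies would be wired together over local TCP sockets instead.

```python
import subprocess
import sys

def run_chain(data):
    # First "proxy": uppercase the stream (stand-in for obfuscation).
    p1 = subprocess.Popen(
        [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read().upper())"],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
    # Second "proxy": reverse it (stand-in for transport framing),
    # reading directly from the first proxy's stdout.
    p2 = subprocess.Popen(
        [sys.executable, "-c",
         "import sys; sys.stdout.write(sys.stdin.read()[::-1])"],
        stdin=p1.stdout, stdout=subprocess.PIPE, text=True)
    p1.stdin.write(data)
    p1.stdin.close()
    p1.stdout.close()  # let p1's output flow only to p2
    out, _ = p2.communicate()
    return out
```

The wrapper is the only thing Tor has to know about; the chain's internal plumbing stays hidden behind it.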
We've been having some problems with the server side, since |obfsproxy| is no longer aware of the client's IP address, and can't report it using the Extended ORPort.
This means that the super-proxy is probably going to do the Extended ORPort handshake, since it's the entity that knows the IP address of the client (it also knows the name of the combined pluggable transport used).
"How does |super_2| match up the connections it receives from |obfsproxy| with the connections received in |super_1|?"
We have a few options. Unfortunately, all of them seem ugly.
a) Make both |flashproxy| and |obfsproxy| Extended ORPort listeners, and chain up Extended ORPort connections.
b) Implement yet-another-metadata protocol for this use case. So for example, |super_1| prepends the client IP to the traffic flow (with a delimiter) and both |flashproxy| and |obfsproxy| pass it on to |super_2|, who does the Extended ORPort connection with the info.
c) Have |super_2| bind to different local ports to denote different clients. So for example, when |super_1| receives connection X, it asks the transport proxies to forward this connection to a specific port number that |super_2| binds to. This way |super_2| knows that data received on that port is from connection X. Different connections use different ports.
d) Put client IPs in a queue inside |super| and when |super_2| connects to the Extended ORPort, pop the queue and report that IP. Of course, this doesn't guarantee that client IPs will match with the correct data :) Tor doesn't care about this currently, but it might in the future.
The above solutions are hacky and/or cheap and/or evil.
What other solutions exist?
> We have a few options. Unfortunately, all of them seem ugly.
Just to keep things exciting: remember that we may eventually want to implement rate limiting in pluggable transports -- that is, to share Tor's overall bandwidth buckets between Tor and the network-facing transports. So this isn't just about statistics reporting.
I am new here, so I am still trying to form a mental model to reason about this system. Here's what I have so far; let me know if it's correct and/or useful.
For preciseness, I am going to state, explicitly, the IN-interface and OUT-interface of each component. (This will also help with any general plugin-composition framework later on.) For example, this is what a typical normal Tor channel, without transport plugins, looks like:
where A=X means a channel A using protocol X, and X(U) means something that wraps U. A could be e.g. an inet socket, or a file descriptor, or some combination of these. To chain stuff correctly, we must ensure the OUT-interface of one component matches the IN-interface of the next component. (Technically we should write Z=SOCKS(U), D=U for the first and last components, but since we're not using Z, D anywhere I'll just leave them out.)
With a general transport plugin, the diagram from the client to the server is:
For obfs-flash-proxy, then, an incomplete "first attempt" using naive layering:
{{{
:SOCKS(U)|tor-c|A=SOCKS(U): ->
:A=SOCKS(U)|obfs-c|B=obfs(U): ->
:B=OU|super|C=SOCKS(OU): -> # where OU == obfs(U)
:C=SOCKS(OU)|flash-c|D=flash(OU): ->
:D=flash(OU)|flash-s|P=SOCKS(OU),Q=control(D): ->
??
:R=obfs(U)|obfs-s|S=SOCKS(U),T=control(R): ->
:S=SOCKS(U),T=control(R)|tor-s|Tor(U):
}}}
So we have two problems here:
1. How to implement ?? - we need a component that reads input from an OUT-interface of {SOCKS(X),control(D)}, and writes output to an IN-interface of {X}.
2. As written, using this naive layering architecture, we have no way of connecting the control(D) channel to anything. What we really want to do is turn the T=control(R) into a T=control(D), since the former is an implementation detail and the latter is the user-originated data that we need to actually control.
I'll need some more time working out the precise details, but a quick read gives me the impression that the super_1/super_2 solution proposed above solves these two problems by re-architecting the OU that passes through the |flash-s| component into something like OU+D.
Does this make sense, am I correct, and/or did I go overboard with the notation? :p
So using my notation, the super_1/super_2 architecture becomes:
{{{
:SOCKS(U)|tor-c|A=SOCKS(U): ->
:A=SOCKS(U)|obfs-c|B=obfs(U): ->
:B=OU|super|C=SOCKS(OU): -> # where OU == obfs(U)
:C=SOCKS(OU)|flash-c|D=flash(OU): ->
internet
:D=flash(OU)|super_1|E=flash(D+OU): ->
:E=flash(D+OU)|flash-s|P=SOCKS(D+OU),Q=control(E): ->
:P=SOCKS(D+OU),Q=control(E)|super_2|R=D+OU: ->
:R=D+obfs(U)|modified obfs-s|S=SOCKS(U),T=control(D): -> # typo corrected from previous post
:S=SOCKS(U),T=control(D)|tor-s|Tor(U):
}}}
The nice thing about this is that we have a proof (I think :p) that |obfs-s| does need to be modified for the obfs-flash-proxy combo (at least if we go down this layering route of trying to re-use |obfs-s| and |flash-s|), since there is no other way of passing D through to the underlying |tor-s|.
I should be less hasty claiming "proofs"; there are like a million ways of doing this. My diagram above was actually different from all the solutions in asn's previous post; here are the corresponding diagrams for each. Importantly, we always need a shim between flashproxy and obfsproxy, simply because their input/output interfaces are incompatible (one is SOCKS, the other isn't) - I've called it super_m.
"a) Make both |flashproxy| and |obfsproxy| Extended ORPort listeners,"
(Actually, |flash-s| doesn't need to be a listener, in this case.)
"b) Implement yet-another-metadata protocol for this use case."
{{{
:D=flash(OU)|super_1|E=flash(ODU): -> # where ODU == obfs(D+U), D+U being the "custom protocol". Note, this is slightly different from the one I drew above, but is similar in intent.
:E=flash(ODU)|flash-s|P=SOCKS(ODU),Q=control(E): ->
:P=SOCKS(ODU),Q|super_m|R=ODU: ->
:R=obfs(D+U)|obfs-s|S=SOCKS(D+U),T=control(R): ->
:S=SOCKS(D+U),T|super_2|S'=SOCKS(U),T'=control(D): ->
:S'=SOCKS(U),T'=control(D)|tor-s|Tor(U):
}}}
"c) Have |super_2| bind to different local ports to denote different clients."
This is just a variation of (b) where super_1 can somehow (outside of the diagram) dynamically change the port that |obfs-s| connects to.
"d) Put client IPs in a queue inside |super| and when |super_2| connects to the Extended ORPort pop the queue and report that IP."
Sounds like incorrect behaviour to me...?
Solution (a) seems the cleanest and most re-usable, because of the way the general transport plugin architecture works. If we are to have arbitrary composable plugins, the easiest way is to require them to be an Extended ORPort listener, and pass through data appropriately. This could be added to a library to make things easier for developers.
I agree that (a) seems the best of those listed. But it means that every proxy must accept an extended ORPort form of itself, which is a bit annoying. It's not so different from (b), which also requires that every proxy support a new protocol: the only difference here is that the protocol is one we already designed.
(c) is probably easiest, but needs a bunch of ports if the clients are slow to connect. Also, wouldn't you need to make the client use separate target ports for each incoming connection, to tell them apart? I don't believe there's a way to do that with the current protocol without launching a bunch of clients.
(b) seems like it might be a good idea, but it will be the hardest. It might be something to think about in the future, but if you're going to do it, do it in some way that avoids using sockets needlessly -- maybe by allowing the plugins to do this kind of communication over a preconfigured pipe, or an in-memory API, or something. But that's harder, and could wait IMO.
I'm also between (a) and (b). (c) seems a bit hacky and might be painful to reproduce bugs, debug, etc.
For both (a) and (b) we will probably have to add another mode (let's call it super_server for now) to our transports, so that their server-side listener acts like an Extended ORPort (or another metadata protocol). We will also have to change the pt-spec so that we can suggest that mode to our transports using the managed-proxy configuration protocol.
I suspect that a light variant of (b) might be easier to implement than (a):
|internet| -> |super_1| -> |flashproxy| -> |obfsproxy| -> |super_2| -> |Extended ORPort|
For example, for every connection that super_1 receives, it prepends <ip>:<super-transport>:DELIMITERDELIMITERDELIMITERDELIMITER to the traffic flow and passes it on to flashproxy. Then flashproxy and obfsproxy, when in super_server mode, are instructed to ignore the first chunk of data of the connection (up until DELIMITERDELIMITERDELIMITERDELIMITER). When super_2 receives that first chunk, it parses it, gets the relevant information and removes it from the traffic flow. It then does an Extended ORPort client connection to Tor.
The above idea is not entirely elegant but it might be easier to implement than (a).
(a) is more future-proof but might be messy when the Extended ORPort authentication part comes into the game. We will have to find a place to store the Extended ORPort auth cookies, and also find a name format that will allow the other transport proxies to find the correct cookie for each Extended ORPort.
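The light variant of (b) above can be sketched in a few lines of Python. The delimiter value and the "<ip>:<super-transport>:" header layout are just the ones suggested in this comment, IPv4-only for simplicity; nothing here is a settled spec.

```python
DELIM = b"DELIMITER" * 4

def add_metadata(client_ip, transport, payload):
    # super_1: prepend "<ip>:<super-transport>:" plus the delimiter to the
    # first chunk; flashproxy/obfsproxy in super_server mode pass it through.
    return ("%s:%s:" % (client_ip, transport)).encode("ascii") + DELIM + payload

def strip_metadata(first_chunk):
    # super_2: parse and remove the metadata before doing the Extended
    # ORPort client connection to Tor.
    head, sep, body = first_chunk.partition(DELIM)
    if not sep:
        raise ValueError("no metadata delimiter in first chunk")
    client_ip, transport, _ = head.decode("ascii").split(":", 2)
    return client_ip, transport, body
```

A real version would also have to cope with the delimiter straddling a read boundary, and with hostile payloads that contain the delimiter bytes.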
Summary from IRC today: we decided to try to implement (d) first because it's the simplest, even though it's only correct when certain assumptions hold, about how Tor actually uses the metadata/headers from the ORPort protocol.
Additionally, I have another variant which I was trying to communicate on IRC earlier, but I'm not sure if I did a good job of it. I don't think it's much more complex than (d), and is fully correct without relying on the above assumptions. It mixes ideas from (a) and (c), using different ports to distinguish between clients, in order to avoid both using a new custom protocol (as in (b)), or requiring obfsproxy to be modified to accept ORPort (as in (a)).
To recap, for (c) we need to somehow "tell" obfsproxy to send its output to different ports. One way to achieve this is to start a new instance of obfsproxy for each client connection, telling it to listen on a different port. To avoid having to start a new instance of flashproxy too, we move the |super_1| component after it. So the diagram looks like this:
1. super_1 reads ORPort from flashproxy, stores all headers (COMMANDs, using the terminology of spec 196) and associates this info with some random port P.
2. super_1 tells super_2 to listen on port P (can do this in-process).
3. super_1 opens a new instance of obfsproxy, with its ORPort set to P.
4. super_1 sends BODY to obfsproxy.
5. obfsproxy processes this and sends ORPort output to super_2, on port P.
6. super_2 reads ORPort from obfsproxy on port P, looks up the headers associated with P and sends these to Tor.
7. super_2 skips the headers from the incoming channel (since they are from obfsproxy and not flashproxy), then forwards BODYLEN/BODY directly on to Tor.
In summary, super_1 receives extended ORPort protocol, and outputs raw data; super_2 receives and outputs extended ORPort protocol.
Advantages:
super_1/super_2 do not need to know what flashproxy/obfsproxy do - (except, super_1 needs to know how to start up obfsproxy, but any super-proxy must know this info)
flashproxy/obfsproxy do not need to be changed
Disadvantages:
a new obfsproxy process is created for each client connection.
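A minimal in-process sketch of the port bookkeeping in the steps above. The class name and the probe-for-a-free-port trick are illustrative assumptions; real code would keep the listener open and hand it to super_2, since closing and re-binding is racy.

```python
import socket

class HeaderRegistry:
    """Associate Extended-ORPort headers with a per-connection local port."""

    def __init__(self):
        self._by_port = {}

    def reserve_port(self, headers):
        # super_1: grab a free local port for this client connection.
        # (Probing and closing like this is racy; see caveat above.)
        s = socket.socket()
        s.bind(("127.0.0.1", 0))
        port = s.getsockname()[1]
        s.close()
        self._by_port[port] = headers
        return port

    def claim(self, port):
        # super_2: recover (and forget) the headers for the connection
        # that obfsproxy made back to this port.
        return self._by_port.pop(port)
```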
I wrote a simple implementation of obfs3-in-flashproxy. It's just two shell scripts tying the various programs together. It is even simpler than [comment:8 option (d)] above--it doesn't handle ExtORPort. But it works and bootstraps--I was using IRC through it today.
The scripts are less than 100 lines together, which I hope shows that at least the basic mechanics of chaining transports aren't that difficult. I call this implementation "Mk. I," with the idea that Mk. II will be (d) and Mk. III will be one that correctly handles the ordering of streams that connect back to the super-proxy.
The server component obfs-flash-server is running at tor1.bamsoftware.com:9500 (173.255.221.44:9500). To run the client part, use the included torrc:
One thing we didn't think about was the flash proxy facilitator. The reason you have to run your own flash proxy in the instructions above is that the facilitator doesn't know that it needs to give an obfs3-in-websocket relay to proxies that are serving certain clients. Obviously connecting to the plain websocket relay that the facilitator uses now won't work. We need to think of a way to allow a client to say that it wants to connect to a different kind of relay. For example, client registration messages, which currently only contain the client IP and port, could also contain the nested protocols the client wants to use. The facilitator would check to see if it knows of any relays having that nesting, and reply to proxies with an appropriate relay. I have some previous thinking along these lines in comment:2:ticket:5578, comment:5:ticket:7944.
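One possible shape for such an extended registration message, sketched in Python. The "transports" field name and its "|"-joined encoding are hypothetical; today's registrations carry only the client IP and port.

```python
import urllib.parse

def make_registration(client_addr, transport_chain):
    # client_addr is what registrations carry today; "transports" is the
    # hypothetical new field listing the nested protocols, outermost last.
    return urllib.parse.urlencode({
        "client": client_addr,
        "transports": "|".join(transport_chain),
    })
```

The facilitator would then match the "transports" value against the chains its known relays support before answering a proxy poll.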
The next easiest step, I think, is to do a Python reimplementation of the client side. It will do proper parsing (and error reporting) of PT stdout output, and remove the need for an external socat program.
Nice! I like your Mk. I incremental development idea.
infinity0, your idea is reasonable too, but I fear that spawning an obfsproxy for every new client might prove too kludgy in the future. Also, using different TCP ports for signalling seems hard to debug (and might be racy too, if not designed correctly).
I personally think that developing d), to have a system that works, and then planning to move to a) or b) in the future might be a reasonable plan.
> One thing we didn't think about was the flash proxy facilitator. The reason you have to run your own flash proxy in the instructions above is that the facilitator doesn't know that it needs to give an obfs3-in-websocket relay to proxies that are serving certain clients. Obviously connecting to the plain websocket relay that the facilitator uses now won't work. We need to think of a way to allow a client to say that it wants to connect to a different kind of relay. For example, client registration messages, which currently only contain the client IP and port, could also contain the nested protocols the client wants to use. The facilitator would check to see if it knows of any relays having that nesting, and reply to proxies with an appropriate relay. I have some previous thinking along these lines in comment:2:ticket:5578, comment:5:ticket:7944.
I made a Python version of the server at https://www.bamsoftware.com/git/obfs-flash.git, directory mki.v. This is now what's running at tor1.bamsoftware.com:9500 (173.255.221.44:9500). The instructions from comment:17 will still work.
This isn't quite Mk. II because it's not handling ExtORPort yet. But it is a bit nicer than the shell script proof of concept.
Could we have a Python implementation of the client? The nice thing is that the client won't need any further iterations: a Mk. I.V client will also work for Mk. II and Mk. III in the plan we have developed. The server and ExtORPort are the only tricky parts.
I will do the Python client when I have time if nobody beats me to it.
I like what's happening in legacy/trac#9376 (moved), but let's not let completely correct subprocess handling hold us back from having running code. Better to have a version with dirty SIGINT handling running now, and improve it in a later iteration.
Very good, thanks. I merged your changes into my repo.
These are my recommendations as to the next development steps for the client:
Make it use pyptlib. This will replace all the TOR_PT_MANAGED_TRANSPORT_VER and TOR_PT_CLIENT_TRANSPORTS stuff. Unfortunately it won't replace the CMETHOD read loop; you still have to do that by hand.
Implement the socat SOCKS part in Python. The program needs to implement a TCP listener and a SOCKS4 (don't do SOCKS5) client. This might be implemented as a Python thread running an accept loop, with new TCP connections being split off as Python threads.
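For reference, the SOCKS4 CONNECT request that such a client would have to emit is simple to construct. A real client must also read and validate the 8-byte reply; error handling is omitted here.

```python
import socket
import struct

def socks4_connect_request(dest_ip, dest_port, userid=b""):
    # VN=4, CD=1 (CONNECT), then DSTPORT, DSTIP, USERID, NUL terminator.
    return (struct.pack(">BBH", 4, 1, dest_port)
            + socket.inet_aton(dest_ip)
            + userid + b"\x00")
```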
Regarding https://github.com/infinity0/obfs-flash/commit/f490c7ee0cdf6ac8527c26bbe742e7304e9dc8b7, I think we may have to clear the environment in general in the subprocesses, or at least make sure to clear variables like TOR_PT_EXTENDED_SERVER_PORT. We are forcing the subproxies into non-ExtORPort mode because we don't want them connecting to the ExtORPort that is known to the super-proxy. In obfs-flash-server mki.v, I put PATH in the environment manually and it was enough to get path searching to work.
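A sketch of that environment scrubbing. The blanket TOR_PT_ prefix filter is an assumption; the comment above only names TOR_PT_EXTENDED_SERVER_PORT explicitly, but all the managed-proxy variables are candidates for removal for the same reason.

```python
import os

def subproxy_env(parent_env=None):
    """Environment for a spawned sub-proxy: keep PATH etc., drop TOR_PT_*."""
    env = dict(os.environ if parent_env is None else parent_env)
    # Strip the managed-proxy variables so the sub-proxy doesn't try to
    # connect to the ExtORPort that only the super-proxy should use.
    return {k: v for k, v in env.items() if not k.startswith("TOR_PT_")}
```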
Update from my side: I have updated the client to use pyptlib and twisted and txsocksx. There are a few more things to tidy up, such as a better CLI, and async error handling using Deferreds; otherwise it is more-or-less production-complete.
I could also factor out the CMETHOD read loop into the pyptlib library, as a twisted Protocol, since other plugins might want to use it.
Looking at the logic of obfs-flash.git, it seems that if we want a bundle that contains websocket, obfs2|websocket and obfs3|websocket, then we will have to spawn 3 flashproxy-clients, and each one of them will do its own registration.
That's fine, but each of those flashproxy-clients should be aware of the transport chain it supports, so that it can submit a proper client registration to the facilitator.
How should we do this? Should we add a CLI switch to flashproxy-client that looks like this: --transport=websocket|obfs3? Should we also be able to specify a default port for that transport so that we can ask people to forward a specific TCP port (like we currently do with DEFAULT_REMOTE_PORT)?
> Update from my side: I have updated the client to use pyptlib and twisted and txsocksx. There are a few more things to tidy up, such as a better CLI, and async error handling using Deferreds; otherwise it is more-or-less production-complete.
Thanks, I have merged your changes. For what it's worth, it's not running for me, using pyptlib 39d2a4fe from 17 August.
{{{
obfs-flash/mkii$ ./obfs-flash-client
Traceback (most recent call last):
  File "./obfs-flash-client", line 8, in <module>
    from pyptlib.client import ClientTransportPlugin
ImportError: cannot import name ClientTransportPlugin
}}}
> I could also factor out the CMETHOD read loop into the pyptlib library, as a twisted Protocol, since other plugins might want to use it.
My preference is not to move it to pyptlib (assume YAGNI). If it turns out later to be a good idea to move the functionality into a library, let's let it have been prototyped in a shipping application first.
> Looking at the logic of obfs-flash.git, it seems that if we want a bundle that contains websocket, obfs2|websocket and obfs3|websocket, then we will have to spawn 3 flashproxy-clients, and each one of them will do its own registration.
> That's fine, but each of those flashproxy-clients should be aware of the transport chain it supports, so that it can submit a proper client registration to the facilitator.
> How should we do this? Should we add a CLI switch to flashproxy-client that looks like this: --transport=websocket|obfs3? Should we also be able to specify a default port for that transport so that we can ask people to forward a specific TCP port (like we currently do with DEFAULT_REMOTE_PORT)?
Yeah, I like the --transport option. Although in this case I think it would be --transport=obfs3|websocket (opposite order to match earlier discussion). The default transport would be websocket. You have to add the option to all the registration helpers too (flashproxy-reg-email etc.). See build_register_command in flashproxy-client.
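An argparse sketch of the proposed flag. The "websocket" default matches the suggestion above; everything else about the flag (help text, validation) is an assumption.

```python
import argparse

parser = argparse.ArgumentParser(prog="flashproxy-client")
parser.add_argument("--transport", default="websocket",
                    help='transport chain, e.g. "obfs3|websocket"')

# Splitting on "|" recovers the chain, outermost transport last.
args = parser.parse_args(["--transport", "obfs3|websocket"])
chain = args.transport.split("|")
```

The same parsed value would be forwarded verbatim to the registration helpers via build_register_command.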
Knowledge of default ports should be built into the super-proxy, not flashproxy-client, I think. What I am imagining is a configuration file that shows what commands to run for the sub-proxies. flashproxy-client takes its remote port from the command line like this. The super-proxy will inform flashproxy-client of its remote port, and then flashproxy-client forwards that port to the registration helpers.
If legacy/trac#5426 (closed) were done, then the super-proxy could do registration, rather than flashproxy-client doing it.
My preference is not to move it to pyptlib (assume YAGNI). If it turns out later to be a good idea to move the functionality into a library, let's let it have been prototyped in a shipping application first.
Sure, I'll hold this off for now.
I should be able to finalise the client side by next week and make a start on legacy/trac#9349 (closed).
{{{
export GOPATH=$(pwd)
go get # fetches the pt library
make
}}}
This version is pretty much a port of the Python Mk. I.V. The next step is to have it open its own listener and connect to Tor using the ExtORPort. This is now what's running at tor1.bamsoftware.com:9500 (173.255.221.44:9500).
I pushed a rough draft of Mk. II that does extended ORPort. The program opens an external and an internal listener in addition to starting the two subproxies. The external listener listens on the bindaddr and forwards data to the proxy chain. The proxy chain is configured to finally connect to the internal listener. The internal listener forwards data to the (extended) ORPort. A FIFO queue (implemented as a Go channel) stores the external connecting IP addresses so they can be given to the extended ORPort by the internal listener.
It seems to do the extended USERADDR thing right, at least when testing locally:
{{{
Aug 29 19:45:09.000 [debug] connection_ext_or_process_inbuf(): Got Extended ORPort data.
Aug 29 19:45:09.000 [debug] connection_ext_or_handle_cmd_useraddr(): Received USERADDR. We rewrite our address from '[scrubbed]:35967' to '[scrubbed]:38715'.
Aug 29 19:45:09.000 [debug] connection_ext_or_process_inbuf(): Got Extended ORPort data.
Aug 29 19:45:09.000 [debug] connection_ext_or_process_inbuf(): Got Extended ORPort data.
Aug 29 19:45:09.000 [debug] connection_ext_or_process_inbuf(): Received DONE.
}}}
I still want to polish and refactor the code.
What I am most concerned about is the scenario where something connects to the external listener (and an address is queued), but something goes wrong in the proxy chain (for example, the WebSocket handshake is bad) and there's no following connection to the internal listener. We will have an orphan address in the queue that will be associated with the next connection that survives to the internal listener. In the current implementation, I suppose this will eventually result in deadlock, because the size of the FIFO is limited. Perhaps we can age and expire queue entries.
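One possible shape for the age-and-expire idea, sketched in Python for brevity even though Mk. II is in Go. The TTL value and class name are arbitrary assumptions.

```python
import collections
import time

class ExpiringQueue:
    """FIFO of queued client addresses whose entries lapse after a TTL."""

    def __init__(self, ttl=30.0):
        self.ttl = ttl
        self._q = collections.deque()  # (enqueue_time, addr) pairs

    def put(self, addr, now=None):
        self._q.append((time.time() if now is None else now, addr))

    def get(self, now=None):
        # Skip over addresses that waited too long for their internal
        # connection (the orphan case described above).
        now = time.time() if now is None else now
        while self._q:
            t, addr = self._q.popleft()
            if now - t <= self.ttl:
                return addr
        return None
```

Expiry only bounds how long an orphan can poison the queue; it does not by itself guarantee that an address matches the right data.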
> I pushed a rough draft of Mk. II that does extended ORPort.
Cool, the nested transports appear in bridge-stats:
{{{
bridge-ip-transports =8,obfs3_flash=8,websocket=168
}}}
I've added a basic CLI, and also the nifty feature of auto-selecting the local listen port for flashproxy so the user doesn't have to specify it. Available on my github repo.
At the moment, one still needs to manually pick a port for the 'Bridge obfs3_flash' line in torrc, which in theory should not be necessary since it's a local port that's only used locally, but I don't know of a practical way around this. (I don't know Tor well enough.)
I would say that the client side is finished, for now at least (since there are other things to do). It's ready to be tested in The Real World.
Maybe it shouldn't bother me so much, but it's a bit inconvenient to have this ticket wait for a pyptlib API change. I haven't tried the code yet because it doesn't work with the pyptlib in Debian. Unless the new pyptlib API is very close to being merged, is there a chance we could have the client use the old API? For it to get testing, I think depending on an unmerged API change is a real impediment.
It's close to being merged, at least most likely before we're going to distribute this, so I'd rather not do extra backporting work that will be eventually-pointless. I'll re-evaluate in a few weeks if it's still not merged.
If you don't want to bother with my git repo, I've made a .deb of the new pyptlib here:
It's backwards-compatible, so it shouldn't break anything else on your system. (There is no deb source package, since we haven't yet distributed a new pyptlib release from which to make a ".orig" tarball.)
> I've added a basic CLI, and also the nifty feature of auto-selecting the local listen port for flashproxy so the user doesn't have to specify it. Available on my github repo.
Thanks, I merged it.
> At the moment, one still needs to manually pick a port for the 'Bridge obfs3_flash' line in torrc, which in theory should not be necessary since it's a local port that's only used locally, but I don't know of a practical way around this. (I don't know Tor well enough.)
I think the way to do this is to put an extra SOCKS listener in front of obfsproxy, that eats Tor's SOCKS request and rewrites it into one pointing at the local SOCKS shim that points to flashproxy-client. But leaving in a hardcoded port number seems fine for now.
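Since flashproxy-client's shim speaks SOCKS4a, the rewriting step of such a listener could look roughly like the sketch below: parse Tor's CONNECT request and re-emit it pointing at a local port. This is a hypothetical illustration only (`rewrite_socks4a_request` is not part of obfs-flash, and real code would also need to relay the SOCKS reply and the subsequent data):

```python
import struct

def rewrite_socks4a_request(request, new_host, new_port):
    """Rewrite the destination of a SOCKS4a CONNECT request so it points
    at new_host:new_port, preserving the userid field.

    SOCKS4a layout: VN(1), CD(1), DSTPORT(2), DSTIP(4),
    NUL-terminated userid, then (when DSTIP is 0.0.0.x with x != 0)
    a NUL-terminated hostname.
    """
    vn, cd, dstport = struct.unpack(">BBH", request[:4])
    if vn != 4 or cd != 1:
        raise ValueError("not a SOCKS4 CONNECT request")
    # Keep the userid; discard the original destination hostname.
    userid, _, _rest = request[8:].partition(b"\x00")
    return (struct.pack(">BBH", 4, 1, new_port)
            + b"\x00\x00\x00\x01"    # DSTIP 0.0.0.1: hostname follows
            + userid + b"\x00"
            + new_host.encode("ascii") + b"\x00")
```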
> What I am most concerned about is the scenario when something connects to the external listener (and an address is queued), but something goes wrong in the proxy chain (for example the WebSocket handshake is bad) and there's no following connection to the internal listener. We will have an orphan address in the queue that will be associated with the next connection that survives to the internal listener. In the current implementation, I suppose this will eventually result in deadlock, because the size of the FIFO is limited. Perhaps we can age and expire queue entries.
I did tests today and it is indeed trivial to wedge the server just by making 10 non-WebSocket connections to it. Here is what you see in the case of a normal obfs3|websocket connection:
{{{
2013-09-10 18:46:30 external connection from [scrubbed].
2013-09-10 18:46:30 handleExternalConnection: now 1 conns buffered.
2013-09-10 18:46:31 internal connection from 127.0.0.1:35389.
2013-09-10 18:46:31 connecting to ORPort using remote addr [scrubbed].
2013-09-10 18:46:31 handleInternalConnection: now 0 conns buffered.
}}}
Notice how an external connection leads to an internal connection that reduces the number of conns buffered. Compare this to what happens if you just connect with netcat 12 times:
{{{
2013-09-13 15:49:39 external connection from [scrubbed].
2013-09-13 15:49:39 handleExternalConnection: now 1 conns buffered.
2013-09-13 15:49:53 external connection from [scrubbed].
2013-09-13 15:49:53 handleExternalConnection: now 2 conns buffered.
2013-09-13 15:52:00 external connection from [scrubbed].
2013-09-13 15:52:00 handleExternalConnection: now 3 conns buffered.
2013-09-13 15:52:03 external connection from [scrubbed].
2013-09-13 15:52:03 handleExternalConnection: now 4 conns buffered.
2013-09-13 15:52:04 external connection from [scrubbed].
2013-09-13 15:52:04 handleExternalConnection: now 5 conns buffered.
2013-09-13 15:52:06 external connection from [scrubbed].
2013-09-13 15:52:06 handleExternalConnection: now 6 conns buffered.
2013-09-13 15:52:08 external connection from [scrubbed].
2013-09-13 15:52:08 handleExternalConnection: now 7 conns buffered.
2013-09-13 15:52:10 external connection from [scrubbed].
2013-09-13 15:52:10 handleExternalConnection: now 8 conns buffered.
2013-09-13 15:52:12 external connection from [scrubbed].
2013-09-13 15:52:12 handleExternalConnection: now 9 conns buffered.
2013-09-13 15:52:14 external connection from [scrubbed].
2013-09-13 15:52:14 handleExternalConnection: now 10 conns buffered.
2013-09-13 15:52:18 external connection from [scrubbed].
2013-09-13 15:52:25 external connection from [scrubbed].
}}}
What I propose to do about this: Let's push conns on a stack, rather than a queue. When a new external connection comes in, it is put on top of the stack. When a new internal connection comes in, it is assigned the most recently seen address, the one at the top of the stack. Zombie connections that somehow didn't survive the proxy chain to make an internal connection will remain at the bottom of the stack, and never be assigned to any ExtORPort connection (and we can just prune them if they get too old).
Using a stack rather than a queue means that we are virtually certain to invert the order of two near-simultaneous external connections. But we weren't worried about that anyway. It means that we will quickly forget about addresses that connected to our proxy chain but didn't result in an ExtORPort connection.
I believe this will be good enough to prevent incorrect ExtORPort addresses, assuming that parties are honest or naive, and not malicious. That is, it will prevent assigning the address of someone who happened to just port scan the server, except for a narrow race window. (You would have to do your port scan or netcat after a legitimate external connection, but before that connection gets all the way through the proxy chain.) It's still vulnerable to malicious actors, say someone who makes constant non-WebSocket connections in an effort to poison the metrics data. But I'm not sure we can do better until Mk. III, when we do something to always correctly match up ExtORPort information.
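The stack-with-pruning idea can be sketched as a small class. This is a minimal sketch of the proposal, assuming stale entries can simply be dropped when they age out or when the stack is full (the class name and parameters are hypothetical, not the actual server code):

```python
import time

class AddrStack:
    """LIFO stack of external-connection addresses, matched to later
    internal connections.  Zombie entries at the bottom are pruned by
    age instead of wedging the whole pipeline."""

    def __init__(self, maxsize=10, ttl=60.0, clock=time.time):
        self.entries = []  # (timestamp, addr) pairs; top of stack is the end
        self.maxsize = maxsize
        self.ttl = ttl
        self.clock = clock

    def push(self, addr):
        """A new external connection: its address goes on top."""
        self._prune()
        if len(self.entries) >= self.maxsize:
            self.entries.pop(0)  # drop the oldest entry rather than block
        self.entries.append((self.clock(), addr))

    def pop(self):
        """A new internal connection: claim the most recent address."""
        self._prune()
        if not self.entries:
            return None
        return self.entries.pop()[1]

    def _prune(self):
        """Forget entries that never produced an internal connection."""
        now = self.clock()
        self.entries = [(t, a) for (t, a) in self.entries
                        if now - t <= self.ttl]
```

Note the trade-off discussed above: near-simultaneous external connections can have their order inverted, which is acceptable here.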
I implemented the stack idea and am now running it on the relay. See comment:17 for how to see the code and run a client against it.
I did some review and polishing of the server code and I think it's ready for beta-level testing now.
I think we should move the bamsoftware.com repository to git.torproject.org and call it something other than obfs-flash. I created legacy/trac#9743 (moved) for that.
Yesterday, with the help of dcf, I managed to use obfs-flash (using instructions from comment:17) to bootstrap a Tor client (using the facilitator from comment:40:ticket:9349). Some notes from my adventures:
txsocksx is a dependency of obfs-flash. We should mention that in a README or something, and maybe add it to a setup.py.
I was getting:
{{{
Traceback (most recent call last):
  File "./obfs-flash-client", line 270, in <module>
    sys.exit(main(*sys.argv[1:]))
  File "./obfs-flash-client", line 265, in main
    obfs3_flash(reactor, client, opts.obfs_out, opts.fp_remote)
  File "./obfs-flash-client", line 217, in obfs3_flash
    ["websocket"], [fp_client, '127.0.0.1:%s' % fp_local, fp_remote])
  File "./obfs-flash-client", line 189, in pt_launch_child
    StandardIO(sub_protocol, stdin=sub_proc.stdout.fileno(), reactor=reactor)
TypeError: __init__() got an unexpected keyword argument 'reactor'
}}}
Not sure why. I just removed the , reactor=reactor part.
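The TypeError suggests an older Twisted whose StandardIO constructor does not yet accept a reactor keyword. Rather than deleting the argument, a try/except fallback would keep both Twisted versions working. A sketch of the pattern with stand-in constructors (`construct_with_reactor_fallback` and the stand-ins are hypothetical, not obfs-flash code):

```python
def construct_with_reactor_fallback(ctor, *args, **kwargs):
    """Call ctor with all keyword arguments; if it rejects them with a
    TypeError (e.g. an older Twisted StandardIO without reactor=),
    retry without the 'reactor' keyword.  Caveat: this can also mask a
    TypeError raised for unrelated reasons, so it is only a sketch."""
    try:
        return ctor(*args, **kwargs)
    except TypeError:
        kwargs.pop("reactor", None)
        return ctor(*args, **kwargs)

# Stand-ins mimicking the old and new StandardIO signatures:
def old_standard_io(protocol, stdin=0):
    return ("old", protocol, stdin)

def new_standard_io(protocol, stdin=0, reactor=None):
    return ("new", protocol, stdin, reactor)
```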
If the reactor was never started and then obfs-flash-client fails, you get errors like this:
{{{
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/usr/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "/usr/local/lib/python2.7/dist-packages/pyptlib-0.0.5-py2.7.egg/pyptlib/util/subproc.py", line 184, in killall
    cleanup()
  File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 577, in stop
    "Can't stop reactor that isn't running.")
ReactorNotRunning: Can't stop reactor that isn't running.
Error in sys.exitfunc:
Traceback (most recent call last):
  File "/usr/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
    func(*targs, **kargs)
  File "/usr/local/lib/python2.7/dist-packages/pyptlib-0.0.5-py2.7.egg/pyptlib/util/subproc.py", line 184, in killall
    cleanup()
  File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 577, in stop
    "Can't stop reactor that isn't running.")
twisted.internet.error.ReactorNotRunning: Can't stop reactor that isn't running.
}}}
I added a commit allowing obfs-flash-client to pass --transport and arbitrary arguments to flashproxy-client. You can customise the torrc based on the example given, and obfs-flash-client will now do the registration automatically!
I've found a problem with the current set-up. I'm not sure why it's happening, but it's something special about our plugin composition.
If I disconnect my browser proxy and reconnect it (by reloading the page), with flashproxy-client Tor is able to recover and I can continue browsing. However, with obfs-flash-client, Tor is unable to do this and I get a bunch of error messages like this:
{{{
Oct 11 19:32:59.000 [notice] We tried for 15 seconds to connect to '[scrubbed]' using exit X1 at A1. Retrying on a new circuit. [x4]
Oct 11 19:33:14.000 [notice] We tried for 15 seconds to connect to '[scrubbed]' using exit X2 at A2. Retrying on a new circuit. [x4]
Oct 11 19:33:29.000 [notice] We tried for 15 seconds to connect to '[scrubbed]' using exit X3 at A3. Retrying on a new circuit. [x4]
}}}
Would either of you know why this is happening? To reproduce:
After talking with Ximin, I changed the transport method names used by the client and server.
Client "obfs3_flash" changes to "obfs3_flashproxy".
Server "obfs3_flash" changes to "obfs3_websocket".
This is to match what we do without the obfs layer, with "flashproxy" on the client and "websocket" on the server. It also reads well: it looks like we just pasted together existing transport names with "_".
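A hypothetical torrc sketch of how the renamed methods might appear (the binary paths and the local port are placeholder assumptions, not tested configuration):

```
## Client side: the composed transport is now called "obfs3_flashproxy".
ClientTransportPlugin obfs3_flashproxy exec /usr/local/bin/obfs-flash-client
## 1234 is a placeholder for the manually chosen local port.
Bridge obfs3_flashproxy 127.0.0.1:1234

## Server side: the composed transport is now called "obfs3_websocket".
ServerTransportPlugin obfs3_websocket exec /usr/local/bin/obfs-flash-server
```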
Never mind, I can't close this because child tickets are still open. I'll just note that this is complete except for the specific issues in the child tickets.