Two channels required for bootstrap
I noticed a change in behavior in
cb62a0b69a7d67b427224ca4c3075b49853a3a1f or thereabouts. tor opens a new SOCKS connection to a client transport plugin while bootstrapping at 50% (if descriptors are not cached) or at 85% (if descriptors are not cached). The upshot is that the flash proxy transport, for which new connections are not free, needs two connected browser proxies in order to bootstrap. Earlier revisions permit bootstrapping and circuit creation with just one connection to the proxy.
I start an external proxy thus:
$ ./flashproxy-client --external --unsafe-logging 2012-12-03 09:02:05 Listening remote on 0.0.0.0:9000. 2012-12-03 09:02:05 Listening remote on [::]:9000. 2012-12-03 09:02:05 Listening local on 127.0.0.1:9001. 2012-12-03 09:02:05 Listening local on [::1]:9001.
The "remote" listener is waiting for WebSocket connections from a browser. The "local" listener is waiting for SOCKS connections from Tor. Then I start Tor to use the proxy:
$ ./src/or/tor ClientTransportPlugin "websocket socks4 127.0.0.1:9001" UseBridges 1 Bridge "websocket 0.0.1.0:1" LearnCircuitBuildTimeout 0 CircuitBuildTimeout 60 Log "info stderr" ... Dec 03 09:02:13.000 [notice] Bootstrapped 10%: Finishing handshake with directory server.
flashproxy-client notices Tor's pending SOCKS connection:
2012-12-03 09:02:13 Local connection from 127.0.0.1:55421. 2012-12-03 09:02:13 SOCKS request from 127.0.0.1:55421. 2012-12-03 09:02:13 Got SOCKS request for 0.0.1.0:1. 2012-12-03 09:02:13 locals (1): [u'127.0.0.1:55421'] 2012-12-03 09:02:13 remotes (0):  2012-12-03 09:02:13 Data from unlinked local 127.0.0.1:55421 (217 bytes). 2012-12-03 09:02:13 locals (1): [u'127.0.0.1:55421'] 2012-12-03 09:02:13 remotes (0): 
Then I open a browser to make a single WebSocket connection which will appear as one of the pluggable transport's "remotes".
flashproxy-client sees the new remote and starts proxying data.
2012-12-03 09:02:17 Remote connection from 127.0.0.1:51321. 2012-12-03 09:02:17 Data from WebSocket-pending 127.0.0.1:51321. 2012-12-03 09:02:17 locals (1): [u'127.0.0.1:55421'] 2012-12-03 09:02:17 remotes (1): [u'127.0.0.1:51321'] 2012-12-03 09:02:17 Linking 127.0.0.1:55421 and 127.0.0.1:51321.
Now bootstrapping continues (over the WebSocket channel) until reaching 85%, and then it says "connections all too old, or too non-canonical" and makes a new SOCKS request:
Dec 03 09:02:18.000 [notice] new bridge descriptor '3VXRyxz67OeRoqHn' (fresh): $7C03D29CA58BE8EED5F483F31A2DEDF6FD7CC444~3VXRyxz67OeRoqHn at 0.0.1.0 Dec 03 09:02:18.000 [notice] We now have enough directory information to build circuits. Dec 03 09:02:18.000 [notice] Bootstrapped 80%: Connecting to the Tor network. ... Dec 03 09:02:18.000 [info] circuit_handle_first_hop(): Next router is [scrubbed]: Connections all too old, or too non-canonical. Launching a new one. Dec 03 09:02:18.000 [notice] Bootstrapped 85%: Finishing handshake with first hop. Dec 03 09:02:18.000 [info] connection_read_proxy_handshake(): Proxy Client: connection to 0.0.1.0:1 successful
flashproxy-client sees the SOCKS request, but because there are no more browser connections forthcoming, everything stalls at this point.
2012-12-03 09:02:18 Local connection from 127.0.0.1:55427. 2012-12-03 09:02:18 SOCKS request from 127.0.0.1:55427. 2012-12-03 09:02:18 Got SOCKS request for 0.0.1.0:1. 2012-12-03 09:02:18 locals (2): [u'127.0.0.1:55421', u'127.0.0.1:55427'] 2012-12-03 09:02:18 remotes (1): [u'127.0.0.1:51321'] 2012-12-03 09:02:18 Data from unlinked local 127.0.0.1:55427 (231 bytes).
I've verified this failure to bootstrap with recent 190c1d49. After bisecting, I think the last commit with the old behavior (bootstrapped 100%) was 751b3aab. There were some compile errors during bisecting so I couldn't narrow it to a specific revision; git reports
There are only 'skip'ped commits left to test. The first bad commit could be any of: 35924435d22c2469ecbe06156c8069a928859d63 e136f7ccb4e671e33b6c92a48df819082291f5c1 4768c0efe3e9471cc367c3740d1a4ba0ab79626c 6cce6241dd4405f6cf21057f9913e07633cf18bb 519c971f6a3b89f1e81cda3c0290d4d943ec0d78 77dac97354974e8a819d8e35ad4e7a76199999b4 32337502f11e6c84e4db8591f5f81c4fc6d1da58 8b14db9628f0e8982e894034e86c8efdd78cff32 15303c32ec9d84aff8de5ed9df28e779c36c6e5c 28f108bcceab59fcf9f27e33065f64bfdb0f159a 7f952da55334d3a3693d1c6e8531fd96730265db f0f87cb68a22feaf552a18b521d3313d843f8793 838743654c1bed2bfe22789ff53a1993c005f176 9ad7ba9f2267a9ee34fafda9356f1fa86633f00f cb62a0b69a7d67b427224ca4c3075b49853a3a1f We cannot bisect more!
Based on the log, cb62a0b6 seems a likely cause of the change: "Use channel_is_bad_for_new_circs(), connection_or_get_num_circs() in main.c".
I thought I would be able to reproduce this with another transport or with a simple SOCKS proxy, showing two connections where there used to be one, but I can't. I see two connections even with the old code. For example with an ssh SOCKS proxy (
ssh -v -D 9001 -N localhost):
debug1: Connection to port 9001 forwarding to socks port 0 requested. debug1: channel 2: new [dynamic-tcpip] debug1: Connection to port 9001 forwarding to socks port 0 requested. debug1: channel 3: new [dynamic-tcpip]
I guess that the difference is that previously, the second connection happens after bootstrapping is complete, while now it happens at 85%. (That is only a guess, I haven't verified it.)
Ticket description from https://lists.torproject.org/pipermail/tor-dev/2012-December/004221.html. nickm suggests log messages to test whether "is bad for new circs" is the reason the first channel isn't getting reused: https://lists.torproject.org/pipermail/tor-dev/2012-December/004246.html.