Since #3455 (moved), we use SOCKS auth isolation in Tor Browser to separate URL bar domains to different tor circuits. When the user browses to a new URL bar domain, a new SOCKS connection is opened with a SOCKS username/password unique to the site's domain.
By telnetting to the tor control port, I observed that immediately after I entered a new URL bar domain in a Tor Browser tab, a new circuit was built and assigned the SOCKS_USERNAME and SOCKS_PASSWORD for that URL bar domain.
It seems there would be better performance if we could use an existing, pre-built (yet-unused) circuit when a new SOCKS connection opens, and assign the SOCKS_USERNAME and SOCKS_PASSWORD to the pre-built circuit. That way the user wouldn't have to wait for a circuit to be established after requesting a new website.
I don't know yet whether this is something that can be adjusted by config settings or if we would need to patch tor somehow.
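To make the per-domain isolation described above concrete, here is a minimal Python sketch of how a client could derive SOCKS credentials from the URL bar domain and encode them as a SOCKS5 username/password message (RFC 1929). The function names are hypothetical, and using the domain as the username with a fixed password of "0" is an assumption for illustration, not a description of Tor Browser's actual code.

```python
from __future__ import annotations


def credentials_for(url_bar_domain: str) -> tuple[str, str]:
    """Return a (username, password) pair that differs per URL bar domain.

    Assumption: the domain itself is the username and the password is a
    fixed placeholder; the real browser may choose these differently.
    """
    return url_bar_domain, "0"


def socks5_auth_request(username: str, password: str) -> bytes:
    """Encode an RFC 1929 username/password sub-negotiation message."""
    user = username.encode("utf-8")
    pw = password.encode("utf-8")
    if not (1 <= len(user) <= 255 and 1 <= len(pw) <= 255):
        raise ValueError("username and password must each be 1..255 bytes")
    # Layout: VER (0x01) | ULEN | UNAME | PLEN | PASSWD
    return bytes([0x01, len(user)]) + user + bytes([len(pw)]) + pw


if __name__ == "__main__":
    for domain in ("example.com", "arthuredelstein.github.io"):
        message = socks5_auth_request(*credentials_for(domain))
        print(domain, message.hex())
```

Because tor isolates streams by their SOCKS credentials, two domains that produce different messages here would not share a circuit.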
Trac: Summary changed from "A new SOCKS connection should use a pre-built circuit for its stream" to "A new SOCKS connection should use a pre-built circuit for its first stream"
Arthur: can you attach an event log (setevents circ stream) here to show the problem?
That way when other people here try to reproduce it, they'll know whether they're seeing what you see or not.
Incidentally, it is expected behavior that Tor tries to launch a circuit right around the time you make your stream request. That is because Tor should attach the new stream to the preemptively built circuit, and then very soon after it will notice that it doesn't have enough preemptively built circuits hanging around anymore, and launch a new one.
So one of the first debugging steps is to make sure you're not seeing and misinterpreting that behavior.
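The expected behavior described above (attach the stream to a preemptively built circuit, then launch a replacement) can be sketched as a toy model in Python. This is not tor's code: the class, the `target_clean` value, and the log strings are all made up for illustration, and circuit build time is ignored.

```python
from itertools import count


class CircuitPool:
    """Toy model of preemptive circuits; build time is ignored for simplicity."""

    def __init__(self, target_clean=3):
        self.target_clean = target_clean  # hypothetical number of spare circuits
        self.clean = []                   # built but unused circuit ids
        self.dirty = {}                   # circuit id -> isolation key in use
        self.log = []                     # made-up log lines, not tor's event format
        self._ids = count(1)

    def launch(self):
        circ_id = next(self._ids)
        self.clean.append(circ_id)
        self.log.append(f"CIRC {circ_id} LAUNCHED")
        return circ_id

    def replenish(self):
        # Top the pool back up to the target number of clean circuits.
        while len(self.clean) < self.target_clean:
            self.launch()

    def attach_stream(self, isolation_key):
        # Prefer a pre-built circuit; only launch on demand if none is left.
        prebuilt = bool(self.clean)
        if not prebuilt:
            self.launch()
        circ_id = self.clean.pop(0)
        self.dirty[circ_id] = isolation_key
        self.log.append(f"STREAM {isolation_key} ATTACHED {circ_id}")
        # The pool is now one short, so a new circuit launch follows right away.
        self.replenish()
        return circ_id, prebuilt


if __name__ == "__main__":
    pool = CircuitPool(target_clean=2)
    pool.replenish()
    pool.attach_stream("example.com")
    print("\n".join(pool.log))
```

In this model a circuit launch always appears right after the stream is attached, even though the stream itself went onto an older circuit. That is the pattern to rule out before concluding that a fresh circuit was built for the request.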
Finally I inspected the list of open circuits again.
getinfo circuit-status
250+circuit-status=
1 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings BUILD_FLAGS=ONEHOP_TUNNEL,IS_INTERNAL,NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:03.769809
2 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings,$A47CF4F0B9AFD005C0A5B67A503158923202BE90~dragon1993,$EC116BCB80565A408CE67F8EC3FE3B0B02C3A065~orion BUILD_FLAGS=NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:04.809515 SOCKS_USERNAME="--unknown--" SOCKS_PASSWORD="0"
3 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings,$EF29F69F49FF1C17CBDCFF8E11E10CD9F6B9DF95~raincat,$6207FC9DDE4EC78F45BB24C53C2EEE63DCC2E2B6~PrivacyRepublic BUILD_FLAGS=NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:05.790793 SOCKS_USERNAME="arthuredelstein.github.io" SOCKS_PASSWORD="0"
4 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings,$385FE4D32D51C03766F3FBD24B108D6FB82C9E36~CDR,$2CF5110C8F31F45737F807F080ABA90A7AE50781~dragon1993 BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY,NEED_UPTIME PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:06.789676
5 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings,$F60E3B747FE017346005243058FFFE72AA7D59A2~torpoc666,$5937800DEB5B3FDF68BD3392F2AE3A084254FE5F~tor6kryptonit BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY,NEED_UPTIME PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:07.790465
6 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings,$408807EE2ED6C87F921139ABE0C07E84203A8621~MysticOrb,$B0964415A5380080570845E7CBFCADF87FDCCE5A~Necto7 BUILD_FLAGS=NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:55.108644 SOCKS_USERNAME="example.com" SOCKS_PASSWORD="0"
7 BUILT $50586E25BE067FD1F739998550EDDCB1A14CA5B2~Jans BUILD_FLAGS=ONEHOP_TUNNEL,IS_INTERNAL,NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:44:05.795443
.
250 OK
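For anyone trying to reproduce this, a small Python sketch like the following can pull the interesting fields out of a `getinfo circuit-status` reply and show which circuit each SOCKS username ended up on. The helper names are hypothetical, the sample contains circuits 1, 3 and 6 copied from the reply above, and the parser only assumes the "id, status, path, KEY=VALUE" layout seen there.

```python
import shlex

# Circuits 1, 3 and 6 copied from the control port reply in this ticket.
SAMPLE = '''250+circuit-status=
1 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings BUILD_FLAGS=ONEHOP_TUNNEL,IS_INTERNAL,NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:03.769809
3 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings,$EF29F69F49FF1C17CBDCFF8E11E10CD9F6B9DF95~raincat,$6207FC9DDE4EC78F45BB24C53C2EEE63DCC2E2B6~PrivacyRepublic BUILD_FLAGS=NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:05.790793 SOCKS_USERNAME="arthuredelstein.github.io" SOCKS_PASSWORD="0"
6 BUILT $F1A7CE1B1D558DC24E39B6B30FF217ECCEECF141~Kings,$408807EE2ED6C87F921139ABE0C07E84203A8621~MysticOrb,$B0964415A5380080570845E7CBFCADF87FDCCE5A~Necto7 BUILD_FLAGS=NEED_CAPACITY PURPOSE=GENERAL TIME_CREATED=2016-04-21T17:43:55.108644 SOCKS_USERNAME="example.com" SOCKS_PASSWORD="0"
.
250 OK
'''

CIRCUIT_STATES = {"LAUNCHED", "BUILT", "GUARD_WAIT", "EXTENDED", "FAILED", "CLOSED"}
PREFIXES = ("250+circuit-status=", "250-circuit-status=")


def parse_circuit_status(reply):
    """Parse a circuit-status reply into a list of dicts, one per circuit."""
    circuits = []
    for raw in reply.splitlines():
        line = raw.strip()
        if line == ".":  # end of the multi-line data block
            break
        for prefix in PREFIXES:
            if line.startswith(prefix):
                line = line[len(prefix):]
        tokens = shlex.split(line)  # also strips the quotes around values
        if len(tokens) < 2 or not tokens[0].isdigit():
            continue
        if tokens[1] not in CIRCUIT_STATES:  # skips status lines such as "250 OK"
            continue
        circ = {"id": int(tokens[0]), "status": tokens[1], "path": [], "fields": {}}
        for tok in tokens[2:]:
            if tok.startswith("$"):
                circ["path"] = tok.split(",")
            elif "=" in tok:
                key, _, value = tok.partition("=")
                circ["fields"][key] = value
        circuits.append(circ)
    return circuits


def circuits_by_site(circuits):
    """Map each SOCKS_USERNAME to the id of the circuit carrying it."""
    return {
        c["fields"]["SOCKS_USERNAME"]: c["id"]
        for c in circuits
        if "SOCKS_USERNAME" in c["fields"]
    }


if __name__ == "__main__":
    print(circuits_by_site(parse_circuit_status(SAMPLE)))
```

Comparing each circuit's TIME_CREATED with the time of the request is then enough to tell a pre-built circuit from one launched on demand.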
So I saw two behaviors:
1. arthuredelstein.github.io used a pre-built circuit (3)
2. example.com did not use a pre-built circuit, but instead a new circuit (6) was launched
I tried browsing to a series of new domains, and my observation is that the majority of the time Behavior 2 happens. (Behavior 1 is of course better for performance.) I guess this could be because the rate of production of pre-built circuits is slower than the rate at which I was browsing to new sites in my tests. So I guess my question is, what determines the rate at which pre-built circuits are built, and could we make a performance improvement if we increased that rate?
Setting Owner temporarily to nickm, because, in general, tickets like this (#18122 (moved), #18138 (moved), #18229 (moved), #18310 (moved)) are about Integration (e.g. between Tor & TB), and only Nick has the architect "ability" to organize this process. (Currently there isn't even such an option in Component on Trac, and teams prefer to ping-pong such tickets to each other with no progress.)
Removing myself as owner. I can't do anything here until we have an analysis of what to actually try.
what determines the rate at which pre-built circuits are built, and could we make a performance improvement if we increased that rate?
The code is in circuit_predict_and_launch_new, which is called once per second. We could ramp that up, but it's not a super-cheap function. We could also schedule it to get invoked shortly after the first time any clean circuit becomes dirty.
Also, circuit_predict_and_launch_new seems to have quite a few magic numbers in it to determine the number of circuits to have for hidden services. Adjusting those might get better results.
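The two scheduling options mentioned above (a once-per-second check versus a check triggered shortly after a clean circuit becomes dirty) can be compared with a small Python sketch. The function names, the 0.05 s trigger delay, and the 1.5 s build time are all hypothetical values chosen for illustration; none of them come from tor.

```python
import math


def launch_time_polling(dirty_at, period=1.0):
    """Time of the first periodic check at or after a circuit became dirty."""
    return math.ceil(dirty_at / period) * period


def launch_time_triggered(dirty_at, delay=0.05):
    """Time of a check scheduled shortly after a circuit became dirty."""
    return dirty_at + delay


def replacement_ready(next_request_at, launch_at, build_time):
    """True if a replacement launched at launch_at is built before the next request."""
    return launch_at + build_time <= next_request_at


if __name__ == "__main__":
    dirty_at, next_request_at, build_time = 10.2, 12.0, 1.5
    for name, launch_at in (
        ("polling", launch_time_polling(dirty_at)),
        ("triggered", launch_time_triggered(dirty_at)),
    ):
        ready = replacement_ready(next_request_at, launch_at, build_time)
        print(f"{name}: launch at {launch_at:.2f}s, ready in time: {ready}")
```

Under these assumptions the periodic check adds at most one period of delay, so the choice only matters when the next request arrives within roughly one build time of the previous one.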
Thanks for pointing out this function. I'll try logging from that function and see if I can understand what is happening.
I doubt that increasing the rate of circuit_predict_and_launch_new calls to more than once per second will help, as I wasn't browsing to new sites more than once every several seconds. I think the issue is more likely related to the algorithm in that function that decides whether a new circuit is needed.
I haven't yet -- I was investigating this issue further by using the "ts" command to get timing and realized that the worst bottleneck seems to be not this issue, but #21394 (moved). But I will have another look at this one as well.
How far do you think it makes sense to backport this?
What if we allowed circuits with isolation flags set, so long as the isolation flags were compatible with the stream we wanted to attach? In other words,
(Other code review notes, which I'm totally willing to clean up if you don't feel like it: We make a changes file for every user-visible change in Tor, and we try to base backportable changes on the oldest maint-* branch that they should be backported to.)
How far do you think it makes sense to backport this?
Tor Browser stable is currently using 0.3.1.x, so I think that would be useful.
What if we allowed circuits with isolation flags set, so long as the isolation flags were compatible with the stream we wanted to attach? In other words,
Thanks for the suggestion. I added that check, plus a NULL check, because conn is set to NULL when the code path looks like:
#0 circuit_stream_is_being_handled (conn=0x0, port=443, min=2) at src/or/circuituse.c:993
#1 0x00000453de87e352 in circuit_remove_handled_ports (needed_ports=0x453e056b6e0) at src/or/circuituse.c:974
#2 0x00000453de867cfb in circuit_get_unhandled_ports (now=1512599788) at src/or/circuitbuild.c:1674
#3 0x00000453de867d44 in circuit_all_predicted_ports_handled (now=1512599788, need_uptime=0x7ffe6698b060, need_capacity=0x7ffe6698b064) at src/or/circuitbuild.c:1690
#4 0x00000453de87e7cf in needs_exit_circuits (now=1512599788, needs_uptime=0x7ffe6698b060, needs_capacity=0x7ffe6698b064) at src/or/circuituse.c:1082
#5 0x00000453de87ea8c in circuit_predict_and_launch_new () at src/or/circuituse.c:1183
#6 0x00000453de87ec98 in circuit_build_needed_circs (now=1512599788) at src/or/circuituse.c:1268
#7 0x00000453de7b0505 in run_scheduled_events (now=1512599788) at src/or/main.c:1443
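As a rough illustration of the check discussed above (accept a circuit whose isolation values are compatible with the stream, and handle the case where no stream is given), here is a simplified Python model. It is only one plausible reading of the idea, not the actual patch: the types and function names are hypothetical, `None` stands in for a NULL conn, and ports and exit policies are ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Isolation:
    socks_username: str
    socks_password: str


@dataclass
class Circuit:
    built: bool
    isolation: Optional[Isolation] = None  # None: no isolation values set yet


@dataclass
class Stream:
    isolation: Isolation


def circuit_can_handle(circ: Circuit, conn: Optional[Stream]) -> bool:
    """Decide whether circ counts as able to carry conn in this toy model."""
    if not circ.built:
        return False
    if circ.isolation is None:
        return True  # a clean circuit can take any stream
    if conn is None:
        return False  # no stream to compare against, so skip isolated circuits
    return circ.isolation == conn.isolation


def stream_is_being_handled(circuits, conn: Optional[Stream], min_count: int) -> bool:
    """True if at least min_count circuits could carry conn."""
    usable = sum(1 for circ in circuits if circuit_can_handle(circ, conn))
    return usable >= min_count


if __name__ == "__main__":
    site = Isolation("example.com", "0")
    circuits = [Circuit(built=True), Circuit(built=True, isolation=site)]
    print(stream_is_being_handled(circuits, None, 2))          # predicted-port path
    print(stream_is_being_handled(circuits, Stream(site), 2))  # concrete stream
```

In this model the prediction path (no stream) counts only clean circuits, while a concrete stream can also count circuits already isolated to the same credentials.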
(Other code review notes, which I'm totally willing to clean up if you don't feel like it: We make a changes file for every user-visible change in Tor, and we try to base backportable changes on the oldest maint-* branch that they should be backported to.)
(I had accidentally committed an earlier version of this along with my 046acf208bc53a3fa7ea9 commit, and just reverted it with c2c0f83c23986344c4f3fab03)
Thanks! I've re-done the changes file a bit to make it pass lintChanges, to make it "major" per arma's request, and to try to explain the problem from a user's POV. It's now a branch arthuredelstein_18859+1_031 which I'm merging into 0.3.2 and later. If it doesn't cause trouble, it should get backported.
I am okay with a backport here because the patch is fairly simple. But is it worthwhile when it's just a performance problem, and when TB users typically track the latest stable rather than older releases?