Circuit creation loop when primary guards are unreachable
I was offline for a few hours today while Tor was running. When I came back online, I noticed that Tor was stuck in a circuit creation loop, which it did not exit until it marked one of its primary guards as retriable (which can take a long time). While in the loop, Tor made one circuit per second.
I spent a good part of today debugging this. I think the issue is that our guard algorithm moves circuits that don't use primary guards into a waiting state (shown in the logs below as "waiting to see how other guards perform") in circuit_build_no_more_hops(). Then in circuit_expire_building() we treat those waiting circuits as not CIRCUIT_STATE_OPEN and expire them quickly with the 2s build timeout. Then we make more circuits and expire them too, ad infinitum, until one of the primary guards becomes retriable and breaks the cycle.
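To make the suspected interaction concrete, here is a minimal sketch of the expiry decision as I understand it. It is not Tor's actual code: the sketch_* types, state names, and function are hypothetical stand-ins, and the only assumption it encodes is that anything not in the open state is subject to the build timeout.

```c
#include <stdbool.h>

/* Hypothetical, simplified circuit states; these are NOT Tor's real definitions. */
typedef enum {
  SKETCH_STATE_BUILDING,   /* still extending hops */
  SKETCH_STATE_GUARD_WAIT, /* built, "waiting to see how other guards perform" */
  SKETCH_STATE_OPEN        /* built and usable */
} sketch_state_t;

typedef struct {
  sketch_state_t state;
  int created_at; /* creation time, in seconds */
} sketch_circuit_t;

/* Assumed model of the suspected check: every circuit that is not OPEN is
 * treated as still building, so a fully built circuit that is waiting on
 * guard selection is expired once the build timeout elapses. */
static bool
sketch_should_expire(const sketch_circuit_t *circ, int now, int timeout_s)
{
  if (circ->state == SKETCH_STATE_OPEN)
    return false; /* open circuits are exempt from the build timeout */
  return (now - circ->created_at) >= timeout_s;
}
```

With the timestamps from the logs below (circuit created at second 21, checked at second 23, 2s timeout), a circuit in the waiting state is expired under this model even though it finished building, while an open circuit of the same age is not.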
Here is the loop:
Tor thinks it needs a pre-emptive circuit:
Apr 11 14:47:21.000 [info] circuit_build_times_set_timeout(): Set circuit build timeout to 2s (1500.000000ms, 60000.000000ms, Xm: 525, a: 2.177536, r: 0.121588) based on 403 circuit times
Apr 11 14:47:21.000 [info] circuit_predict_and_launch_new(): Have 4 clean circs (3 internal), need another exit circ.
Apr 11 14:47:21.000 [info] origin_circuit_new(): Circuit 139 chose an idle timeout of 2967 based on 2875 seconds of predictive building remaining.
Tor picks guard, picks timeouts and connects to it:
Apr 11 14:47:21.000 [warn] No primary guards available. Selected confirmed guard ENiGMA ($42B4F52C5B11E4D39855F654955425B0D5A0598B) for circuit. Will try other guards before using this circuit.
Apr 11 14:47:22.000 [warn] Recorded success for confirmed guard ENiGMA ($42B4F52C5B11E4D39855F654955425B0D5A0598B)
Apr 11 14:47:22.000 [info] circuit_build_no_more_hops(): circuit built!
Tor counts the circuit as timed out in circuit_expire_building() and starts making a new predictive circuit (loop!):
Apr 11 14:47:23.000 [info] circuit_expire_building(): Deciding to count the timeout for circuit 139
Apr 11 14:47:23.000 [info] circuit_predict_and_launch_new(): Have 4 clean circs (3 internal), need another exit circ.
After a minute, Tor finally ditches the circuit, which by then has been repurposed (purpose 14 in the log):
Apr 11 14:48:22.000 [info] circuit_expire_building(): Deciding to count the timeout for circuit 139
Apr 11 14:48:22.000 [info] circuit_expire_building(): Abandoning circ 139 220.127.116.11:443:2179853168 (state 0,3:waiting to see how other guards perform, purpose 14, len 3)
Apr 11 14:48:22.000 [info] pathbias_check_close(): Circuit 139 remote-closed without successful use for reason -3. Circuit purpose 14 currently 0,waiting to see how other guards perform. Len 3.