hs-v3: Service circuit retry limit should not close a valid circuit
Context: Lets say a service has 3 intro circuits opened and stable.
Now, imagine one circuit collapses, like for instance the intro point restarted "tor" after an update. Our code is designed to retry 3 times that is once every second and then give up on the intro point.
What it looks like:
run_build_circuit_event() is run and launches intro circuit(s) if we are missing any. For each IP, it will increment the
circuit_retries counter but does not actually look at it to decide to launch or not.
Before that event, also every 1 second,
cleanup_intro_points() checks that every intro point has not expired, fell off the consensus or that
circuit_retries is greater than (>)
MAX_INTRO_POINT_CIRCUIT_RETRIES = 3.
Putting this together, imagine now that the first 3 attempts failed for whatever reasons and then we launch a 4th one because
circuit_retries = 3, it does pass validation of
> MAX_INTRO_POINT_CIRCUIT_RETRIES so then a circuit is launched and that very one succeeds.
circuit_retries is now 4 then the next second,
cleanup_intro_points() removes the IP and closes the valid open established circuit...
I've observed the above a surprising amount of time because booting a tor relay takes some seconds mostly due to the diff-cache parsing.
In a nutshell, we should NOT launch a circuit if we reached the max retries for an intro point. This back and forth of opening a circuit and then deciding that we went over the limit makes no sense in the first place.