MAX_REND_FAILURES is 1, but we will try three times
Off-by-2 error counting to MAX_REND_FAILURES for onion services:
In can_relaunch_service_rendezvous_point(), we check
    if (circ->build_state->failure_count > MAX_REND_FAILURES ||
        circ->build_state->expiry_time <= time(NULL)) {
to decide whether to abort the relaunch.
But failure_count is only incremented in the relaunch itself, i.e. after this check has run. Also, the comparison uses ">" rather than ">=".
Yet the definition of MAX_REND_FAILURES is
    /** How many times will a hidden service operator attempt to connect to a
     * requested rendezvous point before giving up? */
    #define MAX_REND_FAILURES 1
So when the first attempt fails, failure_count is 0, which is not > 1, so we try again. When the second attempt fails, failure_count is 1, which is also not > 1. It's only after the third attempt fails that we decide that MAX_REND_FAILURES has been reached.
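The counting above can be reproduced with a small standalone model. This is only a sketch, not Tor code: count_attempts() and its two flags are hypothetical names made up for illustration, and it assumes every rendezvous attempt fails and the expiry time is never reached. The flags switch the two contributing factors (the ">" comparison and the increment happening after the check) on and off independently.

```c
/* Same value as in the Tor source quoted above. */
#define MAX_REND_FAILURES 1

/* Hypothetical model, not a Tor function: returns how many rendezvous
 * attempts are made in total when every attempt fails.
 *   use_ge             - compare with ">=" instead of ">"
 *   count_before_check - count the current failure before the check,
 *                        instead of incrementing later in the relaunch */
static int count_attempts(int use_ge, int count_before_check)
{
  int failure_count = 0;
  int attempts = 0;

  for (;;) {
    attempts++; /* this attempt is made, and it fails */

    /* Value the check sees when deciding whether to relaunch. */
    int seen = failure_count + (count_before_check ? 1 : 0);
    int give_up = use_ge ? (seen >= MAX_REND_FAILURES)
                         : (seen > MAX_REND_FAILURES);
    if (give_up)
      break;

    failure_count++; /* incremented as part of the relaunch */
  }
  return attempts;
}
```

In this model, count_attempts(0, 0) corresponds to the current behaviour and returns 3. Changing only one of the two factors gives 2, and changing both gives 1, which is where the "off-by-2" comes from. Whether the real fix should change the check, the increment, or the constant is a separate question that this sketch does not answer.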
This bug affects legacy onion services, and it will also affect nextgen onion services once legacy/trac#24894 is fixed.