This is a subtask of the larger problem in #25882 (moved).
A v2 client reports intro point failures within circuit_about_to_free(), but a v3 client does not.
In fact, no client-side HS circuit is looked at. hs_circ_cleanup() was intended as the entry point into the HS subsystem for this, but only the service uses it.
Intro circuit failure needs to be noted in the failure cache (hs_cache_client_intro_state_note()).
Rendezvous circuits also need to be handled somehow. If the RP circuit keeps closing on us, we might want to stop trying at some point.
The same goes for HSDir circuits: if they close, the client needs to be notified and launch a refetch.
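To make the three cases above concrete, here is a rough, self-contained sketch of what a client-side cleanup dispatch could look like. It is not tor code: every `sketch_*` type, function, and constant is a hypothetical stand-in, the failure cache is reduced to a counter array, and the rendezvous retry limit of 3 is an arbitrary assumption. The real entry point would presumably hang off hs_circ_cleanup() and use hs_cache_client_intro_state_note() for the intro case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins. None of these are real tor types. */
typedef enum {
  SKETCH_PURPOSE_C_INTRODUCING, /* client intro circuit */
  SKETCH_PURPOSE_C_REND,        /* client rendezvous circuit */
  SKETCH_PURPOSE_C_HSDIR_GET,   /* client descriptor fetch circuit */
  SKETCH_PURPOSE_OTHER
} sketch_purpose_t;

typedef struct {
  sketch_purpose_t purpose;
  int intro_point_id; /* only meaningful for intro circuits */
} sketch_circuit_t;

#define SKETCH_MAX_INTRO 8         /* assumed size of the toy failure cache */
#define SKETCH_MAX_REND_FAILURES 3 /* assumed retry limit, not from tor */

typedef struct {
  unsigned intro_failures[SKETCH_MAX_INTRO]; /* toy failure cache */
  unsigned rend_failures;
  int refetch_requested;
} sketch_client_state_t;

/* Stand-in for noting an intro point in the client failure cache. */
static void
sketch_note_intro_failure(sketch_client_state_t *st, int ip)
{
  if (ip >= 0 && ip < SKETCH_MAX_INTRO)
    st->intro_failures[ip]++;
}

/* Client-side cleanup: dispatch on the purpose of the closing circuit. */
static void
sketch_client_circ_cleanup(sketch_client_state_t *st,
                           const sketch_circuit_t *circ)
{
  assert(st != NULL && circ != NULL);
  switch (circ->purpose) {
  case SKETCH_PURPOSE_C_INTRODUCING:
    sketch_note_intro_failure(st, circ->intro_point_id);
    break;
  case SKETCH_PURPOSE_C_REND:
    st->rend_failures++;
    break;
  case SKETCH_PURPOSE_C_HSDIR_GET:
    st->refetch_requested = 1; /* tell the client to launch a refetch */
    break;
  default:
    break; /* not an HS client circuit, nothing to do */
  }
}

/* An intro point with a recorded failure should be skipped on retry. */
static int
sketch_intro_point_usable(const sketch_client_state_t *st, int ip)
{
  return ip >= 0 && ip < SKETCH_MAX_INTRO && st->intro_failures[ip] == 0;
}

/* Stop retrying once the RP circuit has closed on us too many times. */
static int
sketch_should_retry_rend(const sketch_client_state_t *st)
{
  return st->rend_failures < SKETCH_MAX_REND_FAILURES;
}
```

The sketch ignores the close/free/repurpose distinction entirely; it only shows the per-purpose bookkeeping the client would need once it is told a circuit went away.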
Related is #26806 (moved) which mentions that possibly because the HSv3 client is not noticing the introduction timeout (as in the ACK never came back), we resend onto that same intro point. Good or bad?
@asn: The initial commits are mostly moving and refactoring code on how we handle the circuit cleanup. It is not that simple so let me know if the approach is sensible. But overall, there needs to be nuanced cleanup between close/free/repurpose.
Related is #26806 (moved) which mentions that possibly because the HSv3 client is not noticing the introduction timeout (as in the ACK never came back), we resend onto that same intro point. Good or bad?
Hmm, questions and answers:
Why doesn't the ACK or NACK come to the client? Is it because the intro point never sent it (why?)? Or because we timeout before receiving it? Or just general Tor network SNAFU?
If the above happens, why would the client decide to resend on the same intro point and same circuit? Is this an explicit decision?
Regarding "Good or bad?" I would say it's bad-ish because if the NACK never came back, I would prefer to retry a different intro point since that one might be suffering networking issues, or being overloaded, or downright maliciously DoSing the service.
PS: #26806 (moved) mentions "rendezvous circuits" in the title, but I think it should be intro circuits
Related is #26806 (moved) which mentions that possibly because the HSv3 client is not noticing the introduction timeout (as in the ACK never came back), we resend onto that same intro point. Good or bad?
Hmm, questions and answers:
Why doesn't the ACK or NACK come to the client? Is it because the intro point never sent it (why?)? Or because we timeout before receiving it? Or just general Tor network SNAFU?
SNAFU is probably the answer. Circuit collapsing, timing out, etc...
If the above happens, why would the client decide to resend on the same intro point and same circuit? Is this an explicit decision?
In theory, it doesn't. Depending on the SNAFU, we either flag the intro point in the failure cache (see the patch I did) or we carry on, possibly retrying with a new one.
Regarding "Good or bad?" I would say it's bad-ish because if the NACK never came back, I would prefer to retry a different intro point since that one might be suffering networking issues, or being overloaded, or downright maliciously DoSing the service.
Yes, in theory, that is what is supposed to happen. The patch I did would fix this: it notes the intro point in the failure cache.
PS: #26806 (moved) mentions "rendezvous circuits" in the title, but I think it should be intro circuits
I squashed the latest branch, removed the reverted commits, fixed a typo (wrods -> words), and improved the changes file a bit (to clarify that the reachability delta is positive).
I made a PR just to make sure that CI passes. I will merge when it finishes.