Mixing long-lived streams with short-lived connections causes tor hidden service name (.onion) resolution/routing failures
With a single tor instance using default settings and a single client address, long-lived streams to some .onion addresses seem to eventually break name resolution/routing for short-lived connections to other .onion addresses.
For example:
- Copy a large file over ssh from the client to servers a.onion, b.onion, and c.onion (simultaneously, for several hours).
- Connect to servers d.onion, e.onion, and f.onion for a few commands, then disconnect. Repeat this several times while the long-running streams are active.
Eventually and intermittently, connections to d.onion, e.onion, f.onion fail on the client-side only.
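The short-lived connections in the example above could be automated with a small probe, so that the time of the first failure gets logged. This is only a rough sketch, not something used for the original tests: it assumes tor listens on 127.0.0.1:9050 (the default SocksPort of a standalone tor), speaks plain SOCKS5 with no authentication, and the function names `build_socks5_connect`, `probe`, and `run` are made up for illustration.

```python
import socket
import struct
import time

SOCKS_HOST = "127.0.0.1"  # assumption: tor runs on the same machine
SOCKS_PORT = 9050         # assumption: default SocksPort


def build_socks5_connect(host: str, port: int) -> bytes:
    """Build a SOCKS5 CONNECT request that hands the hostname to the proxy."""
    name = host.encode("ascii")
    if len(name) > 255:
        raise ValueError("hostname too long for SOCKS5")
    # VER=5, CMD=1 (CONNECT), RSV=0, ATYP=3 (domain name), length, name, port
    return b"\x05\x01\x00\x03" + bytes([len(name)]) + name + struct.pack(">H", port)


def probe(host: str, port: int, timeout: float = 120.0) -> int:
    """Return the SOCKS5 reply code (0 means success), or -1 on socket error/timeout."""
    try:
        with socket.create_connection((SOCKS_HOST, SOCKS_PORT), timeout=timeout) as s:
            s.sendall(b"\x05\x01\x00")  # greeting: offer "no authentication" only
            if s.recv(2) != b"\x05\x00":
                return -1
            s.sendall(build_socks5_connect(host, port))
            reply = s.recv(4)
            return reply[1] if len(reply) >= 2 else -1
    except OSError:
        return -1


def run(targets, interval: float = 300.0) -> None:
    """Probe every (host, port) pair forever, logging one line per attempt."""
    while True:
        for host, port in targets:
            code = probe(host, port)
            print(time.strftime("%Y-%m-%dT%H:%M:%S"), host, port, "reply", code)
        time.sleep(interval)
```

Calling `run([("d.onion", 22), ("e.onion", 22), ("f.onion", 22)])` with the real addresses substituted for the placeholders, while the long-running copies are active, should show when the reply code stops being 0.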
Tests to isolate the problem:
- Testing the connections from other clients works, confirming that the servers d.onion, e.onion, f.onion are still up and that the problem is with the client-side tor.
- Forcing a SIGNAL NEWNYM on the client corrects the problem and the client can subsequently connect to d.onion, e.onion, f.onion. The bad side-effect is that all the long-running streams are disconnected (of course).
- Forcing very aggressive circuit rebuilding on the client-side tor mostly solves the problem but also causes more stream disconnections. I haven't been able to isolate which of these options actually fixes the problem, but I am providing them as further information for others:
KeepalivePeriod 1
LongLivedPorts
MaxCircuitDirtiness 30
NewCircuitPeriod 30
- Stream isolation (SocksPort ... IsolateDestAddr) did NOT seem to help.
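For anyone trying to reproduce the SIGNAL NEWNYM test above, the signal can be sent over tor's control port. The following is a minimal sketch under several assumptions: the control port is enabled (it is off by default) on 127.0.0.1:9051, password authentication is configured via HashedControlPassword, and the helper names `newnym_commands` and `send_newnym` are hypothetical.

```python
import socket


def newnym_commands(password: str) -> list:
    """Return the control-port command lines needed to request NEWNYM."""
    # Escape backslashes and double quotes so the password is a valid quoted string.
    quoted = password.replace("\\", "\\\\").replace('"', '\\"')
    return ['AUTHENTICATE "%s"' % quoted, "SIGNAL NEWNYM", "QUIT"]


def send_newnym(password: str, host: str = "127.0.0.1", port: int = 9051) -> list:
    """Send the commands and return the first reply line for each one.

    Assumption: host/port match the ControlPort setting in torrc.
    """
    replies = []
    with socket.create_connection((host, port), timeout=10) as s:
        f = s.makefile("rwb")
        for cmd in newnym_commands(password):
            f.write(cmd.encode("ascii") + b"\r\n")
            f.flush()
            replies.append(f.readline().decode("ascii", "replace").strip())
    return replies
```

On success each of the first two replies should start with `250`. If authentication fails, tor closes the connection, so the later writes or reads may raise an error or return empty lines.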
I suspect that stale circuits remain open for the long-running streams and that, even though they keep streaming, they can no longer resolve/route new .onion requests, so those requests fail.
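One way the stale-circuit suspicion might be checked is to look at the output of the control-port command `GETINFO circuit-status` while the failure is happening and compare circuit purposes and creation times. The sketch below only parses a single status line; fetching the reply is not shown. The sample line is made up for illustration (it is not output from an affected client), and `parse_circuit_line` is a hypothetical helper.

```python
# Made-up example line in the general shape of a circuit-status entry.
SAMPLE = (
    "12 BUILT $AAAA~relayA,$BBBB~relayB "
    "BUILD_FLAGS=IS_INTERNAL,NEED_CAPACITY PURPOSE=HS_CLIENT_REND "
    "HS_STATE=HSCR_JOINED TIME_CREATED=2020-01-01T00:00:00.000000"
)


def parse_circuit_line(line: str) -> dict:
    """Split one circuit-status line into id, status, path, and KEY=VALUE fields."""
    tokens = line.split()
    info = {"id": tokens[0], "status": tokens[1], "path": None}
    for tok in tokens[2:]:
        key, sep, value = tok.partition("=")
        if sep and not tok.startswith("$"):
            info[key] = value   # e.g. PURPOSE, HS_STATE, TIME_CREATED
        else:
            info["path"] = tok  # relay list; may be absent for new circuits
    return info


circuit = parse_circuit_line(SAMPLE)
print(circuit["id"], circuit["status"], circuit.get("PURPOSE"), circuit.get("TIME_CREATED"))
```

Logging these fields at intervals alongside the failed connection attempts might show whether old circuits are still in use when new .onion connections start failing.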
I think that tor with its default configuration should be robust in these situations, and connections to hosts should not fail because of an obscure .onion name resolution/routing failure. Most users will not be able to diagnose such a situation and will also have trouble resolving it, which ultimately gives the impression that .onion services are unreliable, regardless of whether the cause is a server-side .onion failure or a client-side .onion name resolution or routing failure.
I am not specifically looking for a solution, since I have many workarounds. I am simply trying to report the issue in a way that may help the tor project.
I have NOT tested the same problem using clearnet IP or domain name based servers, so I cannot report if it is specific to .onion or more general to all client-side tor circuits.