Fallback identity mismatches lead to awful behavior
While testing #329, I found a case that seems to be almost fractally bad. When every fallback has an unexpected ed25519 identity, we get enormous error messages. Here is part of one:
2022-03-11T21:03:56.242150Z WARN tor_dirmgr::bootstrap: error while
downloading: DirClientError(CircMgr(RequestFailed(RetryError { doing: "find
or build a circuit", errors: [(Single(1), Channel { peer: OwnedChanTarget {
addrs: [127.0.0.1:5001], ed_identity: Ed25519Identity {
7oQU1z0DV6+pLNNOfsEJ2bAZH1oaDa3IibkE0vHWg8M }, rsa_identity: RsaIdentity {
$96228149133d75383e4ae471b6ee80c853d6fb6a } }, cause:
Proto(HandshakeProto("Peer ed25519 id not as expected")) }), (Single(2),
Channel { peer: OwnedChanTarget { addrs: [127.0.0.1:5000], ed_identity:
Ed25519Identity { PYct0gFHas1Hofua7s5Iwhb/pA2IjwOF2RcXjxnhwmc },
rsa_identity: RsaIdentity { $cac2e36f67c817c2ecb6284a5af7a8c6c119f5f7 } },
cause: Proto(HandshakeProto("Peer ed25519 id not as expected")) }),
(Single(3), Channel { peer: OwnedChanTarget { addrs: [127.0.0.1:5002],
ed_identity: Ed25519Identity { Baj8TtEMVhFUAcVq0dVVjS4PRubKb890bqOGfHAXQhk },
rsa_identity: RsaIdentity { $8218c2079fe2a2f7291f88afd95871f4608c57f5 } },
cause: Proto(HandshakeProto("Peer ed25519 id not as expected")) }),
(Single(4), Channel { peer: OwnedChanTarget { addrs: [127.0.0.1:5000],
ed_identity: Ed25519Identity { PYct0gFHas1Hofua7s5Iwhb/pA2IjwOF2RcXjxnhwmc },
rsa_identity: RsaIdentity { $cac2e36f67c817c2ecb6284a5af7a8c6c119f5f7 } },
cause: Proto(HandshakeProto("Peer ed25519 id not as expected")) }),
(Single(5), Channel { peer: OwnedChanTarget { addrs: [127.0.0.1:5002],
ed_identity: Ed25519Identity { Baj8TtEMVhFUAcVq0dVVjS4PRubKb890bqOGfHAXQhk },
rsa_identity: RsaIdentity { $8218c2079fe2a2f7291f88afd95871f4608c57f5 } },
cause: Proto(HandshakeProto("Peer ed25519 id not as expected")) }),
...
(Single(95), Channel { peer: OwnedChanTarget { addrs: [127.0.0.1:5002],
ed_identity: Ed25519Identity { Baj8TtEMVhFUAcVq0dVVjS4PRubKb890bqOGfHAXQhk },
rsa_identity: RsaIdentity { $8218c2079fe2a2f7291f88afd95871f4608c57f5 } },
cause: Proto(HandshakeProto("Peer ed25519 id not as expected")) }),
(Single(96), Channel { peer: OwnedChanTarget { addrs: [127.0.0.1:5002],
ed_identity: Ed25519Identity { Baj8TtEMVhFUAcVq0dVVjS4PRubKb890bqOGfHAXQhk },
rsa_identity: RsaIdentity { $8218c2079fe2a2f7291f88afd95871f4608c57f5 } },
cause: Proto(HandshakeProto("Peer ed25519 id not as expected")) })],
n_errors: 96 })))
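Probably we are retrying too aggressively: the error above records 96 failed attempts (n_errors: 96) in a single RetryError, and the attempts shown all go to the same three local addresses. It is not yet clear to me where we need to add a backoff, but we should definitely add one.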
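For illustration only, here is a minimal sketch of the kind of backoff I have in mind. It is not taken from the Arti codebase: the `Backoff` type, its fields, and its methods are hypothetical names, and the sketch assumes a plain doubling delay with a cap. A real version would probably also want randomized jitter, which is left out here to keep the example free of external crates.

```rust
use std::time::Duration;

/// Hypothetical helper (not an existing Arti type): tracks consecutive
/// failures and yields an exponentially growing, capped retry delay.
#[derive(Debug, Clone)]
pub struct Backoff {
    /// Delay to use after the first failure.
    base: Duration,
    /// Upper bound on any returned delay.
    max: Duration,
    /// Number of failures recorded since the last reset.
    failures: u32,
}

impl Backoff {
    pub fn new(base: Duration, max: Duration) -> Self {
        Backoff { base, max, failures: 0 }
    }

    /// Record one failure and return how long to wait before retrying.
    pub fn next_delay(&mut self) -> Duration {
        // 2^failures, saturating so a long failure streak cannot overflow.
        let factor = 2u32.checked_pow(self.failures).unwrap_or(u32::MAX);
        self.failures = self.failures.saturating_add(1);
        // Multiply the base delay, falling back to the cap on overflow,
        // then clamp to the cap.
        self.base
            .checked_mul(factor)
            .unwrap_or(self.max)
            .min(self.max)
    }

    /// Call after a success so the next failure starts from `base` again.
    pub fn reset(&mut self) {
        self.failures = 0;
    }
}
```

With a base of 100 ms and a cap of 5 s, successive failures would wait 100 ms, 200 ms, 400 ms, and so on until the cap is reached. Where such a delay belongs (channel manager, circuit manager, or directory manager) is exactly the open question in this ticket.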
This ticket will probably get more specific as we investigate further.