High INTRO1 failure rate with TestingTorNetwork
In Shadow simulations with TestingTorNetwork set, hidden service clients see many errors like:
Jan 01 00:06:02.017 [info] handle_introduce_ack_bad(): Received INTRODUCE_ACK nack by $8A269A69067A353059B3C24C0316A3DCA8B3CE19~ [VJtO38jK4XYxnRFX7LOhHTf+kJaGynhhbeeJ21rk30A] at 202.61.225.95. Reason: 1
In the intro point logs, we can see the corresponding log entries such as:
Jan 01 00:05:21.858 [info] handle_introduce1(): No intro circuit found for INTRODUCE1 cell with auth key df0+MAxHZZTPzG4LFC+Pdu1r3mPSRMl6d5GGttl1wmQ from circuit 1033772687. Responding with NACK.
It looks like one effect of TestingTorNetwork is to set the minimum and maximum intro point lifetimes to 10s and 30s. Removing those overrides seems to make the problem go away. See https://gitlab.torproject.org/tpo/core/tor/-/blob/main/src/feature/hs/hs_service.c?ref_type=heads#L431
So one solution is to remove those overrides permanently, increase them, or turn them into separate Testing* config params.
It might be worth checking, though, whether the client behavior ought to be improved. @arma thinks these aggressive parameters, combined with a relatively small network, might be causing a particularly bad situation for the client's failure cache. It's probably worth understanding what's going on there and improving it, even if the issue is less likely in production, where intro point rollover is less aggressive.
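For a rough feel of why 10s to 30s lifetimes could produce this many NACKs, here is a toy model in Python. It is not how tor or Shadow actually behave; it is a back-of-the-envelope sketch under these assumptions: each intro point lives for a duration drawn uniformly between the min and max lifetime, the client learns about the intro point at a uniformly random moment during that lifetime, and the INTRODUCE1 cell arrives a fixed `delay` seconds later. The function name, the 5s delay, and the 18h to 24h "production-like" lifetimes are all illustrative assumptions that should be checked against hs_service.c.

```python
import math


def stale_probability(min_life: float, max_life: float, delay: float) -> float:
    """Toy estimate of P(intro point already expired when INTRODUCE1 arrives).

    Hypothetical model, not tor's real logic: lifetime L ~ Uniform(min_life,
    max_life), the client learns the intro point at a uniform time in [0, L],
    and sends INTRODUCE1 `delay` seconds later. Given L, the chance of
    arriving too late is min(1, delay / L).
    """
    if min_life <= 0 or max_life < min_life:
        raise ValueError("need 0 < min_life <= max_life")
    if delay <= 0:
        return 0.0
    if delay >= max_life:
        return 1.0
    if min_life == max_life:
        return delay / min_life

    width = max_life - min_life
    # Lifetimes shorter than the delay are always stale.
    lo = max(min_life, delay)
    always_stale = (lo - min_life) / width
    # For lifetimes in [lo, max_life], integrate delay / L over the uniform density.
    sometimes_stale = delay * math.log(max_life / lo) / width
    return always_stale + sometimes_stale


if __name__ == "__main__":
    delay = 5.0  # assumed seconds between learning the intro point and INTRODUCE1
    print("testing-like (10s..30s):", stale_probability(10, 30, delay))
    print("production-like (18h..24h, assumed):",
          stale_probability(18 * 3600, 24 * 3600, delay))
```

With the assumed 5s delay, this model puts the stale fraction at roughly 27% for 10s to 30s lifetimes, versus well under 0.01% for lifetimes measured in hours. It ignores the failure cache, descriptor re-upload timing, and retries, so it only suggests the order of magnitude, not the rate actually seen in the simulation.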