arti client has high error rate doing 5 MiB transfers with onion services
We recently changed the default test transfer size in "chutney verify" to 5 MiB in both directions, to ensure that SENDME windows and related flow control are properly exercised.
Unfortunately, this seems to result in high error rates for arti clients connecting to onion services. I've reproduced this in chutney, in shadow, and in chutney under shadow.
In chutney (@ head, where the default xfer size is now 5 MiB):
$ CHUTNEY_ARTI=/home/jnewsome/projects/arti/target/x86_64-unknown-linux-gnu/debug/arti-extra tools/test-network.sh --hs-multi-client 1 --flavor hs-v3-arti
...
INFO:chutney.Traffic:Timed out
INFO:chutney.Traffic:Status:
exit via test009ct send-data-73: success (Flushed)
exit via test009ct check-74: success (successful verification)
exit via test011ca send-data-75: success (Flushed)
exit via test011ca check-76: success (successful verification)
test010ht via test009ct send-data-77: success (Flushed)
test010ht via test009ct check-78: success (successful verification)
test010ht via test011ca send-data-79: success (Flushed)
test010ht via test011ca check-80: not done (consumed some)
test012ha via test009ct send-data-81: success (Flushed)
test012ha via test009ct check-82: not done (consumed some)
test012ha via test011ca send-data-83: success (Flushed)
test012ha via test011ca check-84: not done (consumed some)
not-done:3 successes:9 failures:0
INFO:chutney.Traffic:Failure for test010ht via test011ca check-80
INFO:chutney.Traffic:Failure for test012ha via test009ct check-82
INFO:chutney.Traffic:Failure for test012ha via test011ca check-84
Transmission: Failure
The chutney "verify" test has the client send the specified number of bytes to a server, which echoes the same bytes back. Above, all transfers to the servers (send-data-*) succeeded, but the test timed out before all of the data came back (check-*) in three cases: the arti client talking to the tor HS (check-80), the tor client talking to the arti HS (check-82), and the arti client talking to the arti HS (check-84). check-78, which is the tor client talking to the tor HS, succeeds.
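To make the send/check split concrete, here is a minimal sketch of an echo round trip over a plain local TCP socket. It is not chutney's implementation: `echo_server` and `verify_echo` are hypothetical names, there is no tor or arti in the path, and the status strings only mimic the log above. The sender runs in its own thread so that sending can finish ("Flushed") independently of whether all the echoed bytes are read back before the timeout.

```python
import os
import socket
import threading


def echo_server(listener):
    """Accept one connection and echo every byte back until EOF."""
    with listener:
        conn, _ = listener.accept()
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            conn.sendall(chunk)


def verify_echo(nbytes, timeout=10.0):
    """Send nbytes to a local echo server and check what comes back.

    Hypothetical helper: the returned strings imitate the chutney log,
    they are not produced by chutney code.
    """
    payload = os.urandom(nbytes)
    listener = socket.create_server(("127.0.0.1", 0))
    address = listener.getsockname()
    threading.Thread(target=echo_server, args=(listener,), daemon=True).start()

    client = socket.create_connection(address, timeout=timeout)

    def send_all():
        # "send-data" half: push everything, then signal EOF.
        try:
            client.sendall(payload)
            client.shutdown(socket.SHUT_WR)
        except OSError:
            pass  # the receiving side gave up and closed the socket

    sender = threading.Thread(target=send_all, daemon=True)
    sender.start()

    # "check" half: read until we have everything, hit EOF, or time out.
    received = bytearray()
    try:
        while len(received) < nbytes:
            chunk = client.recv(65536)
            if not chunk:
                break
            received += chunk
    except socket.timeout:
        pass
    finally:
        sender.join(timeout)
        client.close()

    if bytes(received) == payload:
        return "success (successful verification)"
    return "not done (consumed some)" if received else "not done"


if __name__ == "__main__":
    print(verify_echo(5 * 1024 * 1024))
```

In the failing runs above, the equivalent of the receive loop is what never completes: some bytes arrive, but not all of them before the timeout.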
Conversely, the same test with the old transfer size of 10 KB succeeds:
$ CHUTNEY_ARTI=/home/jnewsome/projects/arti/target/x86_64-unknown-linux-gnu/debug/arti-extra tools/test-network.sh --data 10000 --hs-multi-client 1 --flavor hs-v3-arti
...
INFO:chutney.Traffic:Status:
exit via test009ct send-data-1: success (Flushed)
exit via test009ct check-2: success (successful verification)
exit via test011ca send-data-3: success (Flushed)
exit via test011ca check-4: success (successful verification)
test010ht via test009ct send-data-5: success (Flushed)
test010ht via test009ct check-6: success (successful verification)
test010ht via test011ca send-data-7: success (Flushed)
test010ht via test011ca check-8: success (successful verification)
test012ha via test009ct send-data-9: success (Flushed)
test012ha via test009ct check-10: success (successful verification)
test012ha via test011ca send-data-11: success (Flushed)
test012ha via test011ca check-12: success (successful verification)
not-done:0 successes:12 failures:0
Transmission: Success
Completed verify round 1/1 in this bootstrap
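For a rough sense of why the 10 KB run might not exercise the same paths, here is a back-of-envelope estimate. It assumes the classic fixed-window flow control from tor-spec: at most 498 payload bytes per RELAY_DATA cell, a stream-level SENDME every 50 cells against a 500-cell window, and a circuit-level SENDME every 100 cells against a 1000-cell window. These constants are assumptions of the sketch, it ignores congestion control if that is negotiated, and `flow_control_estimate` is a hypothetical helper.

```python
# Assumed constants (classic fixed-window flow control, per tor-spec).
RELAY_DATA_PAYLOAD = 498      # max payload bytes per RELAY_DATA cell
STREAM_WINDOW = 500           # initial stream-level window, in cells
STREAM_SENDME_EVERY = 50      # stream-level SENDME interval, in cells
CIRC_WINDOW = 1000            # initial circuit-level window, in cells
CIRC_SENDME_EVERY = 100       # circuit-level SENDME interval, in cells


def flow_control_estimate(nbytes):
    """Estimate cells and SENDMEs for one direction of a transfer."""
    cells = -(-nbytes // RELAY_DATA_PAYLOAD)  # integer ceiling division
    return {
        "data_cells": cells,
        "stream_sendmes": cells // STREAM_SENDME_EVERY,
        "circuit_sendmes": cells // CIRC_SENDME_EVERY,
        "exceeds_stream_window": cells > STREAM_WINDOW,
        "exceeds_circuit_window": cells > CIRC_WINDOW,
    }


if __name__ == "__main__":
    for label, size in [("10000 bytes", 10_000), ("5 MiB", 5 * 1024 * 1024)]:
        print(label, flow_control_estimate(size))
```

Under those assumptions, 10000 bytes fits in 21 cells and never triggers a SENDME in either window, while 5 MiB needs 10528 cells per direction, roughly 210 stream-level and 105 circuit-level SENDMEs. So the small transfer can succeed even if window handling is broken somewhere.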
This can also be reproduced in arti's integration-shadow test by increasing the transfer size in the tgen configuration files, such as:
diff --git a/tests/shadow/conf/tgen.artionionclient-auth.graphml.xml b/tests/shadow/conf/tgen.artionionclient-auth.graphml.xml
index 8bba4e72a..300b724e9 100644
--- a/tests/shadow/conf/tgen.artionionclient-auth.graphml.xml
+++ b/tests/shadow/conf/tgen.artionionclient-auth.graphml.xml
@@ -11,8 +11,8 @@
<data key="d7">localhost:9000</data>
</node>
<node id="stream">
- <data key="d2">1 MiB</data>
- <data key="d3">1 KiB</data>
+ <data key="d2">5 MB</data>
+ <data key="d3">5 MB</data>
</node>
<node id="pause">
<data key="d0">1,2,3,4,5,6,7,8,9,10</data>