Investigate slightly higher incidence of large queues in negotiation branch
The negotiation branch is reliably causing slightly more relays to have cell queues over 1000 cells than before negotiation.
This can be seen in Shadow using a patch that logs once per second whenever a relay's cell queues exceed 1000 cells: mikeperry/tor@1174255b
The output of that logline is gathered in the sim-results repo: https://gitlab.torproject.org/jnewsome/sim-results
Output can be analyzed with https://gitlab.torproject.org/jnewsome/sim-results/-/blob/main/overload_stats.py
That script gives counts, as tuples of (circs, guards, middles, exits),
of how many circuits and relays exceed 1k, 2k, 3k, 4k, and >5k cells in their relay queues.
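For concreteness, here is a rough sketch (not the actual script) of the kind of per-bucket tallying overload_stats.py performs on those loglines. The logline format, field names, role labels, and the bucketing are all assumptions for illustration; the real format is whatever the patch above emits.

```python
import re
from collections import defaultdict

# Hypothetical logline emitted once per second by the queue-logging patch:
#   "QUEUES relay=<fp> role=<guard|middle|exit> circ=<id> cells=<n>"
LINE_RE = re.compile(r"QUEUES relay=(\S+) role=(\S+) circ=(\S+) cells=(\d+)")

BUCKETS = ["1k", "2k", "3k", "4k", "Nk"]  # "Nk" = queues over 5k cells


def bucket_for(cells):
    """Map a peak queue length to a 1k-wide bucket ("Nk" = over 5k cells)."""
    if cells <= 1000:
        return None
    return BUCKETS[min((cells - 1) // 1000 - 1, 4)]


def tally(log_lines):
    # Track the largest queue length seen per circuit and per (relay, role),
    # then count how many fall into each bucket.
    peak_circ = defaultdict(int)
    peak_relay = defaultdict(int)
    for line in log_lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        relay, role, circ, cells = m[1], m[2], m[3], int(m[4])
        peak_circ[circ] = max(peak_circ[circ], cells)
        peak_relay[(relay, role)] = max(peak_relay[(relay, role)], cells)

    # counts[bucket] = [circs, guards, middles, exits]
    counts = {b: [0, 0, 0, 0] for b in BUCKETS}
    role_index = {"guard": 1, "middle": 2, "exit": 3}
    for cells in peak_circ.values():
        b = bucket_for(cells)
        if b:
            counts[b][0] += 1
    for (relay, role), cells in peak_relay.items():
        b = bucket_for(cells)
        if b:
            counts[b][role_index[role]] += 1

    print("Circs, Relays:")
    for b in BUCKETS:
        print("%s: (%d, %d, %d, %d)" % (b, *counts[b]))
```

The real script's aggregation may differ; the point is just how the (circs, guards, middles, exits) tuples in the output below are structured.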
Here is output from the best and worst runs with negotiation, followed by runs from before negotiation, all with the same parameters:
./check_overload.sh 2022-02-12-main-negotiation-mr525-27141/
Run 2:
Circs, Relays:
1k: (236, 102, 29, 0)
2k: (5, 4, 1, 0)
3k: (0, 0, 0, 0)
4k: (0, 0, 0, 0)
Nk: (0, 0, 0, 0)
Run 3:
Circs, Relays:
1k: (181, 82, 29, 0)
2k: (5, 2, 3, 0)
3k: (1, 0, 1, 0)
4k: (0, 0, 0, 0)
Nk: (0, 0, 0, 0)
And here are two example runs from before negotiation, with the same parameters:
./check_overload.sh 2022-02-12-main-nocmux-ss50-icw4-newma10d2-delta5-mr525base-27140
Run 1:
Circs, Relays:
1k: (113, 56, 18, 0)
2k: (6, 5, 0, 0)
3k: (0, 0, 0, 0)
4k: (0, 0, 0, 0)
Nk: (0, 0, 0, 0)
Run 2:
Circs, Relays:
1k: (137, 85, 19, 0)
2k: (5, 5, 0, 0)
3k: (1, 1, 0, 0)
4k: (0, 0, 0, 0)
Nk: (0, 0, 0, 0)
We have tried ruling out many potential causes, but have not found the root cause yet. That progress is documented in Section 5 of the sim plan: https://gitlab.torproject.org/mikeperry/tor/-/blob/cc_shadow_experiments_v2/SHADOW_EXPERIMENTS.txt#L995
The remaining possibilities seem to be negotiation failure, or something that desynchronizes the congestion control objects. Whatever it is, it is very subtle, and has a small but reliably noticeable effect.