So this means that in this case, sbws would have picked any exit that was not a BadExit, had an acceptable ExitPolicy, and had a consensus weight of at least, well, 2. That's not a lot.
As it turns out, something like 10% of exits have under a 600Kbyte/sec advertised bandwidth. So it seems pretty easy from this weight=1 bootstrap scenario to get paired with an exit that will give poor test results.
Perhaps bwauth path selection should also choose a testing pair from exits/relays with a certain absolute minimum of weight or advertised bandwidth?
Note: Torflow's partitions have a similar issue, but it's actually worse: a relay can get stuck in a low-bandwidth partition forever.
So perhaps this isn't strictly a blocker, because the new behaviour is eventually better. But the fix is so easy, we should just do it.
I suggest we only use the top 75% of exits as the second hop, which should remove the long tail of slow exits. (These slow exits are also more likely to fail to connect, and therefore fail the entire bandwidth test. So this change should also improve sbws efficiency.)
I've not checked which consensus weight corresponds to 600Kbyte/sec advertised bandwidth.
I'm not sure what teor meant by "top 75% of exits", i.e. which bandwidth value they had in mind.
Looking at https://metrics.torproject.org/rs.html#toprelays, showing 100 relays per page: the page is titled "Top Relays by Consensus Weight", the bandwidth column is labelled "Advertised bandwidth", and the first relay on the 3rd page (~1/3 ~= 75%) shows 58.1 MiB/s.
If we want to use a hardcoded minimum value, we should look up the actual consensus bandwidth of that first relay.
Instead, since bandwidth values change and we choose exits based on consensus weight, I'd suggest:
create a method in relaylist that orders the exits by consensus weight.
You could base it on RelayList.exits, then order them with sorted(self.relays, key=lambda r: r.consensus_bandwidth)
calculate which relay corresponds to the 75% mark in that order
use that relay's consensus bandwidth as the minimum value
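The steps above could be sketched roughly as follows. This is only an illustration, not sbws code: the Relay class and both function names are hypothetical stand-ins, and it assumes "top 75%" means keeping the three quarters of exits with the highest consensus bandwidth.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Relay:
    """Hypothetical stand-in for an sbws relay; only the fields used here."""
    nickname: str
    consensus_bandwidth: int


def min_bandwidth_for_top_fraction(exits: List[Relay], fraction: float = 0.75) -> int:
    """Return the consensus bandwidth of the slowest exit in the top `fraction`."""
    if not exits:
        return 0
    # Highest consensus bandwidth first.
    ordered = sorted(exits, key=lambda r: r.consensus_bandwidth, reverse=True)
    # Number of exits to keep; always keep at least one.
    keep = max(1, int(len(ordered) * fraction))
    return ordered[keep - 1].consensus_bandwidth


def exits_above_minimum(exits: List[Relay], fraction: float = 0.75) -> List[Relay]:
    """Filter out the long tail of exits below the calculated minimum."""
    minimum = min_bandwidth_for_top_fraction(exits, fraction)
    return [r for r in exits if r.consensus_bandwidth >= minimum]
```

Exits that tie with the threshold value are kept, so the result can contain slightly more than 75% of the list.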
Hope it helps!
Yes, I meant a calculated value.
Ideally, we should calculate the value each time we get a new consensus. If we recalculate it for every circuit, it might slow down sbws circuit building.
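One way to avoid recalculating for every circuit is to cache the minimum and refresh it only when the consensus changes. The sketch below is hypothetical: the class name, the consensus_id argument (for example the consensus valid-after time), and the plain list of bandwidths are assumptions for illustration, not existing sbws interfaces.

```python
_UNSET = object()  # sentinel: no consensus seen yet


class ExitMinimumCache:
    """Cache the minimum exit bandwidth, recomputing once per consensus."""

    def __init__(self, fraction=0.75):
        self._fraction = fraction
        self._consensus_id = _UNSET
        self._minimum = 0
        self.recalculations = 0  # exposed only to make the caching visible

    def minimum(self, consensus_id, bandwidths):
        """Return the cached minimum, refreshing it if the consensus changed."""
        if consensus_id != self._consensus_id:
            ordered = sorted(bandwidths, reverse=True)
            if ordered:
                # Keep the top fraction, at least one exit.
                keep = max(1, int(len(ordered) * self._fraction))
                self._minimum = ordered[keep - 1]
            else:
                self._minimum = 0
            self._consensus_id = consensus_id
            self.recalculations += 1
        return self._minimum
```

Circuit building would then call minimum() with the current consensus identifier each time, and only the first call after a new consensus pays for the sort.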
bug_33009_v5 in my public sbws repo has two patches for review (I've tried to create a merge request for torproject/network-health/sbws maint-1.1 but failed so far; additionally the two patches are based on the branch for legacy/trac#30905 (moved) which needs revision, so maybe having a merge request does not make so much sense).
That said, while this is up for review I am still thinking about a meaningful integration test, but I don't think the lack of one is a blocker for an initial review, in particular given that our current bw authorities situation might make this bug fix more urgent to deploy.
Trac: Reviewer: N/A to juga; Status: new to needs_review; Actualpoints: N/A to 2.5