Right, so the bug here appears to be that rdsys has an "underlying" bucket it wants to put each bridge into, and you can override that bucket by specifying BridgeDistribution, but as soon as you stop specifying it, you fall back into the original bucket.
Maybe the more expected behavior is to tag the fingerprint inside rdsys so it stays in the (most recent) bucket it requested, once it requests one?
I was told by arma to post this here, but maybe it requires its own issue; if so, feel free to copy everything below.
I've processed data from CollecTor for 2021-08-01 to now (2022-05-02) to figure out whether the distribution method is stable over time, and how it changed with rdsys. I found only two events with many bridges changing their distribution method: at 2022-02-28 11:20:18 UTC, 1965 bridges changed distribution method (rdsys first introduction), and at 2022-03-04 12:27:44 UTC, 144 bridges did (fixing #93 (closed)).
There were, however, many smaller events. On 2022-02-18 13:30:38 (before rdsys), 5 bridges changed distribution mechanism; that's the biggest change in the studied time range before rdsys (the others being one or two bridges changing). After the rdsys introduction, however, there have been 22 events where more than 20 bridges changed, and hundreds with 5 bridges or more.
event list size by count
196 5 bridges changed distribution method
210 6 bridges changed distribution method
275 7 bridges changed distribution method
200 8 bridges changed distribution method
152 9 bridges changed distribution method
133 10 bridges changed distribution method
95 11 bridges changed distribution method
79 12 bridges changed distribution method
69 13 bridges changed distribution method
38 14 bridges changed distribution method
17 15 bridges changed distribution method
30 16 bridges changed distribution method
22 17 bridges changed distribution method
16 18 bridges changed distribution method
5 19 bridges changed distribution method
2 20 bridges changed distribution method
3 21 bridges changed distribution method
1 22 bridges changed distribution method
1 23 bridges changed distribution method
5 25 bridges changed distribution method
1 26 bridges changed distribution method
2 29 bridges changed distribution method
1 31 bridges changed distribution method
2 34 bridges changed distribution method
6 37 bridges changed distribution method
events with > 20 changes, including timestamp
between 2022-03-02 19:34:26 UTC and 2022-03-02 20:04:26 UTC, 22 bridges changed distribution method
between 2022-03-03 04:04:26 UTC and 2022-03-03 04:34:26 UTC, 21 bridges changed distribution method
between 2022-03-04 04:34:26 UTC and 2022-03-04 05:04:26 UTC, 37 bridges changed distribution method
between 2022-03-04 05:04:26 UTC and 2022-03-04 05:34:26 UTC, 34 bridges changed distribution method
between 2022-03-04 06:34:26 UTC and 2022-03-04 07:04:26 UTC, 37 bridges changed distribution method
between 2022-03-04 07:04:26 UTC and 2022-03-04 07:34:26 UTC, 37 bridges changed distribution method
between 2022-03-04 07:34:26 UTC and 2022-03-04 08:04:26 UTC, 37 bridges changed distribution method
between 2022-03-04 08:04:26 UTC and 2022-03-04 08:34:26 UTC, 25 bridges changed distribution method
between 2022-03-04 08:34:26 UTC and 2022-03-04 09:04:26 UTC, 25 bridges changed distribution method
between 2022-03-04 09:04:26 UTC and 2022-03-04 09:34:26 UTC, 25 bridges changed distribution method
between 2022-03-04 09:34:26 UTC and 2022-03-04 10:04:26 UTC, 25 bridges changed distribution method
between 2022-03-04 10:04:26 UTC and 2022-03-04 10:34:26 UTC, 25 bridges changed distribution method
between 2022-03-04 10:34:26 UTC and 2022-03-04 11:04:26 UTC, 26 bridges changed distribution method
between 2022-03-04 11:34:26 UTC and 2022-03-04 12:04:26 UTC, 37 bridges changed distribution method
between 2022-03-06 10:23:07 UTC and 2022-03-06 10:53:07 UTC, 21 bridges changed distribution method
between 2022-03-06 10:53:07 UTC and 2022-03-06 11:23:07 UTC, 23 bridges changed distribution method
between 2022-03-08 06:53:07 UTC and 2022-03-08 07:23:07 UTC, 21 bridges changed distribution method
between 2022-03-21 20:33:42 UTC and 2022-03-21 20:44:08 UTC, 31 bridges changed distribution method
between 2022-03-24 17:44:07 UTC and 2022-03-25 14:24:57 UTC, 29 bridges changed distribution method
between 2022-03-28 19:54:57 UTC and 2022-03-28 20:24:57 UTC, 29 bridges changed distribution method
between 2022-03-28 20:24:57 UTC and 2022-03-28 20:54:57 UTC, 34 bridges changed distribution method
between 2022-03-28 20:54:57 UTC and 2022-03-28 21:18:08 UTC, 37 bridges changed distribution method
Some of the bridges changing distribution in these small events appear only once or twice, but some appear a lot more often.
The 1st, 2nd and 6th most frequently changing relays alternate between telegram and moat; the 3rd, 4th and 5th oscillate between email, https, moat, reserved, settings and telegram (I haven't looked at relays further down the list).
Methodology:
Numbers were obtained by looking at pairs of consecutive descriptors, ignoring empty ones (Serge reboots). If a bridge had a distribution mechanism, disappeared for a single cycle and reappeared with a new distribution mechanism, this wouldn't be detected as a change. I can update my scripts to handle this case if you think this would be of interest too.
Everything was obtained using this code, with some modifications to output each stat. I've provided a CSV containing every bridge distribution change that happened as part of a ">=5 bridges changed at once" event, excluding the two known events (the rdsys introduction and rdsys acknowledging bridge-distribution-request), to allow further processing by those not familiar with Rust.
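To make the counting rule concrete, here is a minimal Python sketch of it (not the Rust code linked below). It assumes each snapshot has already been parsed into a mapping from fingerprint to distribution method; that shape, the name `count_changes` and the sample fingerprints "A" and "B" are made up for illustration.

```python
from typing import Dict, List

# Hypothetical snapshot shape: fingerprint -> distribution method at one point in time.
Snapshot = Dict[str, str]


def count_changes(snapshots: List[Snapshot]) -> List[int]:
    """Count, for each pair of consecutive non-empty snapshots, the bridges
    present in both whose distribution method differs."""
    # Drop empty snapshots first, so the snapshots around a gap are compared directly.
    non_empty = [s for s in snapshots if s]
    counts = []
    for prev, curr in zip(non_empty, non_empty[1:]):
        changed = sum(
            1
            for fingerprint, method in curr.items()
            if fingerprint in prev and prev[fingerprint] != method
        )
        counts.append(changed)
    return counts


snapshots = [
    {"A": "moat", "B": "https"},
    {},                               # empty snapshot, skipped
    {"A": "telegram", "B": "https"},  # A changed: counted
    {"B": "https"},                   # A missing for one cycle
    {"A": "moat", "B": "https"},      # A is back with a new method: not counted
]
print(count_changes(snapshots))  # prints [1, 0, 0]
```

The last two snapshots show the blind spot described above: a bridge that is absent for one cycle is never compared against its earlier value.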
Some of the rdsys rotations might be related to restarts of rdsys, as before #104 (closed) rdsys was only accepting changes on bridge-distribution-request on restarts.
I was expecting all rotations to depend on users changing their bridge-distribution-request, but I would be surprised if an operator did that over a thousand times. Digging into the code, it looks like our assignment function depends on the full bridgeline and not just on the fingerprint. So I bet the bridges that change their assignment so often are on dynamic IP addresses and get a new assignment each time their IP rotates.
I don't think we have the history of the IP addresses of the bridges, but I'm going to monitor those bridges you found rotating more often and see if their IPs change.
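A small Python sketch of why that would matter. This is not the rdsys code: the hash function, the bridgeline format, the distributor list and the fingerprint are all assumptions chosen for illustration, and the addresses come from the 192.0.2.0/24 documentation range.

```python
import hashlib

# Assumed list of distributors, taken from the names seen in this thread.
DISTRIBUTORS = ["https", "moat", "email", "telegram", "settings", "reserved"]


def bucket(key: str) -> str:
    """Map an arbitrary string to a distributor via a stable hash (assumed scheme)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return DISTRIBUTORS[int.from_bytes(digest[:8], "big") % len(DISTRIBUTORS)]


def assign_by_bridgeline(address: str, port: int, fingerprint: str) -> str:
    # Keyed on the whole (hypothetical) bridgeline: any IP or port change re-rolls the bucket.
    return bucket(f"{address}:{port} {fingerprint}")


def assign_by_fingerprint(fingerprint: str) -> str:
    # Keyed on the fingerprint only: stable across IP and port changes.
    return bucket(fingerprint)


fingerprint = "0123456789ABCDEF0123456789ABCDEF01234567"  # made-up fingerprint
addresses = [f"192.0.2.{i}" for i in range(1, 51)]

by_bridgeline = {assign_by_bridgeline(a, 443, fingerprint) for a in addresses}
by_fingerprint = {assign_by_fingerprint(fingerprint) for _ in addresses}

print("distinct buckets when keyed on bridgeline:", len(by_bridgeline))
print("distinct buckets when keyed on fingerprint:", len(by_fingerprint))
```

With a bridgeline-keyed hash, a bridge on a dynamic IP lands in a fresh bucket on most rotations, which would match the oscillation seen in the data; keyed on the fingerprint alone, it stays put.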
I don't think we have the history of the IP addresses of the bridges
There is no public history (and I'd assume no private history either); however, sanitized descriptors contain sanitized IPs, which change whenever the real IPs change (as well as on the first of the month). As far as I can see, CollecTor contains only one descriptor per bridge per month in its archive part, so we can't get old data from there (but maybe there is more such data stored somewhere else?); it's however possible to get a few days' worth of data at 1h granularity. Apparently I can't see that far and the data is actually on CollecTor, thanks @arma for pointing it out.
I took a very quick look: 0a9a441bf34d17021dc6c3162405249b3f0c9444 appears twice for some timestamps (for instance this descriptor), with different IPv4 addresses, and one entry has an IPv6 address while the other does not.
There seem to be other cases: 173F2AF218E4AEE389CA49BA54B9074538578C0A, which is brand new, has changed distribution 25 times as of now (between telegram and settings), but its IP hasn't changed.
On 2022-05-12, between two successive bridge-server-descriptors at 19:56:48 and 22:58:48, it switched from settings to telegram, back to settings, and to telegram again (at 19:58:49, 20:28:49 and 21:28:49).
Yes, keeping the assignments stable would be nice to have. We didn't do it (yet) to keep the implementation simpler; I didn't expect bridges to change their distribution mechanism so often. This will require adding some storage/database to rdsys to keep track of the assignments (#56 (closed)).
My proposal for it would be:
1. If bridge-distribution-request is set, use the requested distributor and save it in the db.
2. Otherwise, if there is an assignment configured for that fingerprint in the db, use it.
3. Otherwise, calculate the assignment from the fingerprint/bridgeline and save it into the db.
So once we have seen a bridge, its distribution mechanism can't change unless the operator changes the bridge-distribution-request.
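The three steps above could be sketched roughly as follows, in Python for brevity rather than as actual rdsys code. The in-memory dict standing in for the db, the `resolve` and `hash_assignment` names, the distributor list and the fingerprints are all hypothetical; treating `None` as "no specific request" is also an assumption of this sketch.

```python
import hashlib
from typing import Dict, Optional

# Assumed list of distributors, taken from the names seen in this thread.
DISTRIBUTORS = ["https", "moat", "email", "telegram", "settings", "reserved"]


def hash_assignment(fingerprint: str) -> str:
    """Fallback assignment computed from the fingerprint (assumed hashing scheme)."""
    digest = hashlib.sha256(fingerprint.encode("utf-8")).digest()
    return DISTRIBUTORS[int.from_bytes(digest[:8], "big") % len(DISTRIBUTORS)]


def resolve(db: Dict[str, str], fingerprint: str, requested: Optional[str]) -> str:
    """Return the distributor for a bridge, persisting the decision in db."""
    # Step 1: an explicit request always wins and is remembered.
    if requested is not None:
        db[fingerprint] = requested
        return requested
    # Step 2: no request, but the bridge was seen before: keep its assignment.
    if fingerprint in db:
        return db[fingerprint]
    # Step 3: first sighting without a request: compute once and remember.
    assigned = hash_assignment(fingerprint)
    db[fingerprint] = assigned
    return assigned


db: Dict[str, str] = {}
fp = "0123456789ABCDEF0123456789ABCDEF01234567"  # made-up fingerprint

print(resolve(db, fp, "moat"))   # operator requests moat
print(resolve(db, fp, None))     # request dropped later: still moat
print(resolve(db, fp, "email"))  # operator changes the request: email
```

Under this ordering a bridge that stops specifying a distributor stays where it last asked to be, instead of falling back to its original bucket.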
with 3448 bridges in the scriptz dataset but not currently listed by onionoo.
This new set makes me think that:
(a) this bug is still super important, where bridges that change IP addresses or ports change distribution strategy
and also
(b) there sure are a lot of None in that list. Is there a master ticket for the recurring theme of "bridge doesn't get assigned to any distribution strategy"? That issue gets reported a lot, but it's unclear how much of a bug it really is, until we solve the first part (this ticket).