Brainstorm and analyze heuristics to guess that a bridge might be offline or blocked
In the upcoming "subscription model" plan (team#42), we envision several use cases. Here are the first three:
- (1) bridge moves to a new IP address
- (2) bridge goes offline
- (3) bridge gets blocked
Case 1 is the easiest: if the bridge has moved to a new IP address, we know because we have a newer bridge descriptor for it. So if a client comes asking for a replacement, we just give them a new bridge line based on that newer descriptor.
For case 2, we want to give users a deterministic replacement -- but only if the bridge is actually offline. So we need a scanning mechanism to discover and/or verify which bridges have gone offline, and it has to learn the answer quickly enough to be useful for subscription-model-style replacement.
For case 3, we also want to give users a deterministic replacement, but it has to come from the "dynamic bridge pool" subset, and we only want to offer a replacement if we believe the bridge is actually blocked. Case 3 is also fun because we don't want to test a given bridge from in-country until we hit a threshold of suspicion that it is blocked.
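To make the three cases concrete, here is a rough sketch of how they could fold into one decision function. Everything in it is an assumption for illustration: the `Bridge` record, the `Status` values, and the `pick_replacement` and `replacement_for` helpers are hypothetical names, not existing rdsys code. It also assumes that "deterministic" means the same failed bridge always maps to the same replacement as long as the candidate pool is unchanged, which it gets here from an HMAC over the old bridge's fingerprint.

```python
import hashlib
import hmac
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Sequence


class Status(Enum):
    OK = "ok"
    MOVED = "moved"      # case 1: a newer descriptor carries a new address
    OFFLINE = "offline"  # case 2: confirmed by scanning
    BLOCKED = "blocked"  # case 3: confirmed after the suspicion threshold


@dataclass(frozen=True)
class Bridge:
    fingerprint: str
    address: str
    dynamic: bool = False  # True if the bridge is in the "dynamic bridge pool"


def pick_replacement(key: bytes, old: Bridge,
                     pool: Sequence[Bridge]) -> Optional[Bridge]:
    """Map the old bridge to one pool member, independent of pool ordering."""
    candidates = sorted(
        (b for b in pool if b.fingerprint != old.fingerprint),
        key=lambda b: b.fingerprint,
    )
    if not candidates:
        return None
    digest = hmac.new(key, old.fingerprint.encode(), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(candidates)
    return candidates[index]


def replacement_for(key: bytes, old: Bridge, status: Status,
                    latest: Bridge, pool: Sequence[Bridge]) -> Optional[Bridge]:
    """Return the bridge to hand out, or None if no replacement is warranted."""
    if status is Status.MOVED:
        # Case 1: same bridge, bridge line rebuilt from the newest descriptor.
        return latest
    if status is Status.OFFLINE:
        # Case 2: any other bridge in the pool may serve as the replacement.
        return pick_replacement(key, old, pool)
    if status is Status.BLOCKED:
        # Case 3: only bridges from the dynamic subset are eligible.
        return pick_replacement(key, old, [b for b in pool if b.dynamic])
    return None
```

Note that the modulo-based choice is only stable while the candidate set stays the same; if the pool churns a lot, something like rendezvous hashing would keep more assignments fixed. That is a design choice this sketch leaves open.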
This umbrella ticket aims to collect ideas on (a) which information sources we can use to decide that a given bridge is worth testing now, and (b) which architectures for active scanning go well with these three use cases plus the information sources from (a).
Potential data sources:
- Usage metrics (rdsys#112)
- Client reports
  - Reported by tor process (tpo/core/arti#717)
  - Reported by Tor Browser
- Measurement probes
  - Indirect, external scanning (e.g., spooky scan, censored planet)
  - Scans from within the censored region (e.g., OONI, our own censorship probe)
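As a starting point for discussion, the cheaper sources above could be folded into a single suspicion score that gates the in-country test. The sketch below is purely illustrative: the signal names, the weights, and the threshold are made-up placeholders rather than measured values, and a weighted sum is just one simple option among many. In-country scans are deliberately left out of the inputs, since they are the action the threshold is meant to gate.

```python
from dataclasses import dataclass

# Hypothetical weights and threshold; real values would need tuning on data.
WEIGHTS = {
    "usage_drop": 0.4,     # usage metrics: relative drop in users from a region
    "client_report": 0.3,  # reports from the tor process or Tor Browser
    "external_scan": 0.3,  # indirect scanning from outside the region
}
IN_COUNTRY_TEST_THRESHOLD = 0.6


@dataclass
class Signals:
    """Per-bridge, per-region evidence, each normalized to the range 0..1."""
    usage_drop: float = 0.0
    client_report: float = 0.0
    external_scan: float = 0.0


def _clamp(value: float) -> float:
    """Keep a signal inside 0..1 so one bad input cannot dominate the score."""
    return max(0.0, min(1.0, value))


def suspicion(signals: Signals) -> float:
    """Weighted sum of the clamped signals, between 0 and about 1."""
    return sum(
        weight * _clamp(getattr(signals, name))
        for name, weight in WEIGHTS.items()
    )


def should_test_in_country(signals: Signals,
                           threshold: float = IN_COUNTRY_TEST_THRESHOLD) -> bool:
    """True once the combined evidence justifies an in-country probe."""
    return suspicion(signals) >= threshold
```

With these placeholder numbers, a sharp usage drop alone (score about 0.4) would not trigger a probe, while a usage drop combined with client reports (about 0.7) would.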
Consumers of this information:
- Subscription model for bridge distribution (team#42)
- Reputation-based bridge distribution (https://gitlab.torproject.org/tpo/anti-censorship/lox/lox-overview/-/issues/5)