tor-0.4.8 overestimates <OR> bridge users and underestimates PT bridge users
Look at the graphs below. As bridges have upgraded from tor-0.4.7 to tor-0.4.8, two things have happened simultaneously:
- The estimated number of bridge users using the <OR> protocol (no pluggable transport) has gone up.
- The estimated number of bridge users using any pluggable transport has gone down.
This is no coincidence.
A bug was introduced in tor-0.4.8 that apparently causes the <OR> counter in bridge-ip-transports to be incremented every time the counter for any transport is incremented. Bridges that formerly reported 0% <OR> and 100% transport T are now reporting around 50% <OR> and 50% transport T. Because per-transport directory requests are estimated according to the ratios in bridge-ip-transports, this makes it appear as if half the bridge's users are <OR> and half are transport T, which makes the total <OR> count go up and the total pluggable transport count go down.
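To make the effect concrete, here is a minimal sketch of proportional attribution. It is not the metrics code: the function name split_requests and the figure of 1000 directory requests are hypothetical, chosen only for illustration. The unique-IP counts are the ones reported by the crusty5 bridge before and after its upgrade.

```python
def split_requests(total_reqs, ip_counts):
    """Split total_reqs across transports in proportion to unique-IP counts.

    Hypothetical helper for illustration; not taken from the metrics code.
    """
    total_ips = sum(ip_counts.values())
    if total_ips == 0:
        return {name: 0.0 for name in ip_counts}
    return {name: total_reqs * n / total_ips for name, n in ip_counts.items()}


# Assumed total of 1000 directory requests; the counts are from the two
# crusty5 bridge-ip-transports lines (0.4.7.13 and 0.4.8.6 respectively).
before = split_requests(1000, {"<OR>": 8, "snowflake": 14912})
after = split_requests(1000, {"<OR>": 1632, "snowflake": 1592})

for label, split in (("before", before), ("after", after)):
    print(label, {name: round(reqs, 1) for name, reqs in split.items()})
```

With the pre-upgrade counts almost every request is attributed to snowflake; with the post-upgrade counts roughly half of them move to <OR>, even though the bridge's ORPort is not exposed.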
I believe the bug is in tor, not in metrics code, but I'm opening an issue here because I don't know the cause of the bug in tor yet, and because metrics may have to be aware of the erroneous descriptors that have already been published. @trinity-1686a suspects commit tpo/core/tor@3e18507d, which would make the first affected release tor-0.4.8.4 on 2023-08-23.
I noticed this after tor upgrades on the Snowflake bridges. Here are some excerpts from the thread on tor-dev.
https://lists.torproject.org/pipermail/tor-dev/2023-October/014855.html
I upgraded one of the two Snowflake bridges from 0.4.7.13 to 0.4.8.6 on 2023-09-24. Since then, the number of <OR> IP addresses has been roughly equal to the number of snowflake IP addresses. The ORPort is still not exposed; these are not external vanilla bridge users. Did something change between these versions that might cause PT connections to be double-counted, once for the transport and once for <OR>?
Here are excerpted bridge-extra-info descriptors from before and after the version upgrade. Note the bridge-ip-transports lines.

    @type bridge-extra-info 1.3
    extra-info crusty5 91DA221A149007D0FD9E5515F5786C3DD07E4BB0
    master-key-ed25519 1y9CAtinlbrhDuYBSOBNiCU9Ck1lcY7LErxnzhtxVks
    published 2023-09-19 10:46:08
    transport snowflake
    bridge-ip-versions v4=13528,v6=1384
    bridge-ip-transports <OR>=8,snowflake=14912

    @type bridge-extra-info 1.3
    extra-info crusty5 91DA221A149007D0FD9E5515F5786C3DD07E4BB0
    master-key-ed25519 1y9CAtinlbrhDuYBSOBNiCU9Ck1lcY7LErxnzhtxVks
    published 2023-09-29 17:33:20
    transport snowflake
    bridge-ip-versions v4=2880,v6=336
    bridge-ip-transports <OR>=1632,snowflake=1592
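A quick way to check other bridges for the same symptom is to compute the <OR> share from each bridge-ip-transports line. The sketch below assumes the line has the "name=count" comma-separated form seen in these descriptors; parse_ip_transports and or_share are hypothetical helper names, not part of any Tor tool.

```python
def parse_ip_transports(line):
    """Parse a 'bridge-ip-transports' line into a {transport: count} dict."""
    keyword, _, value = line.strip().partition(" ")
    if keyword != "bridge-ip-transports":
        raise ValueError("not a bridge-ip-transports line: %r" % line)
    counts = {}
    for item in value.split(","):
        if not item:
            continue  # tolerate an empty value
        name, _, count = item.partition("=")
        counts[name] = int(count)
    return counts


def or_share(counts):
    """Fraction of reported unique IPs attributed to <OR>."""
    total = sum(counts.values())
    return counts.get("<OR>", 0) / total if total else 0.0


old = parse_ip_transports("bridge-ip-transports <OR>=8,snowflake=14912")
new = parse_ip_transports("bridge-ip-transports <OR>=1632,snowflake=1592")
print("0.4.7.13: %.4f" % or_share(old))
print("0.4.8.6:  %.4f" % or_share(new))
```

For a bridge whose ORPort is not exposed, an <OR> share near 0.5 rather than near 0 is consistent with the double counting described here.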
https://lists.torproject.org/pipermail/tor-dev/2023-October/014858.html
The effect of this apparent bug is that the lower bound of per-country per-transport user count intervals goes to zero. See the first attached graph. The snowflake-01 bridge upgraded to 0.4.8 on 2023-10-03; snowflake-02 upgraded on 2023-09-24. Whereas formerly, the low–high intervals were so narrow as to be indistinguishable from a line, now they extend all the way down to the x-axis.
The formula for the lower bound is:
low = max(0, country_reqs + transport_reqs − total_reqs)
On a bridge that hides its ORPort and runs just one transport, we have transport_reqs ≈ total_reqs, and so the above just becomes
low = max(0, country_reqs)
But with the apparent metrics bug in 0.4.8.6, total_reqs is twice as large as it should be (i.e., 2 × transport_reqs ≈ total_reqs), which means the formula becomes
low = max(0, country_reqs − transport_reqs)
which, since country_reqs < transport_reqs, always becomes zero.
The upper bound of the interval is unaffected; its formula is
high = min(country_reqs, transport_reqs)
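The two formulas can be checked numerically. The sketch below uses made-up request counts (300 for the country, 1000 for the transport) purely for illustration; user_interval is a hypothetical name, and only the two formulas themselves come from the text above.

```python
def user_interval(country_reqs, transport_reqs, total_reqs):
    """Return (low, high) bounds on per-country per-transport requests."""
    low = max(0, country_reqs + transport_reqs - total_reqs)
    high = min(country_reqs, transport_reqs)
    return low, high


# Illustrative numbers only: one country accounts for 300 of the bridge's
# 1000 transport requests.
country_reqs, transport_reqs = 300, 1000

# Expected case: one transport, hidden ORPort, so total_reqs == transport_reqs.
healthy = user_interval(country_reqs, transport_reqs, total_reqs=1000)

# Affected case: total_reqs is about twice transport_reqs.
affected = user_interval(country_reqs, transport_reqs, total_reqs=2000)

print("healthy: ", healthy)
print("affected:", affected)
```

In the healthy case the interval collapses to a single value; in the affected case the upper bound is unchanged but the lower bound drops to zero.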
The other side effect is that directory requests that ought to be attributed entirely to a pluggable transport are instead being ascribed 50/50 to that transport and the plain OR protocol. "We approximate [directory requests by transport] by multiplying the total number of requests with the fraction of unique IP addresses by transport." See, in the second attached graph, how estimated pluggable transport users have declined and OR protocol users have increased, in an almost inverse relationship, roughly in step with the 0.4.7→0.4.8 upgrade.
https://metrics.torproject.org/userstats-bridge-transport.html?start=2023-07-11&end=2023-10-09&transport=%21%3COR%3E&transport=%3COR%3E
https://metrics.torproject.org/versions.html?start=2023-07-11&end=2023-10-09