Tor Proposal to allow relays to signify netflow logging; client and consensus handling of same

Netflow logging is likely done by the upstream of any relay that is paying for bandwidth by the gigabyte, or that has DoS protection or other security analysis performed on it. Netflow logging is likely not high enough resolution for traffic analysis of things like web browsing activity, especially with Tor's netflow padding (https://gitlab.torproject.org/tpo/core/torspec/-/blob/master/padding-spec.txt#L48).

However, per-connection netflow logging is certainly enough to confirm the IP address of an onion service, by doing things like flooding a service with a recognizable traffic pattern, and then obtaining and examining these netflow logs from potential Guard relays or bridges.

If a relay operator is aware of per-connection netflow logging that they cannot control and cannot get their ISP to turn off, the relay could say so in its descriptor. Since such logging is much more dangerous to users of Guard relays and bridges, we could use this to avoid giving such relays the Guard flag.

Perhaps less drastically, we could simply make it so that only onion services avoid using such relays as Guards or bridges. But that still leaves the IP address logging problem for regular clients.
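The two options above (all clients avoid such relays, or only onion services do) can be sketched as a guard-candidate filter. This is only an illustration: the `reports_netflow_logging` field, the `Relay` class, and the `strict` switch are hypothetical names, and no such descriptor field exists in Tor today.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relay:
    nickname: str
    is_guard: bool
    # Hypothetical: True if the relay's descriptor declares upstream
    # per-connection netflow logging that the operator cannot disable.
    reports_netflow_logging: bool = False


def guard_candidates(relays, is_onion_service, strict=False):
    """Return the relays usable as guards under the sketched policy.

    strict=True  -> every client avoids relays that report netflow logging.
    strict=False -> only onion services avoid them (the less drastic option).
    """
    avoid = strict or is_onion_service
    return [
        r for r in relays
        if r.is_guard and not (avoid and r.reports_netflow_logging)
    ]


relays = [
    Relay("alpha", is_guard=True),
    Relay("bravo", is_guard=True, reports_netflow_logging=True),
    Relay("charlie", is_guard=False),
]
onion_choice = [r.nickname for r in guard_candidates(relays, is_onion_service=True)]
client_choice = [r.nickname for r in guard_candidates(relays, is_onion_service=False)]
```

With `strict=False`, a regular client still considers "bravo", which is exactly the remaining IP address logging problem for regular clients.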

Consider that the IP addresses of Tor users could be used to target advertising for VPNs or other products that privacy-sensitive users may be interested in. Independent of the utility of such advertising, this makes these Tor user IP addresses valuable, and thus more likely to be sold in areas that do not have strong PII rules about logging and selling IP addresses. We could use jurisdiction information on PII: where a jurisdiction has no controls on IP addresses as PII, such relays located there could be denied the Guard flag and rejected as bridges.

Additionally, it is possible that netflow logging information is already being bought by the US government for surveillance purposes: https://www.vice.com/en/article/88ng8x/pentagon-americans-surveillance-without-warrant-internet-browsing (the answer to that question is classified).

We may also be able to gather information on ISPs that perform netflow logging, and where this is done in jurisdictions where this information is not protected, relays at those ISPs could be denied the Guard flag. With this kind of solution, we would not require all relay operators to be aware of netflow policy themselves, but it would be more work for us.
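A rough sketch of how such a decision could combine public ISP information with jurisdiction data follows. Everything in it is an assumption for illustration: the two lookup tables, the function name `guard_flag_allowed`, and the choice to treat an unknown ISP as not logging. The AS numbers come from the documentation range and the country codes are user-assigned placeholders, so none of the data describes a real ISP or country.

```python
from typing import Optional

# Hypothetical lookup tables; the entries are placeholders, not findings.
ISP_LOGS_NETFLOW = {
    "AS64500": True,   # documentation-range AS number
    "AS64501": False,
}
COUNTRY_PROTECTS_IP_AS_PII = {
    "xa": True,        # user-assigned placeholder country code
    "xb": False,
}


def guard_flag_allowed(asn: str, country: str,
                       self_reported: Optional[bool] = None) -> bool:
    """Decide Guard eligibility under the sketched policy.

    A relay's own report (if any) overrides the public ISP table.
    Assumption: an ISP with no known logging status counts as not logging.
    """
    if self_reported is not None:
        logs = self_reported
    else:
        logs = ISP_LOGS_NETFLOW.get(asn, False)
    if not logs:
        return True
    # Logging is present: allow only where IP addresses are protected as PII.
    # Assumption: a country missing from the table counts as unprotected.
    return COUNTRY_PROTECTS_IP_AS_PII.get(country, False)
```

For example, `guard_flag_allowed("AS64500", "xb")` is `False` because the ISP logs and the jurisdiction offers no protection, while the same ISP in `"xa"` would keep the flag. Whether unknown ISPs should default to "not logging" is itself a policy question for the proposal.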

This proposal likely requires coordination between network-health, netteam, and circumvention/anti-censorship.

Tentative action item list to move this forward:

Ask relay operators to ask their ISPs about netflow connection logging
Build list like https://community.torproject.org/relay/community-resources/good-bad-isps/ to document netflow logging of popular ISPs in use by Guard relays
Map GeoIP country to PII IP address policies
Determine if Guard flag assignment can be performed based on public info, or if we should have relays report it in their descriptors
Encourage research into flooding attacks wrt netflow logging, vs conflux and defenses in https://github.com/mikeperry-tor/vanguards/blob/master/README_TECHNICAL.md#the-bandguards-subsystem
Write Tor Proposal with best course of action, given the above

Edited May 13, 2021 by Mike Perry