Check Maxmind GeoIPLocation Database before distributing
Currently we consume the GeoIPLocation Database of MaxMind (a company registered in the U.S.) in Tor. Not only does this go against the principles of modern privacy, which advocate non-reliance on any single organisation/product, it also comes with some serious threats. A powerful adversary can impose its control over MaxMind's database. This can be used to attack Tor in a variety of ways:
The Tor Network is constantly monitored for any suspicious spike in nodes, as it may be an indication of an oncoming/ongoing Sybil attack. A powerful adversary can coerce MaxMind to map some specific IP address blocks to random countries. The new nodes then appear to be spread across many unrelated locations, so the people/scripts monitoring the network may not find the event suspicious, and the adversary stays under the radar.
A large percentage of people don't want the exit of their circuits to be located in certain countries where communication is under surveillance. The powerful adversary knows this as well. Users generally add a line to their config that prevents circuits from being built through nodes in those locations. To overcome this, the adversary can coerce MaxMind to alter its database to map some particular IPs to locations which the user thinks are havens of free speech.
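The second attack can be illustrated with a small sketch. It assumes a GeoIP database reduced to a plain IP-to-country mapping (real databases map address ranges) and uses placeholder country codes "aa" and "bb"; the function name allowed_exits is hypothetical and is not how Tor implements node exclusion.

```python
def allowed_exits(exit_ips, geoip_db, excluded_countries):
    """Keep only the exits whose country, according to geoip_db, is not excluded."""
    return [ip for ip in exit_ips if geoip_db.get(ip) not in excluded_countries]


# Hypothetical data: the first exit really is in excluded country "aa".
exits = ["198.51.100.7", "203.0.113.9"]
honest_db = {"198.51.100.7": "aa", "203.0.113.9": "bb"}

# The adversary has the database relabel that exit as country "bb".
tampered_db = dict(honest_db)
tampered_db["198.51.100.7"] = "bb"

excluded = {"aa"}
honest_result = allowed_exits(exits, honest_db, excluded)
tampered_result = allowed_exits(exits, tampered_db, excluded)
```

With the honest database only the second exit passes the filter; with the tampered one both pass, so the user's exclusion line no longer protects them.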
I propose a system where, instead of directly distributing MaxMind's DB to the users, we first check it for any anomalies.
This is how it works:
1. The Dir Authorities fetch GeoIPLocation DBs from several companies (including MaxMind), each located in a different country.
2. Each Tor node's location (from MaxMind) is checked against the other DBs as well. The location which appears in a majority of the DBs is considered authentic.
3. All the Dir Authorities perform the above two steps periodically and independently of each other, and try to reach a consensus.
4. The MaxMind DB is then distributed to the users along with any modifications from step 2.
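The majority check in the second step could look roughly like the sketch below. It is only an illustration under stated assumptions: each DB is simplified to a dict from IP address to country code (real GeoIP DBs map ranges), "majority" is taken to mean a strict majority of all DBs consulted, and the names majority_country and build_checked_db are hypothetical.

```python
from collections import Counter


def majority_country(votes):
    """Return the country reported by a strict majority of DBs, else None.

    votes maps a DB name to the country code it reports (None if no entry).
    """
    answers = [cc for cc in votes.values() if cc is not None]
    if not answers:
        return None
    country, count = Counter(answers).most_common(1)[0]
    # Require more than half of all consulted DBs, including silent ones.
    return country if count * 2 > len(votes) else None


def build_checked_db(maxmind, others, relay_ips):
    """Return (checked_db, unresolved_ips) for the given relay addresses."""
    checked = dict(maxmind)  # start from MaxMind's data
    unresolved = []
    for ip in relay_ips:
        votes = {"maxmind": maxmind.get(ip)}
        for name, db in others.items():
            votes[name] = db.get(ip)
        winner = majority_country(votes)
        if winner is None:
            unresolved.append(ip)  # no majority: needs a policy decision
        elif winner != maxmind.get(ip):
            checked[ip] = winner  # override MaxMind with the majority view
    return checked, unresolved


# Hypothetical example data with made-up DB names.
maxmind = {"192.0.2.1": "nl", "192.0.2.2": "de", "192.0.2.3": "us"}
others = {
    "db_a": {"192.0.2.1": "ru", "192.0.2.2": "de", "192.0.2.3": "fr"},
    "db_b": {"192.0.2.1": "ru", "192.0.2.2": "de", "192.0.2.3": "jp"},
}
checked, unresolved = build_checked_db(maxmind, others, list(maxmind))
```

Here the first address is corrected to the majority answer, the second is left alone, and the third has no majority, so it is reported back for the handling discussed in the next section.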
What if locations differ in all/most of the DBs?
A case might arise where the locations for an IP differ in all/most of the DBs, because these locations are just guesses and hence can be erroneous. However, IMO:
Most of the nodes are run from large datacentres, which almost always have the right geolocation mapped to their IP address range.
Even if the nodes are run from home on a static IP, the whois records are usually well kept, which helps companies such as MaxMind fetch data for their DBs.
So, false positives would be very few. Even if there are some, we can ban the IP address from participating in the network until the issue is resolved. Or we can be a little liberal and allow it to participate, provided there has been no recent spike in the number of nodes.
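The two options for disputed addresses can be sketched as a small policy function. The spike test below (latest count more than 1.5 times the mean of the earlier counts) is an arbitrary assumption made for illustration, as are the names recent_spike and decide; the proposal itself does not define what counts as a spike.

```python
from statistics import mean


def recent_spike(node_counts, factor=1.5):
    """Return True if the latest node count is far above the earlier average.

    node_counts is a chronological list of network sizes; factor is a
    hypothetical threshold, not a value taken from Tor.
    """
    if len(node_counts) < 2:
        return False
    *history, latest = node_counts
    return latest > factor * mean(history)


def decide(unresolved_ips, node_counts, strict=False):
    """Map each disputed IP to 'exclude' or 'allow'.

    strict=True always excludes (the first option); otherwise disputed IPs
    are allowed unless a recent spike is detected (the liberal option).
    """
    action = "exclude" if strict or recent_spike(node_counts) else "allow"
    return {ip: action for ip in unresolved_ips}


calm = decide(["192.0.2.3"], [7000, 7050, 7020])
spiking = decide(["192.0.2.3"], [7000, 7050, 12000])
always_strict = decide(["192.0.2.3"], [7000, 7050, 7020], strict=True)
```

In the calm case the disputed address is allowed; when the node count jumps, or when the strict policy is chosen, it is excluded until the disagreement is resolved.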
What about DB licenses?
Only the Dir Auths have to pay to get the additional DBs; the MaxMind DB is freely available. The DB that we distribute to the users would just be MaxMind's (with some possible modifications).