Remove data structure containing unique IP address sets
Relays keep a data structure of unique connecting IP addresses for statistics and for informational purposes.
We should consider removing that data structure. There's a privacy risk in gathering unique IP address sets in memory and in reporting aggregate statistics based on them. If we don't need these statistics, we should stop reporting them and stop gathering the underlying data.
The main (and only?) data structure containing unique IP address sets is in
src/or/geoip.c. If we remove that data structure, we would also have to remove:
1. dirreq-v3-ips line from extra-info descriptors,
2. all "bridge statistics" including bridge-ip-transports lines from extra-info descriptors,
3. all "entry node statistics" including entry-ips from extra-info descriptors,
4. the log line "Heartbeat: In the last %d hours, I have seen %d unique clients.", and
1 and 3 are not used. 2 is used by Metrics to estimate the number of daily bridge users, and we'd need to implement legacy/trac#8786 before removing bridge statistics. atagar thinks that 4 was added by Sebastian a few years back, so that relay operators with certain simple use cases don't need to open a control port and run something like arm. 5 is used by arm for one of its dialogs, and atagar thinks it's not the end of the world to lose that.
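
To make the dependency concrete, here is a minimal sketch of why a "unique clients" count requires holding the addresses themselves in memory. This is not the code in src/or/geoip.c. Every name below (seen_clients_t, seen_clients_note, format_heartbeat, SEEN_CAPACITY) is hypothetical, and the fixed-size IPv4-only array with a linear scan is a simplification chosen for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical illustration only; none of these names come from Tor. */
#define SEEN_CAPACITY 1024

typedef struct {
  uint32_t addrs[SEEN_CAPACITY]; /* raw client addresses kept in memory */
  size_t n;                      /* number of distinct addresses stored */
} seen_clients_t;

/* Forget all recorded addresses. */
void seen_clients_clear(seen_clients_t *set)
{
  assert(set != NULL);
  memset(set, 0, sizeof(*set));
}

/* Record addr if it has not been seen before.
 * Returns 1 if it was newly added, 0 if it was already present
 * or the set is full. */
int seen_clients_note(seen_clients_t *set, uint32_t addr)
{
  assert(set != NULL);
  for (size_t i = 0; i < set->n; i++) {
    if (set->addrs[i] == addr)
      return 0; /* duplicate: deduplication needs the stored address */
  }
  if (set->n == SEEN_CAPACITY)
    return 0;
  set->addrs[set->n++] = addr;
  return 1;
}

/* Number of distinct addresses currently stored. */
size_t seen_clients_count(const seen_clients_t *set)
{
  assert(set != NULL);
  return set->n;
}

/* Write a heartbeat-style message derived from the set into buf.
 * Returns the value of snprintf(). */
int format_heartbeat(char *buf, size_t buflen, int hours,
                     const seen_clients_t *set)
{
  return snprintf(buf, buflen,
                  "Heartbeat: In the last %d hours, I have seen %d "
                  "unique clients.",
                  hours, (int)seen_clients_count(set));
}
```

The point of the sketch is that the duplicate check in seen_clients_note() only works because every previously seen address is still stored. Any statistic or log line built on the count inherits that requirement, so removing the set means removing those outputs too.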