  • #15469
Closed
Open
Issue created Mar 26, 2015 by Karsten Loesing (@karsten)

Remove data structure containing unique IP address sets

Relays keep a data structure of unique connecting IP addresses for statistics and for informational purposes.

We should consider removing that data structure. There's a privacy risk in gathering unique IP address sets in memory and in reporting aggregate statistics based on them. If we don't need these statistics, we should stop reporting them and stop gathering the underlying data.
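
To make the concern concrete, here is a minimal sketch in C of what a unique-address set implies. It is not the actual clientmap code from src/or/geoip.c; the type `ip_set_t` and the functions `ip_set_init`, `ip_set_note`, `ip_set_count`, and `ip_set_clear` are hypothetical names chosen for illustration, and it handles IPv4 only. The point is that reporting even a single aggregate number ("N unique clients") requires keeping every raw address in memory until the set is cleared.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a per-relay "unique client" set.
 * This is NOT Tor's clientmap; names and layout are illustrative only. */
typedef struct {
    uint32_t *addrs;  /* raw IPv4 addresses held in memory */
    size_t len;       /* number of distinct addresses stored */
    size_t cap;       /* allocated capacity, in elements */
} ip_set_t;

static void ip_set_init(ip_set_t *s)
{
    s->addrs = NULL;
    s->len = 0;
    s->cap = 0;
}

/* Record a connecting address.
 * Returns 1 if it was new, 0 if already present, -1 on allocation failure.
 * A linear scan keeps the sketch short; a real implementation would hash. */
static int ip_set_note(ip_set_t *s, uint32_t addr)
{
    for (size_t i = 0; i < s->len; i++) {
        if (s->addrs[i] == addr)
            return 0;
    }
    if (s->len == s->cap) {
        size_t ncap = s->cap ? s->cap * 2 : 8;
        uint32_t *p = realloc(s->addrs, ncap * sizeof *p);
        if (!p)
            return -1;
        s->addrs = p;
        s->cap = ncap;
    }
    s->addrs[s->len++] = addr;
    return 1;
}

/* The aggregate that would be reported: the number of unique addresses. */
static size_t ip_set_count(const ip_set_t *s)
{
    return s->len;
}

/* Drop all stored addresses. The memset is a best-effort wipe only;
 * a compiler is allowed to optimize it away before free(). */
static void ip_set_clear(ip_set_t *s)
{
    if (s->addrs)
        memset(s->addrs, 0, s->cap * sizeof *s->addrs);
    free(s->addrs);
    ip_set_init(s);
}
```

In this sketch, calling `ip_set_note` twice with the same address leaves `ip_set_count` unchanged, which is exactly the deduplication that forces the raw addresses to be retained. Removing the statistics would remove the need for any such structure.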

The main (and only?) data structure containing unique IP address sets is clientmap in src/or/geoip.c. If we remove that data structure, we would also have to remove:

  1. the dirreq-v3-ips line from extra-info descriptors,
  2. all "bridge statistics" including bridge-stats-end, bridge-ips, bridge-ip-versions, and bridge-ip-transports lines from extra-info descriptors,
  3. all "entry node statistics" including entry-stats-end and entry-ips from extra-info descriptors,
  4. the log line "Heartbeat: In the last %d hours, I have seen %d unique clients.", and
  5. the CLIENTS_SEEN controller event.

Items 1 and 3 are not used. Item 2 is used by Metrics to estimate the number of daily bridge users, so we'd need to implement legacy/trac#8786 before removing bridge statistics. atagar thinks that item 4 was added by Sebastian a few years back so that relay operators with certain simple use cases don't need to open a control port and run something like arm. Item 5 is used by arm for one of its dialogs, and atagar thinks losing it is not the end of the world.

Thoughts?
