


Closed
Open
Opened Dec 30, 2012 by Damian Johnson (@atagar)

Investigate consensus-tracker's memory usage

The first script that I ported over to stem was the consensus-tracker script which provides the automated emails for the list by the same name...

https://gitweb.torproject.org/atagar/tor-utils.git/blob/HEAD:/consensusTracker.py
https://lists.torproject.org/cgi-bin/mailman/listinfo/consensus-tracker/

Moving this revealed some major issues with the memory usage of stem's ExitPolicy class. Those issues are fixed and the script then ran for several days without issue, but after that a new type of memory problem surfaced.

Each hour the consensus-tracker makes an instance of the Sampling class, storing up to 192 of them at a time. Individually these are fine, but as the script runs and approaches that threshold the memory starts to stack up.
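For illustration, the retention pattern described above can be sketched with a bounded deque. This is only a sketch under assumptions: the `Sampling` class here is a hypothetical stand-in (the real one holds the hour's consensus data), and `history` and `record` are made-up names, not taken from consensusTracker.py.

```python
from collections import deque

MAX_SAMPLES = 192  # retention limit mentioned in the ticket


class Sampling:
    """Hypothetical stand-in for the tracker's per-hour sample."""

    def __init__(self, hour):
        self.hour = hour


# A deque with maxlen discards the oldest entry once the limit is reached.
history = deque(maxlen=MAX_SAMPLES)


def record(sample):
    """Store a new hourly sample, evicting the oldest one if full."""
    history.append(sample)


# Simulate 200 hourly runs; only the most recent 192 samples are kept.
for hour in range(200):
    record(Sampling(hour))
```

A bound like this caps the number of samples, not the size of each one, so total memory only plateaus if the per-sample footprint is itself small and stable.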

After a week the consensus-tracker instance on my system was using 75% of the system's memory and started failing to fetch new consensus information (I'm not positive that the memory usage is related to the failures, but it seems likely).

So first question: why is stem using more memory than TorCtl? At a guess there are two issues...

  1. TorCtl likely provided version 2 router status entries while stem provides version 3. A big difference between those two is that version 3 includes the microdescriptor exit policy.

  2. TorCtl's ExitPolicyLine class is far lighter than our ExitPolicy. All it stores is the binary representation of the address, subnet mask, and port range (i.e., the bare minimum to have a working match() method). Ours, however, includes IPv6 support and some additional data.
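To make the second point concrete, here is a rough sketch of what a "bare minimum" rule could look like. It is not TorCtl's actual ExitPolicyLine nor stem's ExitPolicy; the class name `LightPolicyRule` and its constructor arguments are hypothetical, and it handles IPv4 only. The `__slots__` declaration is one way to avoid a per-instance `__dict__`, which matters when many thousands of rules are held in memory.

```python
import ipaddress


class LightPolicyRule:
    """Hypothetical minimal IPv4 rule: masked address plus a port range."""

    # __slots__ avoids allocating a per-instance __dict__.
    __slots__ = ("is_accept", "network", "mask", "min_port", "max_port")

    def __init__(self, is_accept, address, mask_bits, min_port, max_port):
        self.is_accept = is_accept
        # Integer netmask, e.g. 8 bits -> 0xFF000000.
        self.mask = (0xFFFFFFFF << (32 - mask_bits)) & 0xFFFFFFFF
        # Store the address as an integer with host bits cleared.
        self.network = int(ipaddress.IPv4Address(address)) & self.mask
        self.min_port = min_port
        self.max_port = max_port

    def match(self, address, port):
        """True if the destination falls inside this rule's subnet and ports."""
        addr = int(ipaddress.IPv4Address(address))
        in_subnet = (addr & self.mask) == self.network
        return in_subnet and self.min_port <= port <= self.max_port


rule = LightPolicyRule(False, "10.0.0.0", 8, 1, 65535)
```

With this layout, `rule.match("10.1.2.3", 80)` compares two integers and a port range, which is all a working match() needs.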

I've made a little hack in my consensus-tracker to drop the exit policy from the router status entries (... actually, the script doesn't use them so this should have zero impact). After a week or so of running this'll confirm or deny that the ExitPolicy is the issue.

If it is, then I'll likely make the microdescriptor policies lighter weight. They only need a subset of the information of a normal policy.
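As a sketch of what "a subset of the information" might mean: a microdescriptor-style policy summary is a single accept or reject verdict plus a port list (for instance `accept 80,443`), with no addresses or masks. The class below is a hypothetical illustration of storing just that, not stem's implementation; the names `MicroPolicySketch` and `can_exit_to` are assumptions made for this example.

```python
class MicroPolicySketch:
    """Hypothetical lightweight policy: one verdict and a tuple of port ranges."""

    __slots__ = ("is_accept", "ports")

    def __init__(self, line):
        # Expected form: "accept 80,443,8000-9000" or "reject 1-1024".
        verdict, _, port_list = line.partition(" ")
        if verdict not in ("accept", "reject"):
            raise ValueError("policy must start with accept or reject: %r" % line)

        self.is_accept = verdict == "accept"

        ranges = []
        for entry in port_list.split(","):
            low, _, high = entry.partition("-")
            # A single port such as "80" becomes the range (80, 80).
            ranges.append((int(low), int(high or low)))
        self.ports = tuple(ranges)

    def can_exit_to(self, port):
        """True if this summary permits exiting to the given port."""
        listed = any(low <= port <= high for low, high in self.ports)
        return listed if self.is_accept else not listed


web_only = MicroPolicySketch("accept 80,443,8000-9000")
no_low_ports = MicroPolicySketch("reject 1-1024")
```

Each instance holds one boolean and one small tuple, which is far less than a full policy with per-rule addresses, masks, and IPv6 handling.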

Reference: legacy/trac#7831