  • #12170

Closed (moved)
Created Jun 01, 2014 by Nick Mathewson (@nickm)

Investigate performance issues surrounding count_usable_descriptors()

According to gprof output generated by Andrea (#11322), her busy Tor node called count_usable_descriptors() 65368 times, mostly from router_have_minimum_dir_info(). This is expensive because each call iterates over all the nodes and does a lot of siphash / digestmap / tor_memeq work.

Why are we calling router_have_minimum_dir_info() so much? Almost entirely because of second_elapsed_callback().

But why is router_have_minimum_dir_info() invoking update_router_have_minimum_dir_info() so often? If I'm reading these numbers right, it's doing so once every 5 calls. That's not right; it's supposed to cache the result of update_router_have_minimum_dir_info() for a long time, until somebody calls router_dir_info_changed(). Who is doing that?
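For readers unfamiliar with the pattern, here is a minimal sketch of the caching behavior described above: a dirty flag that is set by the "changed" notification and cleared by the next query. All names here are hypothetical stand-ins, not Tor's actual definitions, and the recompute function is a placeholder for the expensive walk over every node.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not Tor's real variables. */
static bool sketch_need_update = true;         /* dirty flag */
static bool sketch_cached_result = false;      /* last computed answer */
static unsigned long sketch_recomputations = 0;

/* Placeholder for the expensive pass over all nodes. */
static bool sketch_recompute(void)
{
  ++sketch_recomputations;
  return true; /* pretend we have enough directory info */
}

/* Analogue of router_dir_info_changed(): only marks the cache stale. */
void sketch_info_changed(void)
{
  sketch_need_update = true;
}

/* Analogue of router_have_minimum_dir_info(): recomputes only when stale. */
bool sketch_have_minimum_info(void)
{
  if (sketch_need_update) {
    sketch_cached_result = sketch_recompute();
    sketch_need_update = false;
  }
  return sketch_cached_result;
}

/* How many times the expensive path actually ran. */
unsigned long sketch_recomputation_count(void)
{
  return sketch_recomputations;
}
```

With this shape, the cost of a query depends entirely on how often the invalidation function is called, which is why the callers of router_dir_info_changed() are the thing to look at.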

According to that profile, the top two callers are:

                0.00    0.00    4851/23430       router_add_to_routerlist [266]
                0.00    0.00   17823/23430       channel_do_open_actions <cycle 2> [144]

router_add_to_routerlist() calls should be clustered; they shouldn't cause most of the re-invocations of update_router_have_minimum_dir_info(). So let's look at channel_do_open_actions().

It's calling router_set_status(), which is calling router_dir_info_changed() unconditionally!

Two issues there:

  • I'm not sure that router_set_status() should be calling router_dir_info_changed() at all. Does changing our opinion about a node's is_running status count as a change in whether we know most of the nodes in the network? Should it?
  • It surely shouldn't be calling it unconditionally.
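One possible shape for the second point, as a hedged sketch only: invalidate the cache when the running status actually flips, and do nothing when it is set to the value it already has. The struct and function names below are hypothetical and much simpler than Tor's real node_t and router_set_status(); whether the invalidation belongs there at all is the open question in the first point.

```c
#include <stdbool.h>

/* Hypothetical, simplified node record; not Tor's node_t. */
typedef struct sketch_node {
  bool is_running;
} sketch_node;

static unsigned long sketch_invalidations = 0;

/* Stand-in for router_dir_info_changed(); just counts calls here. */
static void sketch_dir_info_changed(void)
{
  ++sketch_invalidations;
}

/* Set the node's running status. The cache is invalidated only when the
 * value really changes. Returns true if a change was made. */
bool sketch_set_status(sketch_node *node, bool up)
{
  if (node->is_running == up)
    return false;            /* nothing changed: leave the cache alone */
  node->is_running = up;
  sketch_dir_info_changed();
  return true;
}

/* How many invalidations have been triggered so far. */
unsigned long sketch_invalidation_count(void)
{
  return sketch_invalidations;
}
```

Under this assumption, repeated channel opens to a node already marked as running would no longer force a recomputation on the next query.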