The Tor Project / Core / Arti / Issues / #87
Issue created Mar 15, 2021 by Nick Mathewson (@nickm, Owner). 11 of 14 checklist items completed.

Profile, identify code bottlenecks, and optimize

I've been avoiding premature optimization in my code so far, but there are probably places where we can get a lot faster. We should identify them via profiling and fix them.

Some situations to experiment with are:

  • Bootstrapping a directory
  • Building a bunch of circuits
  • Running when offline (also see #311, #329 (closed))
  • Bootstrapping failure conditions (see #329 (closed))
    • Primary guard unreachable
  • Primary guards go down after bootstrap
  • Data transfer
  • Data transfer with a lot of circuits
  • Data transfer with a lot of streams
  • Huge number of socks connections/connection attempts at once

We won't know what follow-up work to do here until we've got some initial profiling information. We should make sure that the tests above are repeatable, so we can re-profile from time to time. The arti-bench crate would be a good place for them.
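
As a rough illustration of what "repeatable" could mean here, the sketch below runs one scenario several times and reports sorted wall-clock samples plus a median, so two profiling sessions can be compared. It is not how arti-bench works; `time_runs` and `median` are hypothetical helper names, and the scenario closure stands in for something like "bootstrap a directory". It measures elapsed time only, not CPU time.

```rust
use std::time::{Duration, Instant};

/// Run `scenario` `runs` times and return the wall-clock duration of
/// each run, sorted in ascending order.
/// (Hypothetical helper, not part of arti-bench.)
pub fn time_runs<F: FnMut()>(runs: usize, mut scenario: F) -> Vec<Duration> {
    let mut samples = Vec::with_capacity(runs);
    for _ in 0..runs {
        let start = Instant::now();
        scenario();
        samples.push(start.elapsed());
    }
    samples.sort();
    samples
}

/// Median of an ascending slice (the upper median for even lengths).
/// Returns `None` when there are no samples.
pub fn median(sorted: &[Duration]) -> Option<Duration> {
    sorted.get(sorted.len() / 2).copied()
}
```

A caller would pass the scenario as a closure, for example `let samples = time_runs(10, || run_scenario());` where `run_scenario` is whatever workload is being profiled, and then log `median(&samples)` alongside the commit being measured.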

We should make sure that we measure CPU and RAM usage; both are critical for mobile users. Probably the tokio-metrics crate would help too.
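
One low-tech way to get a RAM number from inside a benchmark binary is a counting global allocator. The sketch below wraps the system allocator and tracks live and peak heap bytes. This is an assumption about how one might instrument a benchmark, not something Arti is known to do; `CountingAlloc`, `current_bytes`, and `peak_bytes` are hypothetical names. It counts requested bytes only, so allocator overhead and non-heap memory are not included.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Bytes currently allocated through the global allocator.
static CURRENT: AtomicUsize = AtomicUsize::new(0);
/// Highest value `CURRENT` has reached so far.
static PEAK: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical wrapper: forwards to the system allocator and counts bytes.
pub struct CountingAlloc;

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            // fetch_add returns the previous value, so add the size back.
            let now = CURRENT.fetch_add(layout.size(), Ordering::Relaxed) + layout.size();
            PEAK.fetch_max(now, Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        CURRENT.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

// Route every heap allocation in this binary through the counter.
#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Heap bytes live right now.
pub fn current_bytes() -> usize {
    CURRENT.load(Ordering::Relaxed)
}

/// Largest number of live heap bytes observed so far.
pub fn peak_bytes() -> usize {
    PEAK.load(Ordering::Relaxed)
}
```

Reading `peak_bytes()` at the end of a scenario gives a single figure to track over time; the valgrind and pprof tools listed in this issue remain the better choice for finding out where the memory went.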

Tools to use include:

  • CPU
    • cargo flamegraph
  • Memory
    • google-pprof --inuse_space
    • valgrind --tool=dhat
    • valgrind --tool=massif and massif-visualizer
    • https://www.brendangregg.com/FlameGraphs/memoryflamegraphs.html

Sub-issues:

  • #377 (closed) hex decoding shows up a lot, and allocates a lot
  • #383 (closed) Error::Internal allocates a backtrace and is called unconditionally in tor-circmgr.
  • #384 (closed) Intern relay families to save memory.
  • #385 (closed) Intern protover entries to save memory.
  • #386 (closed) Needless slack space in hashmaps
  • #387 (closed) Make GenericRouterstatus smaller.
  • #388 (closed) Call shrink_to_fit on missing microdescs hashmap
  • #389 Use less intermediate RAM to load microdescriptors from sqlite
  • #390 Stream directory responses to save memory and latency
  • #391 (closed) arti-bench should allocate less for receive buffers.
  • #392 (closed) investigate whether we can find faster aes-ctr and/or sha1 implementations.
  • #393 Don't re-verify so much cryptography on startup (maybe)
  • #441 (closed) Use sha1/asm?
  • #442 (closed) Use openssl sha1 and aes?
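
For readers unfamiliar with the "interning" mentioned in #384 and #385, here is a minimal sketch of the general technique: equal values are stored once and shared through a reference-counted pointer, so many relays carrying the same string cost one allocation instead of many. This is not Arti's implementation; the `Interner` type and its methods are hypothetical, and plain strings stand in for families or protover entries.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Hypothetical string interner: each distinct string is stored once.
#[derive(Default)]
pub struct Interner {
    seen: HashSet<Arc<str>>,
}

impl Interner {
    /// Return a shared handle for `s`, allocating only the first time
    /// this exact string is seen.
    pub fn intern(&mut self, s: &str) -> Arc<str> {
        if let Some(existing) = self.seen.get(s) {
            // Already interned: hand out another pointer to the same data.
            return Arc::clone(existing);
        }
        let fresh: Arc<str> = Arc::from(s);
        self.seen.insert(Arc::clone(&fresh));
        fresh
    }

    /// Number of distinct strings stored.
    pub fn unique_count(&self) -> usize {
        self.seen.len()
    }
}
```

Two calls to `intern` with equal input return handles for which `Arc::ptr_eq` is true, which is what makes the memory saving visible. Note that a set like this never drops entries on its own, so a long-running process would need some way to prune values that are no longer referenced.
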
Edited Aug 03, 2022 by Nick Mathewson