Skip to content

GitLab

  • Projects
  • Groups
  • Snippets
  • Help
    • Loading...
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
    • Contribute to GitLab
  • Sign in
Trac
Trac
  • Project overview
    • Project overview
    • Details
    • Activity
  • Issues 246
    • Issues 246
    • List
    • Boards
    • Labels
    • Service Desk
    • Milestones
  • Operations
    • Operations
    • Metrics
    • Incidents
  • Analytics
    • Analytics
    • Value Stream
  • Wiki
    • Wiki
  • Members
    • Members
  • Collapse sidebar
  • Activity
  • Create a new issue
  • Issue Boards

GitLab is used only for code review, issue tracking and project management. Canonical locations for source code are still https://gitweb.torproject.org/ https://git.torproject.org/ and git-rw.torproject.org.

  • Legacy
  • TracTrac
  • Issues
  • #30291

Closed (moved)
Open
Opened Apr 25, 2019 by David Goulet@dgoulet🐋

Optimize our path selection code

(This is a more complete ticket than #13739 (moved) and thus supersedes it.)

An onion service can be ask to open many rendezvous circuit by a client. With a vanilla Tor, the resource consumption there is roughly 1:2 (service:client) because the client needs to open two circuits, one to the RP and one to the IP and then sends one cell on that circuit to introduce. The service will do a bit less than that because its introduction circuit is already established so it simply needs to open a rendezvous circuit.

However, an attacker (DoS) can make a client skip the RP circuit creation, pin relays in a path to the introduction point and simply open an introduction circuit each time through that path (which in theory should be made very fast and also controlled by the attacker).

This means that the work ratio goes from 1:2 to 1:1 because the client only opens one circuit, send a single cell, close it and repeat.

But, in reality, it gets worst for the service because of the performance of our path selection code. It is heavy. For starter, we need to randomly select 3 nodes from the consensus for each rendezvous circuit. But to select those nodes, we go over all of them for each node to exclude some, include others, go through a series of tests such has tor_addr_family() and node_has_preferred_descriptor() that are quite heavy.

The CPU profile of an overloaded service shows that the majority of CPU is spent in selecting these nodes. Here is a perf report output for the top 5:

+   11.10%  tor      tor                    [.] smartlist_remove                                                                                         
+   10.96%  tor      tor                    [.] fmonty                                                                                                   
+    8.77%  tor      tor                    [.] tor_memeq                                                                                                
+    6.56%  tor      tor                    [.] tor_addr_family                                                                                          
+    5.40%  tor      tor                    [.] tor_addr_is_null             

The top smartlist_remove() comes from:

   - smartlist_remove                                                                                                                                    
      - 11.10% smartlist_subtract                                                                                                                        
           router_choose_random_node                                                                                                                     
           choose_good_middle_server                                                                                                                     
           onion_extend_cpath                                                                                                                            
           onion_populate_cpath                                                                                                                          
           circuit_establish_circuit                                                                                                                     
         - circuit_launch_by_extend_info                                                                                                                 
            - 10.93% launch_rendezvous_point_circuit                                                                                                     

Amazingly enough, we spend more time selecting path than doing crypto on a service that gets DDoS by introduction requests (#29607 (moved)).

There aren't many straight forward approach here. Optimizing the code path bits by bits could be an approach. Most likely, our first step is to have a benchmark for our current path selection code and try to improve on it incrementally.

There are also more hackish approach like pre-building a list of RP nodes at each consensus (or when torrc options are changed like ExcludeNodes) and then randomly picking nodes from there would go faster because they wouldn't need to go through long iterations of tests, it would be already done.

One example is router_choose_random_node() that goes over the entire nodes list just to pick nodes to exclude, then again to test each nodes router_add_running_nodes_to_smartlist() and then potentially does a series of smartlist_subtract() that iterate over the entire list again. This is in theory O(n) + O(n) + O(n) actually theoretically being O(n) but reality is far from the theory here in CPU consumption ;).

To upload designs, you'll need to enable LFS and have admin enable hashed storage. More information
Assignee
Assign to
Tor: unspecified
Milestone
Tor: unspecified
Assign milestone
Time tracking
None
Due date
None
Reference: legacy/trac#30291