Stream starvation due to activation order in circuit_resume_edge_reading_helper()
On or-dev on 22 Nov 2010, Ian Goldberg wrote:
=== A number of groups (the UW people, the UCSD people, Roger(?) and possibly others) have noticed that when many active streams (more than about 4) are opened on a single circuit, 3 or 4 of them get service and the rest starve until those are finished. Note that, for example, most web browsers will open multiple streams at once to fetch parts of web pages.
Here's a plot of 9 active streams on one circuit:
https://thunk.cs.uwaterloo.ca/mashael/broken.png
(This graph was from measurements on a private Tor network on PlanetLab. This was from 0.2.3.0-alpha-dev, but the same behaviour was observed previously on 0.2.2.*.)
Here's Mashael's writeup about the problem.
Our network consists of five nodes. An authoritative directory, a client (also running as Onion Proxy), 3 Tor Onion Routers and a server. We refer to the three onion routers as Entry, Middle and Exit. Our client constructs a circuit through the network, where the first hop is Entry, the second is Middle and the third is Exit. The constructed circuit is used to create multiple streams to the server. Each stream basically carries a request from the client to the server, which replies with 10,000 cells. We repeated this experiment for 2, 5, 9 and 13 streams. The common result is that the last three client streams created "almost" starved the other earlier streams until they fully completed transferring their 10,000 cells.
This problem only occurs when we have a bottleneck OR in the path. We continue to see the problem as long as the bandwidth offered by the bottleneck router is less than approximately 80% of the bandwidth that is offered by the other routers in the circuit. For example, if the bandwidthrate of Entry and Exit is 630 KB/s and 720 KB/s, respectively, the problem is still visible even when the bandwidthrate of Middle is up to 550 KB/s. We verified that by setting the bandwidthrate command for the Middle to a minimum value (20 KB/s) and then gradually increasing it until the starvation of streams disappears from the resulting graphs at approximately 600 KB/s.
The reason the "streams problem" occurs is due to the complicated interaction between Tor's congestion control and libevent. At some point during the experiment, the circuit window is exhausted, which blocks all edge streams. When a circuit level sendme is received at Exit, it resumes edge reading by looping over linked list of edge streams, and calling connection_start_reading() to inform libevent to resume reading. When the streams are activated again, Tor gets the chance to service the first three streams activated before the circuit window is exhausted again, which causes all streams to be blocked again. As an experiment, we reversed the order in which the streams are activated, and indeed the first three streams, rather than the last three, got service, while the others starved.
Our solution is to change the order in which streams are activated. We choose a random edge connection from the linked list, and then we activate streams starting from that chosen stream. When we reach the end of the list, then we continue from the head of the list until our chosen stream (treating the linked list as a circular linked list). It would probably be better to actually remember which streams have received service recently, but this way is simple and effective.
Here's the graph again, with the patched Exit:
https://thunk.cs.uwaterloo.ca/mashael/fixed.png