Bandwidth stats info leak upon close of circuits with queued cells
We received a Tor bug bounty report from jaym about a congestion attack variant that can be used to watermark a relay's bandwidth stats.
The bug relies on the fact that Tor increments the read bytes counter before adding the cell to the output buffer. If the circuit gets killed before the cell is relayed to the next hop, the write bytes counter is never updated, which leaves the read bytes counter higher than the write bytes counter. The attacker could exploit this asymmetry to identify relays from their bandwidth graphs.
The attacker can kill the circuit using the OOM killer by saturating its output queue with cells until circuits_handle_oom() gets called and kills the circuit.
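To make the asymmetry concrete, here is a toy model of the accounting described above. It is not Tor's code: every name with a toy_ prefix is hypothetical, and the 512-byte cell length is only an assumption for illustration. Reads are counted on arrival, writes are counted on flush, and an OOM kill drops the queue without touching either counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cell length, for illustration only. */
#define TOY_CELL_LEN 512

/* Hypothetical relay-wide byte counters (stand-ins for the bandwidth stats). */
typedef struct {
  uint64_t read_bytes;
  uint64_t written_bytes;
} toy_bw_stats_t;

/* Hypothetical circuit with a queue of cells waiting for the next hop. */
typedef struct {
  size_t queued_cells;
} toy_circuit_t;

/* A cell arrives: the read counter is bumped BEFORE the cell is relayed. */
static void toy_receive_cell(toy_bw_stats_t *s, toy_circuit_t *c) {
  s->read_bytes += TOY_CELL_LEN;
  c->queued_cells++;
}

/* A queued cell is relayed: only now is the write counter bumped. */
static void toy_flush_cell(toy_bw_stats_t *s, toy_circuit_t *c) {
  if (c->queued_cells == 0)
    return;
  c->queued_cells--;
  s->written_bytes += TOY_CELL_LEN;
}

/* OOM kill: queued cells are dropped and no counter is adjusted. */
static void toy_oom_kill(toy_circuit_t *c) {
  c->queued_cells = 0;
}

/* Bytes that were counted as read but never counted as written. */
static uint64_t toy_asymmetry(const toy_bw_stats_t *s) {
  assert(s->read_bytes >= s->written_bytes);
  return s->read_bytes - s->written_bytes;
}
```

In this model, receiving ten cells, flushing one, and then calling toy_oom_kill() leaves nine cells' worth of bytes counted as read but not as written. That gap is the signal the attacker would look for.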
We should figure out whether this attack is practical (the paper claims it is) and whether it is worth fixing. Fixing this issue alone won't solve the general problem of congestion attacks, and it might even enable other kinds of attacks.
The most practical fix right now seems to be to hack circuits_handle_oom() to decrement the read counters before killing a circuit. However, that is a very specific fix: it might solve this particular bug, but it leaves the rest of the bug class open.
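A rough sketch of what such a decrement could look like, written against a toy model rather than Tor's actual data structures. All sketch_ names and the 512-byte cell length are assumptions made for illustration. The subtraction is clamped so the counter cannot underflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cell length, for illustration only. */
#define SKETCH_CELL_LEN 512

/* Hypothetical stand-ins for the relay's byte counters and a circuit queue. */
typedef struct {
  uint64_t read_bytes;
  uint64_t written_bytes;
} sketch_bw_stats_t;

typedef struct {
  size_t queued_cells;
} sketch_circuit_t;

/* Kill a circuit under memory pressure, first "refunding" the bytes that
 * were counted as read but will now never be written. */
static void sketch_oom_kill_with_refund(sketch_bw_stats_t *s,
                                        sketch_circuit_t *c) {
  assert(s != NULL && c != NULL);
  uint64_t unrelayed = (uint64_t)c->queued_cells * SKETCH_CELL_LEN;
  /* Clamp so the unsigned counter never wraps below zero. */
  if (unrelayed > s->read_bytes)
    unrelayed = s->read_bytes;
  s->read_bytes -= unrelayed;
  c->queued_cells = 0;
}
```

This sketch ignores at least one complication: if the stats are bucketed per reporting interval, the queued cells may have been counted in an earlier interval than the one in which the circuit is killed, so a real patch would have to decide which interval to adjust.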
Another approach would be removing the bandwidth graphs, or aggregating them over a greater period of time, or adding noise. We should consider these approaches carefully since bandwidth graphs see great use by academic papers and also by relay operators (to gauge their contribution).
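As an illustration of the aggregation idea only, the following sketch sums consecutive fine-grained samples into coarser windows, so a short burst becomes harder to place in time. The function name, its parameters, and the choice to drop a trailing partial window are all assumptions, not a description of how Tor reports its stats.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum every `factor` consecutive fine-grained byte counts into one coarse
 * sample. A trailing partial window is dropped. Returns the number of
 * coarse samples written (at most `coarse_cap`). */
static size_t sketch_aggregate(const uint64_t *fine, size_t n_fine,
                               size_t factor,
                               uint64_t *coarse, size_t coarse_cap) {
  assert(factor > 0);
  size_t n_out = 0;
  for (size_t i = 0; i + factor <= n_fine && n_out < coarse_cap; i += factor) {
    uint64_t sum = 0;
    for (size_t j = 0; j < factor; j++)
      sum += fine[i + j];
    coarse[n_out++] = sum;
  }
  return n_out;
}
```

With a factor of 3, the samples {10, 10, 500, 10, 10, 10} collapse to {520, 30}. The burst is still visible in the first window, but it can no longer be attributed to a single fine-grained interval. Coarser windows blur timing further, at the cost of less detail for researchers and relay operators.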