Fix severe reactor ordering problems

eta requested to merge eta/arti:eta/reactor-2.5 into main

This fixes a number of severe problems in the circuit reactor that could cause cells to be reordered (which makes relays terminate the circuit with a protocol violation, as they become unable to decrypt the cells). These mostly revolve around improper usage of queues:

  • The code assumed that a failure to place cells onto the channel would persist for the whole of a reactor cycle. Under high contention, however, this does not always hold: the channel can free up again partway through a cycle.
    • As a result, some cells were enqueued while later cells went straight through, overtaking the enqueued ones.
    • To fix this, we now block sending cells directly to the channel while any cells are still enqueued.
  • The hop-specific queues queued cells after encryption, not before. This was very brittle, and led to frequent mis-ordering.
    • This was fixed by making them queue cells before encryption instead.
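
For illustration only, the two fixes above can be sketched roughly as follows. This is not the actual reactor code: `HopCipher`, `Channel`, `CircuitSender` and `demo` are hypothetical names, the "cipher" is just a counter standing in for a stateful cipher, and the channel is modelled as a number of free slots that can change between calls.

```rust
use std::collections::VecDeque;

/// Hypothetical stateful cipher: every call advances a counter, so the
/// receiver can only decrypt cells in the order they were encrypted.
pub struct HopCipher {
    next_seq: u64,
}

impl HopCipher {
    fn encrypt(&mut self, payload: u32) -> (u64, u32) {
        let seq = self.next_seq;
        self.next_seq += 1;
        (seq, payload)
    }
}

/// Hypothetical channel whose free capacity may change at any time.
pub struct Channel {
    pub free_slots: usize,
    pub wire: Vec<(u64, u32)>,
}

pub struct CircuitSender {
    cipher: HopCipher,
    /// Cells wait here as plaintext; they are encrypted only on transmit.
    pending: VecDeque<u32>,
}

impl CircuitSender {
    /// Encrypt and put on the wire in one step, so that encryption order
    /// always equals wire order. The caller must have checked `free_slots`.
    fn transmit(&mut self, chan: &mut Channel, payload: u32) {
        let cell = self.cipher.encrypt(payload);
        chan.wire.push(cell);
        chan.free_slots -= 1;
    }

    pub fn send(&mut self, chan: &mut Channel, payload: u32) {
        // Only bypass the queue when it is empty; otherwise this cell
        // would overtake older cells that are still waiting.
        if self.pending.is_empty() && chan.free_slots > 0 {
            self.transmit(chan, payload);
        } else {
            self.pending.push_back(payload);
        }
    }

    /// Drain queued cells, oldest first, for as long as the channel has room.
    pub fn flush(&mut self, chan: &mut Channel) {
        while chan.free_slots > 0 {
            match self.pending.pop_front() {
                Some(payload) => self.transmit(chan, payload),
                None => break,
            }
        }
    }
}

/// Walks through the contention scenario described above and returns
/// the (sequence number, payload) pairs in the order they hit the wire.
pub fn demo() -> Vec<(u64, u32)> {
    let mut chan = Channel { free_slots: 0, wire: Vec::new() };
    let mut tx = CircuitSender {
        cipher: HopCipher { next_seq: 0 },
        pending: VecDeque::new(),
    };
    tx.send(&mut chan, 1); // channel full: cell 1 is queued
    chan.free_slots = 1; // contention eases mid-cycle
    tx.send(&mut chan, 2); // must queue behind cell 1, not overtake it
    tx.flush(&mut chan); // sends cell 1 only (one free slot)
    chan.free_slots = 4;
    tx.flush(&mut chan); // sends cell 2
    tx.send(&mut chan, 3); // queue is empty again, so this goes straight out
    chan.wire
}
```

In this sketch, dropping the `self.pending.is_empty()` check would let cell 2 reach the wire before cell 1, and encrypting before queueing would tie the cipher state to enqueue order rather than to the order in which cells are actually sent.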

This is !264 (closed) / 5bce9db5 without the refactor part.
