one stalled stream can cause circuit-level error
When circuit::reactor::Reactor::run_once drains its queue of things it tried to send earlier, it calls send_relay_cell without checking whether there is capacity in that stream's SENDME window. send_relay_cell assumes the caller already verified there is capacity in the stream SENDME window, and will return an error if there isn't.
IIUC, this means that if there is circuit-level SENDME window capacity, but no stream-level SENDME window capacity, we'll end up tearing down the circuit. IIUC this could happen if the application on the remote side is not reading from the stream (or not reading quickly enough).
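To make the failure mode concrete, here is a toy model of it. This is only a sketch: `Window`, `Circuit`, and `drain_unchecked` are hypothetical names made up for illustration, not the real reactor types, and the real code is considerably more involved.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a SENDME window: how many cells we may still send.
struct Window(u16);

impl Window {
    /// Consume one unit of capacity, or fail if none is left.
    fn take(&mut self) -> Result<(), &'static str> {
        if self.0 == 0 {
            return Err("window exhausted");
        }
        self.0 -= 1;
        Ok(())
    }
}

/// Toy circuit (assumption): one circuit-level window plus per-stream windows.
struct Circuit {
    circ_window: Window,
    stream_windows: Vec<Window>,
    /// Stream indices for cells that were queued earlier.
    backlog: VecDeque<usize>,
}

impl Circuit {
    /// Models the precondition: the caller is expected to have checked
    /// the stream window before calling this.
    fn send_relay_cell(&mut self, stream: usize) -> Result<(), &'static str> {
        self.stream_windows[stream].take()?;
        self.circ_window.take()
    }

    /// Models the problematic drain: only the circuit window is checked.
    fn drain_unchecked(&mut self) -> Result<usize, &'static str> {
        let mut sent = 0;
        while self.circ_window.0 > 0 {
            let stream = match self.backlog.pop_front() {
                Some(stream) => stream,
                None => break,
            };
            // An exhausted stream window surfaces here as an error for the
            // whole drain, which the caller treats as a circuit-level failure.
            self.send_relay_cell(stream)?;
            sent += 1;
        }
        Ok(sent)
    }
}
```

In this model, a backlog whose first cell belongs to a stream with an empty window makes `drain_unchecked` return `Err` even though the circuit window and the other streams still have capacity.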
I think a band-aid solution would be to add a doc-comment noting this prereq to send_relay_cell, and add a check at the run_once call-site. I think we'd want it to continue processing other backlogged messages as long as there's circuit capacity, and then re-queue cells that couldn't be sent due to stream congestion windows.
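A rough sketch of what the call-site check could look like, using a simplified model rather than the actual reactor code. `QueuedCell`, `Circuit`, and `drain_backlog` are hypothetical names, and the windows are plain counters here.

```rust
use std::collections::VecDeque;

/// Hypothetical queued cell: only the owning stream's index matters here.
#[derive(Debug, PartialEq)]
struct QueuedCell {
    stream: usize,
}

/// Toy circuit (assumption): windows are modelled as plain counters.
struct Circuit {
    circ_window: u16,
    stream_windows: Vec<u16>,
    backlog: VecDeque<QueuedCell>,
}

impl Circuit {
    /// Precondition: both the circuit window and the cell's stream window
    /// have capacity. Returns an error otherwise.
    fn send_relay_cell(&mut self, cell: &QueuedCell) -> Result<(), &'static str> {
        if self.circ_window == 0 || self.stream_windows[cell.stream] == 0 {
            return Err("no window capacity");
        }
        self.circ_window -= 1;
        self.stream_windows[cell.stream] -= 1;
        Ok(())
    }

    /// Drain the backlog, skipping stalled streams; returns the number sent.
    fn drain_backlog(&mut self) -> Result<usize, &'static str> {
        let mut deferred = VecDeque::new();
        let mut sent = 0;
        while self.circ_window > 0 {
            let cell = match self.backlog.pop_front() {
                Some(cell) => cell,
                None => break,
            };
            if self.stream_windows[cell.stream] == 0 {
                // Stream is stalled: keep the cell for later instead of erroring.
                deferred.push_back(cell);
                continue;
            }
            self.send_relay_cell(&cell)?;
            sent += 1;
        }
        // Re-queue deferred cells ahead of anything not yet examined, so the
        // relative order of cells within each stream is preserved.
        deferred.append(&mut self.backlog);
        self.backlog = deferred;
        Ok(sent)
    }
}
```

Stream windows can't grow during a single drain in this model, so once a stream is found stalled all of its later cells are deferred too, and per-stream ordering is kept.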
A better solution might be to encapsulate stream-level window handling in the stream object, such that it doesn't push cells to the circuit unless there is stream window capacity (and decrements the window when doing so).
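Something along these lines, as a sketch only. `StreamSender` and its methods are hypothetical and don't correspond to existing types. The idea is that the capacity check and the decrement happen in one place, so the circuit never sees a cell it can't send.

```rust
use std::collections::VecDeque;

/// Hypothetical stream-side sender that owns its own send window.
struct StreamSender {
    window: u16,
    /// Cell bodies the application has written but that have not been released yet.
    pending: VecDeque<Vec<u8>>,
}

impl StreamSender {
    fn new(window: u16) -> Self {
        StreamSender {
            window,
            pending: VecDeque::new(),
        }
    }

    /// Queue application data on the stream; this never touches the circuit.
    fn write(&mut self, body: Vec<u8>) {
        self.pending.push_back(body);
    }

    /// Release the next cell to the circuit only if the window allows it,
    /// decrementing the window as part of the same step.
    fn next_cell(&mut self) -> Option<Vec<u8>> {
        if self.window == 0 {
            return None;
        }
        let cell = self.pending.pop_front()?;
        self.window -= 1;
        Some(cell)
    }

    /// Called when a stream-level SENDME arrives; `increment` is whatever
    /// the protocol specifies (left as a parameter in this sketch).
    fn on_sendme(&mut self, increment: u16) {
        self.window = self.window.saturating_add(increment);
    }
}
```

With this shape, a stalled stream just returns `None` from `next_cell` until a SENDME arrives, and the circuit-side code only needs to worry about the circuit-level window.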
OTOH, stream windows go away when congestion control is implemented, so it might not be worth trying to address this at all if we think it'll be implemented soon (and that it'll be required of all exits soon, if it isn't already).