Call of connection_write_to_buf for CONN_TYPE_CONTROL allows immediate flush of the outbuf. Then if code tries to log something and control_event_logmsg is involved then it call of connection_write_to_buf recursively. Consequence calls blocked by ++disable_log_messages only then.
Three ways to fix:
Remove ability for immediate flush of the outbuf by connection_write_to_buf
Check in_connection_handle_write flag at start of connection_handle_write
Guard call of connection_handle_write by disable_control_logging and enable_control_logging
Every way have negative and positive impacts.
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Child items
0
Show closed items
No child items are currently assigned. Use child items to break down this issue into smaller parts.
Linked items
0
Link issues together to show that they're related.
Learn more.
Not sure about sanity of 2 and 3 way either. Only proper way to escape recursion is to separate writing to outbuf and flushing of outbuf by removing ability for immediate flush of outbuf. Do we really need to flush it such way for control port:
Do we really need to flush it such way for control port:
{{{
conn->outbuf_flushlen-len < 1<<16 &&
conn->outbuf_flushlen >= 1<<16
}}}
what is profit, when it happens, why it happens?
Oh, oh, I know this one.
Back in the day, Vidalia would ask a control port question with a really big answer (like "give me all the microdescs"), and Tor would put the answer on its control port, but then the outbuf would get too big, and the "is there too much on the outbuf" check would trigger, killing the connection.
Maybe enough of the rest of the code has changed that this bad behavior wouldn't happen anymore? Or maybe it is just like it was?
Back in the day, Vidalia would ask a control port question with a really big answer (like "give me all the microdescs"), and Tor would put the answer on its control port, but then the outbuf would get too big, and the "is there too much on the outbuf" check would trigger, killing the connection.
Maybe enough of the rest of the code has changed that this bad behavior wouldn't happen anymore? Or maybe it is just like it was?
Structure for outbuf was changes since then, there are no need for connection's buf size limits anymore and code do not closes connection for such reason anymore.
But comment for ab838bddb89fa3f37eeada29e8c10d16873b5e86 says:
Flush local controller connection buffers periodically as we're writing to them, so we avoid queueing 4+ megabytes of data before trying to flush.
Is it still possible to queue so many bytes for control connection, why?
I think we can remove that entire block from connection_write_to_buf_impl_. So, way #1 it is. Roger, does anything bad happen if we take this out? I just tried shoving a huge pile of stuff onto an output buffer, and everything was fine.
Is it still possible to queue so many bytes for control connection, why?
I think "GETINFO desc/all-recent" can do at least that much.
See branch decouple-write in my public repository?