tor can overrun its accountingmax if it enters soft hibernation first
I'll put the punchline first: second_elapsed_callback(), which is where we check if it's time to go dormant for hibernation, is no longer called when we've entered soft hibernation.
I assume this is because of the new "periodic event flag" feature, where we try to avoid calling callbacks when we're in a state that won't need them. See the "online and active" note here:
/* This is a legacy catch-all callback that runs once per second if * we are online and active. */ CALLBACK(second_elapsed, NET_PARTICIPANT, FL(NEED_NET)|FL(RUN_ON_DISABLE)),
The impact is limited, since we stop accepting new connections and new circuits when we enter soft hibernation, but it can still be bad: existing connections and circuits can last for a long time and use a lot of bandwidth.
A secondary impact is that accounting_run_housekeeping() never gets called, which means that the state file never gets updated after we've entered soft hibernation, which means these bandwidth overspends are never recorded to disk.
I think the bug went in during commit 4bf79fa4 which is part of Tor 0.4.0.1-alpha.
The PERIODIC_EVENT_FLAG_NEED_NET flag (what FL(NEED_NET) expands into) checks net_is_disabled(), but there is another function right after net_is_disabled in netstatus.c called net_is_completely_disabled(), and the only difference is that it checks we_are_fully_hibernating() vs we_are_hibernating().
I confirmed the overall bug happens in practice: there's a relay operator in #tor who hit soft hibernation, and then saw his tor proceed to use more bytes than expected. I had him do a 'gdb attach' to his tor and break on 'second_elapsed_callback' and the function never got called.
It seems like the immediate fix, and best backport plan, would be to resume calling second_elapsed_callback even when net_is_disabled(). The longer term plan can be to audit our calls to net_is_disabled() and net_is_completely_disabled() and we_are_hibernating(), with an eye towards "should we be doing this behavior while soft hibernating", and see what other bugs we find.