Tor died: Caught signal 11
Summary
Recently, Tor relays (at least exit relays) have been dying with signal 11 (SIGSEGV). For now the relays keep dying, but they can be started again after each crash.
Steps to reproduce:
- Run a Tor relay.
- Wait.
- Profit.
What is the current bug behavior?
After several occurrences (it seems to vary from 3 to 10) of these warnings:
[warn] record_rtt_exit: Bug: Conflux: Trying to record exit RTT without a timestamp (on Tor 0.4.8.8 )
[warn] tor_bug_occurred_: Bug: src/core/or/conflux_pool.c:860: record_rtt_exit: Non-fatal assertion !(!leg || leg->linked_sent_usec == 0) failed. (on Tor 0.4.8.8 )
[warn] Bug: Tor 0.4.8.8: Non-fatal assertion !(!leg || leg->linked_sent_usec == 0) failed in record_rtt_exit at src/core/or/conflux_pool.c:860. Stack trace: (on Tor 0.4.8.8 )
[warn] Bug: 0xd7addadc39c <log_backtrace_impl+0x5c> at /usr/local/bin/tor (on Tor 0.4.8.8 )
[warn] Bug: 0xd7addaea7c4 <tor_bug_occurred_+0x1c4> at /usr/local/bin/tor (on Tor 0.4.8.8 )
[warn] Bug: 0xd7add9e984e <conflux_process_linked+0x63e> at /usr/local/bin/tor (on Tor 0.4.8.8 )
[warn] Bug: 0xd7add9e9a17 <conflux_process_linked_ack+0xa7> at /usr/local/bin/tor (on Tor 0.4.8.8 )
[warn] Bug: 0xd7add9adb86 <circuit_clear_cell_queue+0x916> at /usr/local/bin/tor (on Tor 0.4.8.8 )
[warn] Bug: 0xd7add9aaa84 <circuit_receive_relay_cell+0xa74> at /usr/local/bin/tor (on Tor 0.4.8.8 )
[warn] Bug: 0xd7add9aa224 <circuit_receive_relay_cell+0x214> at /usr/local/bin/tor (on Tor 0.4.8.8 )
[warn] Bug: 0xd7adda4697e <command_process_cell+0x96e> at /usr/local/bin/tor (on Tor 0.4.8.8 )
[warn] Bug: 0xd7add94d7fc <connection_or_process_inbuf+0x1fc> at /usr/local/bin/tor (on Tor 0.4.8.8 )
[warn] Bug: 0xd7add948699 <connection_handle_read+0xa79> at /usr/local/bin/tor (on Tor 0.4.8.8 )
[warn] Bug: 0xd7add934a49 <connection_add_impl+0x239> at /usr/local/bin/tor (on Tor 0.4.8.8 )
[warn] Bug: 0xd8300af91cd <event_base_assert_ok_nolock_+0xbfd> at /usr/local/lib/libevent-2.1.so.7 (on Tor 0.4.8.8 )
[warn] Bug: 0xd8300af515c <event_base_loop+0x58c> at /usr/local/lib/libevent-2.1.so.7 (on Tor 0.4.8.8 )
[warn] Bug: 0xd7add936e4a <do_main_loop+0x10a> at /usr/local/bin/tor (on Tor 0.4.8.8 )
[warn] Bug: 0xd7add923bc8 <tor_run_main+0x128> at /usr/local/bin/tor (on Tor 0.4.8.8 )
[warn] Bug: 0xd7add922544 <tor_main+0x54> at /usr/local/bin/tor (on Tor 0.4.8.8 )
[warn] record_rtt_exit: Bug: Conflux: Trying to record exit RTT without a timestamp (on Tor 0.4.8.8 )
The following happens:
============================================================ T= 1701797968
Tor 0.4.8.8 died: Caught signal 11
0xd7addadc93c <clean_up_backtrace_handler+0xfc> at /usr/local/bin/tor
0xd8302615b6e <pthread_sigmask+0x54e> at /lib/libthr.so.3
0xd7add9e47da <conflux_get_leg+0x1a> at /usr/local/bin/tor
0x7fffffffe8a3 <???> at ???
0xd7add9e47da <conflux_get_leg+0x1a> at /usr/local/bin/tor
0xd7add9ea43d <conflux_pool_free_all+0x39d> at /usr/local/bin/tor
0xd7add9e7e61 <conflux_circuit_has_closed+0x111> at /usr/local/bin/tor
0xd7add9a5d4e <circuit_mark_for_close_+0x35e> at /usr/local/bin/tor
0xd7add9a5987 <circuit_unlink_all_from_channel+0x197> at /usr/local/bin/tor
0xd7add9be0ac <channel_closed+0x4c> at /usr/local/bin/tor
0xd7add94e12f <connection_or_about_to_close+0x3f> at /usr/local/bin/tor
0xd7add937731 <tor_mainloop_free_all+0x691> at /usr/local/bin/tor
0xd7add9372f8 <tor_mainloop_free_all+0x258> at /usr/local/bin/tor
0xd8300af91cd <event_base_assert_ok_nolock_+0xbfd> at /usr/local/lib/libevent-2.1.so.7
0xd8300af515c <event_base_loop+0x58c> at /usr/local/lib/libevent-2.1.so.7
0xd7add936e4a <do_main_loop+0x10a> at /usr/local/bin/tor
0xd7add923bc8 <tor_run_main+0x128> at /usr/local/bin/tor
0xd7add922544 <tor_main+0x54> at /usr/local/bin/tor
When the relays that have died are started again, they run for some time. Other relays still die in the meantime, so I imagine that without restarts most relays on these servers would eventually be down.
Do note that the following warning already started appearing in November, but relays only started dying in the past ~48 hours or so:
[warn] Bug: Non-fatal assertion !(smartlist_len(cfx->legs) <= 0) failed in conflux_pick_first_leg at src/core/or/conflux.c:565.
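For anyone who wants to check the same pattern in their own logs, here is a minimal Python sketch that counts how many "record RTT without a timestamp" warnings appear before each "Caught signal" line. The function name and the sample lines are made up for illustration; only the matched strings are taken from the log output quoted in this report. It is a rough helper for triage, not a diagnosis of the crash.

```python
import re

# Substring of the conflux warning quoted in this report.
WARN = "Trying to record exit RTT without a timestamp"
# Matches the crash line, e.g. "Tor 0.4.8.8 died: Caught signal 11".
CRASH = re.compile(r"died: Caught signal (\d+)")


def warnings_before_crashes(lines):
    """Return one (signal, warning_count) tuple per crash line.

    warning_count is the number of WARN lines seen since the previous
    crash line (or since the start of the input).
    """
    results = []
    count = 0
    for line in lines:
        if WARN in line:
            count += 1
            continue
        match = CRASH.search(line)
        if match:
            results.append((int(match.group(1)), count))
            count = 0
    return results


# Hypothetical sample input, shaped like the lines in this report.
sample = [
    "[warn] record_rtt_exit: Bug: Conflux: Trying to record exit RTT without a timestamp (on Tor 0.4.8.8 )",
    "[warn] record_rtt_exit: Bug: Conflux: Trying to record exit RTT without a timestamp (on Tor 0.4.8.8 )",
    "Tor 0.4.8.8 died: Caught signal 11",
]
print(warnings_before_crashes(sample))  # prints [(11, 2)]
```

To run it on a real log, pass an open file object instead of the sample list; where the log lives depends on the local torrc. Note that each occurrence in the excerpt above logs the warning text twice (once before and once after the stack trace), so the raw count may need halving.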
What is the expected behavior?
Relays that don't die ;).
Environment
FreeBSD 13.2-RELEASE
Tor version 0.4.8.8 (packages)