Trac issues — https://gitlab.torproject.org/legacy/trac/-/issues

#30344: conn_read_callback is called on connections that are marked for close
https://gitlab.torproject.org/legacy/trac/-/issues/30344
Reported by Rob Jansen — 2020-06-13T15:41:09Z
There is a bug causing busy loops in Libevent and infinite loops in the Shadow simulator. In my Shadow simulations, I have observed that `conn_read_callback` is getting called on a connection that is marked for close.
This is similar to #5263, and as described there, I believe it is a bug for `conn_read_callback` to be called on sockets that are marked for close.
The bug is problematic in Shadow when the marked connection also wants to flush, but attempting to write the outbuf inside `conn_close_if_marked` fails because the write would block (the kernel write buffer is already full). Since the connection still wants to flush, it does not get closed; but it also remains readable, causing Libevent to keep invoking `conn_read_callback` until the socket buffer can actually be written. This wastes CPU in Tor, and it causes an infinite loop in Shadow because Shadow's discrete-event time does not advance during the loop.
I found the bug on 0.3.5.8, and it has probably been present at least since then.
I think the connection should not be waiting for read events while it is marked. I'm not sure where in the code this assumption is first violated, but the following patch fixed the issue in my Shadow simulations:
```
diff --git a/src/core/mainloop/mainloop.c b/src/core/mainloop/mainloop.c
index 6e2b300..e82c77a 100644
--- a/src/core/mainloop/mainloop.c
+++ b/src/core/mainloop/mainloop.c
@@ -894,6 +894,21 @@ conn_read_callback(evutil_socket_t fd, short event, void *_conn)
}
assert_connection_ok(conn, time(NULL));
+ if (conn->marked_for_close && connection_is_reading(conn)) {
+ /* Libevent says we can read, but we are marked so we will never try
+ * to read again. We will try to close the connection below inside of
+ * close_closeable_connections(), but let's make sure not to cause
+ * Libevent to spin on conn_read_callback() while we wait for the
+ * socket to let us flush to it.*/
+ connection_stop_reading(conn);
+
+ /* Now, if we still have data to flush, then make sure Libevent tells
+ * us when the conn will allow us to write the bytes. */
+ if (connection_wants_to_flush(conn) && !connection_is_writing(conn)) {
+ connection_start_writing(conn);
+ }
+ }
+
if (smartlist_len(closeable_connection_lst))
close_closeable_connections();
}
```

Milestone: Tor: 0.3.5.x-final — Assigned to George Kadianakis

#28970: tor_bug_occurred_(): Bug: ../src/or/hs_client.c:624: setup_intro_circ_auth_key: Non-fatal assertion
https://gitlab.torproject.org/legacy/trac/-/issues/28970
Reported by Trac — 2020-06-13T15:42:46Z

If this is a duplicate, I'm sorry, but I found nothing similar in old tickets. `Sandbox` is 1 in `torrc`. Please set the priority and severity of this ticket to appropriate values if it is not so critical.
```
Jan 01 05:31:32.000 [notice] Received reload signal (hup). Reloading config and resetting internal state.
Jan 01 05:31:32.000 [notice] Read configuration file "/usr/share/tor/tor-service-defaults-torrc".
Jan 01 05:31:32.000 [notice] Read configuration file "/etc/tor/torrc".
Jan 01 05:31:32.000 [notice] Tor 0.3.4.9 (git-074ca2e0054fded1) opening log file.
Jan 01 05:31:33.000 [warn] tor_bug_occurred_(): Bug: ../src/or/hs_client.c:624: setup_intro_circ_auth_key: Non-fatal assertion !(desc == NULL) failed. (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: Non-fatal assertion !(desc == NULL) failed in setup_intro_circ_auth_key at ../src/or/hs_client.c:624. Stack trace: (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(log_backtrace+0x42) [0x56345cea26e2] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(tor_bug_occurred_+0xb7) [0x56345cebd587] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(hs_client_circuit_has_opened+0x2ca) [0x56345ce8027a] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(circuit_send_next_onion_skin+0x372) [0x56345cdfd192] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(+0x78fc3) [0x56345cd92fc3] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(circuit_receive_relay_cell+0x311) [0x56345cd95211] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(command_process_cell+0x280) [0x56345ce126f0] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(channel_tls_handle_cell+0x273) [0x56345cdf4493] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(+0x11de1c) [0x56345ce37e1c] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(connection_handle_read+0x892) [0x56345ce2e822] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(+0x53a61) [0x56345cd6da61] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5(event_base_loop+0x7fc) [0x7f2c7be623dc] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(do_main_loop+0x25f) [0x56345cd6fe3f] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(tor_run_main+0x1165) [0x56345cd72315] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(tor_main+0x3a) [0x56345cd6a32a] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(main+0x19) [0x56345cd6a099] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f2c7a7e6b45] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(+0x500e9) [0x56345cd6a0e9] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] tor_bug_occurred_(): Bug: ../src/or/hs_client.c:443: intro_circ_is_ok: Non-fatal assertion !(!hs_ident_intro_circ_is_valid(circ->hs_ident)) failed. (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: Non-fatal assertion !(!hs_ident_intro_circ_is_valid(circ->hs_ident)) failed in intro_circ_is_ok at ../src/or/hs_client.c:443. Stack trace: (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(log_backtrace+0x42) [0x56345cea26e2] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(tor_bug_occurred_+0xb7) [0x56345cebd587] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(hs_client_send_introduce1+0x352) [0x56345ce7fda2] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(connection_ap_handshake_attach_circuit+0x33c) [0x56345ce1126c] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(connection_ap_attach_pending+0x1a0) [0x56345ce31cb0] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(hs_client_receive_rendezvous_acked+0x79) [0x56345ce803c9] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(rend_process_relay_cell+0x28b) [0x56345cd9c2cb] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(+0x791f7) [0x56345cd931f7] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(circuit_receive_relay_cell+0x311) [0x56345cd95211] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(command_process_cell+0x280) [0x56345ce126f0] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(channel_tls_handle_cell+0x273) [0x56345cdf4493] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(+0x11de1c) [0x56345ce37e1c] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(connection_handle_read+0x892) [0x56345ce2e822] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(+0x53a61) [0x56345cd6da61] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5(event_base_loop+0x7fc) [0x7f2c7be623dc] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(do_main_loop+0x25f) [0x56345cd6fe3f] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(tor_run_main+0x1165) [0x56345cd72315] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(tor_main+0x3a) [0x56345cd6a32a] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(main+0x19) [0x56345cd6a099] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5) [0x7f2c7a7e6b45] (on Tor 0.3.4.9 )
Jan 01 05:31:33.000 [warn] Bug: /usr/bin/tor(+0x500e9) [0x56345cd6a0e9] (on Tor 0.3.4.9 )
```
Tor on Linux. Is it related to #27471 or #19926?
**Trac**:
**Username**: torcrash

Milestone: Tor: 0.3.5.x-final — Assigned to David Goulet <dgoulet@torproject.org>

#31091: Bug stacktrace when pluggable transport cannot bind to port
https://gitlab.torproject.org/legacy/trac/-/issues/31091
Reported by s7r — 2020-06-13T15:43:26Z

I have just set up some new obfs4 fast bridges running the latest obfs4proxy from yawning/master (locally compiled on my server) and Tor 0.4.2.0-alpha-dev, and hit a bug stack trace when the pluggable transport tried to listen on a port below 1024. This is well known, which is why the wiki page recommends and documents using setcap to grant the PT executable permission to bind to lower ports. But shouldn't it just warn and exit, instead of producing all of this:
```
Jul 06 05:51:18.000 [warn] Server managed proxy encountered a method error. (obfs4 listen tcp <ipv4>:<port>: bind: permission denied)
Jul 06 05:51:18.000 [warn] Managed proxy at '/usr/local/bin/obfs4proxy' failed the configuration protocol and will be destroyed.
Jul 06 05:51:18.000 [err] tor_assertion_failed_(): Bug: ../src/feature/client/transports.c:1836: managed_proxy_stdout_callback: Assertion mp->conf_state == PT_PROTO_COMPLETED failed; aborting. (on Tor 0.4.2.0-alpha-dev )
Jul 06 05:51:18.000 [err] Bug: Assertion mp->conf_state == PT_PROTO_COMPLETED failed in managed_proxy_stdout_callback at ../src/feature/client/transports.c:1836: . Stack trace: (on Tor 0.4.2.0-alpha-dev )
Jul 06 05:51:18.000 [err] Bug: /usr/bin/tor(log_backtrace_impl+0x47) [0x559a576f5d87] (on Tor 0.4.2.0-alpha-dev )
Jul 06 05:51:18.000 [err] Bug: /usr/bin/tor(tor_assertion_failed_+0x147) [0x559a576f0ed7] (on Tor 0.4.2.0-alpha-dev )
Jul 06 05:51:18.000 [err] Bug: /usr/bin/tor(+0xd4a89) [0x559a575b8a89] (on Tor 0.4.2.0-alpha-dev )
Jul 06 05:51:18.000 [err] Bug: /usr/bin/tor(+0x1e4e4b) [0x559a576c8e4b] (on Tor 0.4.2.0-alpha-dev )
Jul 06 05:51:18.000 [err] Bug: /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5(event_base_loop+0x6a0) [0x7f6f343265a0] (on Tor 0.4.2.0-alpha-dev )
Jul 06 05:51:18.000 [err] Bug: /usr/bin/tor(do_main_loop+0x105) [0x559a57555ed5] (on Tor 0.4.2.0-alpha-dev )
Jul 06 05:51:18.000 [err] Bug: /usr/bin/tor(tor_run_main+0x1245) [0x559a57543905] (on Tor 0.4.2.0-alpha-dev )
Jul 06 05:51:18.000 [err] Bug: /usr/bin/tor(tor_main+0x3a) [0x559a57540cfa] (on Tor 0.4.2.0-alpha-dev )
Jul 06 05:51:18.000 [err] Bug: /usr/bin/tor(main+0x19) [0x559a57540879] (on Tor 0.4.2.0-alpha-dev )
Jul 06 05:51:18.000 [err] Bug: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1) [0x7f6f32b7b2e1] (on Tor 0.4.2.0-alpha-dev )
Jul 06 05:51:18.000 [err] Bug: /usr/bin/tor(_start+0x2a) [0x559a575408ca] (on Tor 0.4.2.0-alpha-dev )
```
What if it just stopped here:
```
Jul 06 05:51:18.000 [warn] Server managed proxy encountered a method error. (obfs4 listen tcp <ip>:<port>: bind: permission denied)
Jul 06 05:51:18.000 [warn] Managed proxy at '/usr/local/bin/obfs4proxy' failed the configuration protocol and will be destroyed.
```
Or maybe even exit entirely so the operator can know something is really wrong.
Also, I don't know if this is related, and I will run more tests to confirm or refute it, but under Debian stable the bridges that are not run with setcap do not get a reasonable measured speed. Those with setcap that listen on lower ports have a bandwidth of 2.5–3.5 MiB/s, while those without setcap have 12 KiB/s–70 KiB/s, and they are all on the same infrastructure, hardware resources, and internet speed. I will do some more digging to confirm this; right now 2 out of 2 obfs4 bridges that were configured without setcap did not get a good bandwidth measurement even after 15 days of continuous uptime.

Milestone: Tor: 0.4.0.x-final — Assigned to Alexander Færøy <ahf@torproject.org>

#31939: log spam: Bug: buffers_tls.c:73: buf_read_from_tls: Non-fatal assertion !(buf->datalen >= INT_MAX - at_most) failed.
https://gitlab.torproject.org/legacy/trac/-/issues/31939
Reported by Taylor Yu — 2020-06-13T15:46:17Z

Nonfatal assert log spamming as seen in #31036.
```
Jun 26 07:01:30.000 [warn] {BUG} tor_bug_occurred_(): Bug: buffers_tls.c:73: buf_read_from_tls: Non-fatal assertion !(buf->datalen >= INT
#_MAX - at_most) failed. (on Tor 0.4.0.5 )
```
(I assume the `#` in the middle of `INT_MAX` is a paste/transcription artifact, but then again it might not be.)

Milestone: Tor: 0.3.5.x-final — Assigned to teor

#32058: mainloop: make periodic events restartable
https://gitlab.torproject.org/legacy/trac/-/issues/32058
Reported by teor — 2020-06-13T15:46:33Z

When we shut down tor, we disable periodic events, but leave their enabled flag set to 1.
See this PR for details:
https://github.com/torproject/tor/pull/1397
I'm not sure if this is the best solution, or how far we should backport.

Milestone: Tor: 0.3.5.x-final — Assigned to David Goulet <dgoulet@torproject.org>

#31548: hs-v3: Service can pick more than HiddenServiceNumIntroductionPoints intro points
https://gitlab.torproject.org/legacy/trac/-/issues/31548
Reported by David Goulet <dgoulet@torproject.org> — 2020-06-13T15:46:40Z

During my testing of #30200, I ended up with a service descriptor with 4 intro points even though `HiddenServiceNumIntroductionPoints` is set to 3 (the default).
Further investigation confirmed this by adding a log in the `decode_intro_points()` function which showed me 4 intro points.
I haven't found out why, but one feature of the HS subsystem is that we launch `HiddenServiceNumIntroductionPoints` + 2 intro circuits in parallel, and the first ones to finish are picked.
It appears that more than the defined number can finish at the same time, and all of them get picked.

Milestone: Tor: 0.3.5.x-final — Assigned to David Goulet <dgoulet@torproject.org>

#32108: tor can overrun its accountingmax if it enters soft hibernation first
https://gitlab.torproject.org/legacy/trac/-/issues/32108
Reported by Roger Dingledine — 2020-06-13T15:46:45Z

I'll put the punchline first: second_elapsed_callback(), which is where we check whether it's time to go dormant for hibernation, is no longer called once we've entered soft hibernation.
I assume this is because of the new "periodic event flag" feature, where we try to avoid calling callbacks when we're in a state that won't need them. See the "online and active" note here:
```
/* This is a legacy catch-all callback that runs once per second if
* we are online and active. */
CALLBACK(second_elapsed, NET_PARTICIPANT,
FL(NEED_NET)|FL(RUN_ON_DISABLE)),
```
The impact is limited, since we stop accepting new connections and new circuits when we enter soft hibernation, but it can still be bad: existing connections and circuits can last for a long time and use a lot of bandwidth.
A secondary impact is that accounting_run_housekeeping() never gets called, which means the state file never gets updated after we've entered soft hibernation, so these bandwidth overspends are never recorded to disk.
I think the bug went in during commit 4bf79fa4f, which is part of Tor 0.4.0.1-alpha.
The PERIODIC_EVENT_FLAG_NEED_NET flag (what FL(NEED_NET) expands to) checks net_is_disabled(); there is another function right after net_is_disabled() in netstatus.c called net_is_completely_disabled(), and the only difference is that it checks we_are_fully_hibernating() instead of we_are_hibernating().
I confirmed the overall bug happens in practice: there's a relay operator in #tor who hit soft hibernation, and then saw his tor proceed to use more bytes than expected. I had him do a 'gdb attach' to his tor and break on 'second_elapsed_callback' and the function never got called.
It seems like the immediate fix, and the best backport plan, would be to resume calling second_elapsed_callback even when net_is_disabled(). The longer-term plan can be to audit our calls to net_is_disabled(), net_is_completely_disabled(), and we_are_hibernating(), with an eye toward "should we be doing this behavior while soft hibernating", and see what other bugs we find.

Milestone: Tor: 0.4.0.x-final — Assigned to Roger Dingledine