Trac issues
https://gitlab.torproject.org/legacy/trac/-/issues
2020-06-13T16:06:25Z

https://gitlab.torproject.org/legacy/trac/-/issues/24804
Run an opt-in process for relay operators to become fallbacks in 2018
2020-06-13T16:06:25Z
teor
This involves mailing tor-relays and asking if stable relay operators want to become fallbacks.
Tor: 0.4.0.x-final
Colin Childs

https://gitlab.torproject.org/legacy/trac/-/issues/24797
Add an option that makes Tor use fewer connections
2020-06-13T15:19:48Z
teor
Tor is willing to use all the file descriptors the OS tells it are available for the current user.
So to enforce a connection limit, operators need to drop the user file descriptor limit.
But if the OS is wrong, or there are multiple tor instances under the user, tor can easily exceed this limit.
Or if the machine hits a kernel, RAM, or CPU limit first, the user might want to artificially limit connections.
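As a rough sketch of what a real maximum could look like, the function below combines the OS-reported descriptor limit with an operator-set cap. Everything here is an assumption for illustration: `effective_connection_cap`, `operator_cap`, and `reserved_fds` are hypothetical names, not existing Tor options or functions.

```python
def effective_connection_cap(os_fd_limit, operator_cap=None, reserved_fds=32):
    """Return how many connections to allow (illustrative only).

    os_fd_limit:  descriptor limit the OS reports for the current user.
    operator_cap: hypothetical operator-configured maximum, or None if unset.
    reserved_fds: hypothetical headroom kept for logs, files, and listeners.
    """
    # Descriptors left over once the reserved headroom is set aside.
    available = max(os_fd_limit - reserved_fds, 0)
    if operator_cap is None:
        return available
    # A true maximum: never exceed the operator's cap,
    # and never exceed what the OS says is available.
    return min(available, max(operator_cap, 0))
```

With a cap like this, an operator running several instances under one user could give each instance a share of the descriptor budget instead of relying on the per-user limit alone.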
Right now, we have ConnLimit, which looks like it limits connections, but it's actually a minimum.
And we have the out of socket check, but DisableOOSCheck is the default.
Does enforcing a connection limit require us to set DisableOOSCheck 0?
Then we should move this ticket to 0.3.4, and open one to improve the out of socket check.
Tor: unspecified

https://gitlab.torproject.org/legacy/trac/-/issues/24787
Revise the fallback stability requirements based on current network load
2020-06-13T16:06:21Z
teor
In #24785, we decreased the fallback stability requirements due to the increased network load from December 2017.
We should work out whether we can raise them again.

https://gitlab.torproject.org/legacy/trac/-/issues/24779
Investigate Windows 11(?) AF_UNIX support
2020-06-13T15:19:44Z
teor
Windows has added AF_UNIX to their developer builds:
https://blogs.msdn.microsoft.com/commandline/2017/12/19/af_unix-comes-to-windows/
We should make sure this doesn't cause any security issues in Tor.
And maybe we should support it eventually.
Tor: unspecified

https://gitlab.torproject.org/legacy/trac/-/issues/24777
Make relays try IPv6 ORPorts for directory uploads and downloads
2020-06-13T15:19:44Z
teor
Using IPv6 ORPorts will make relay behaviour consistent with future client behaviour.
The only difference is that relays will also use IPv4 DirPorts, but clients will not.
Relays should try to upload descriptors using IPv4 and IPv6 as well, to detect authority connectivity issues.
Tor: unspecified

https://gitlab.torproject.org/legacy/trac/-/issues/24774
Edit prop279 to support alternative name representations and non-English languages
2020-06-13T15:19:42Z
Trac
Attached is a `git diff` patch with edits to torspec.git/proposals/279-naming-layer-api.txt (patch on 13cbcbc). Inspired by discussion on tor-dev:
https://lists.torproject.org/pipermail/tor-dev/2017-December/012743.html
https://lists.torproject.org/pipermail/tor-dev/2017-December/012746.html
Brief conceptual overview of the most significant changes:
* Support Name System API plugins which transform self-contained alternative representations of the data in .onion names. These can be safely configured for a global wildcard '*', then sandboxed with neither network nor filesystem access.
* Add UTF-8 support, so as not to restrict the Name System API to users of American English.
* Specify failure status codes which will cause name resolution attempts to stop, even if the name may match the TLD for other plugins at a lower priority. This is useful if a plugin configured for a '*' TLD definitively recognizes that a name is invalid, _e.g._ due to checksum failure.
* Other changes which support the foregoing objectives.
Please review and commit.
**Trac**:
**Username**: nullius
Tor: unspecified

https://gitlab.torproject.org/legacy/trac/-/issues/24741
Consider redacting usernames in notice level logs
2020-06-13T15:19:32Z
teor
macOS uses a person's first and last name as their username by default.
(I'm not sure what Windows or other unix systems do.)
This means that logs can contain a user's full name, and sometimes (at info level?) their IP address.
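A minimal sketch of the kind of redaction this could mean, assuming the username is known when the line is logged. The function name `redact_username` and the sample log line are illustrative assumptions; the `[scrubbed]` placeholder is borrowed from Tor's existing address scrubbing.

```python
def redact_username(line, username, placeholder="[scrubbed]"):
    """Replace every occurrence of the username in a log line.

    Illustrative only: a very short username would also match
    unrelated text, so real code would need something stricter.
    """
    if not username:
        return line
    return line.replace(username, placeholder)


# Example with a macOS-style "First Last" username in a path.
sample = 'Read configuration file "/Users/Jane Doe/.torrc".'
redacted = redact_username(sample, "Jane Doe")
```

Redacting at logging time rather than afterwards would mean users who paste logs into the bug tracker do not have to remember to clean them first.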
We should think about the risks here, particularly for users that post logs to the bug tracker.
Tor: unspecified

https://gitlab.torproject.org/legacy/trac/-/issues/24735
Always check for the null address when calling address functions
2020-06-13T15:19:28Z
teor
These address functions have never had return values:
* node_get_prim_dirport()
* node_get_pref_ipv6_orport()
We should make sure we always check for the null address when we call them.
Tor: unspecified

https://gitlab.torproject.org/legacy/trac/-/issues/24734
Remove the return value of fascist_firewall_choose_address_node()
2020-06-13T15:19:28Z
teor
Let's check for the null address instead.
Tor: 0.3.4.x-final
Neel Chauhan (neel@neelc.org)

https://gitlab.torproject.org/legacy/trac/-/issues/24731
Stop checking routerinfos for addresses when we use microdescs for circuits
2020-06-13T15:19:25Z
teor
Directory mirrors and clients that FetchUselessDescriptors check for IPv4 and IPv6 addresses in the following order:
* routerinfos (descriptors)
* routerstatus (consensus)
* microdescriptors
But they should check using the following order:
* bridge routerinfos (descriptors)
* routerstatus (consensus)
If using microdescriptors for circuits:
* microdescriptors
Otherwise:
* routerinfos (descriptors)
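The proposed order can be sketched as a small lookup, assuming each node is modelled as a dict of optional address sources. The keys `bridge_ri`, `rs`, `md`, and `ri` and the function name are hypothetical and only mirror the list above; this is not Tor's actual data structure or API.

```python
def lookup_address(node, use_microdescs_for_circuits):
    """Return (source, address) from the first source that has an address.

    node: dict with optional keys 'bridge_ri', 'rs', 'md', 'ri'
          (hypothetical names), each mapping to an address or None.
    """
    # Bridge routerinfos first, then the consensus routerstatus.
    order = ["bridge_ri", "rs"]
    # Last source depends on what we use to build circuits.
    order.append("md" if use_microdescs_for_circuits else "ri")
    for source in order:
        address = node.get(source)
        if address:
            return source, address
    return None, None
```

Note that a routerinfo is ignored entirely when microdescriptors are in use, even if one happens to be available.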
There is code that implements this algorithm in commits decb0636e2, 1d1c927b9a, and 4979ec3c17 of my bug23975_tree branch.
But this adds overhead to every address lookup when building circuits.
Maybe we can make it faster by:
* not parsing routerinfos or microdescs if we aren't using them for circuits, or
* putting a canonical address in node_t, updating it whenever ri, rs, or md change, and always using it
Tor: unspecified

https://gitlab.torproject.org/legacy/trac/-/issues/24712
Client builds and timeouts tons of circs to connect to HS
2020-06-13T15:19:22Z
George Kadianakis
I was testing #23101 by connecting to a few hidden services, and I noticed that my client would create about 7 intro circs and 7 rend circs before finally connecting to a single HS. Apparently it was creating circs and timing them out and then making more.
I'm not sure if this is caused by the current damaged state of the network, but it definitely is not making it better if a single client is causing so many circuits.
I'll be attaching logs in the next comment.
Tor: unspecified

https://gitlab.torproject.org/legacy/trac/-/issues/24694
sched: Use the socket RTT in KIST to compute a more accurate extra space
2020-06-13T15:19:19Z
David Goulet (dgoulet@torproject.org)
This comes from a discussion in #24665 that Yawning started:
Assuming the scheduler is called significantly faster than the RTT of most links (read that as "If 10 ms is lower than the RTT of most if not all links"), you can/should reduce sock_buf_size_factor as well, because you aren't going to get a full congestion window worth of ACKs back between scheduler calls in common cases.
There isn't a good "one size fits all" solution. Setting it too low will gimp performance on fast low latency links, setting it too high right now bloats the various buffers. I would personally opt more toward avoiding the latter given all the Fun that's happening.
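To make the trade-off concrete, here is a sketch of the heuristic this ticket goes on to recommend: scale the factor by the ratio of scheduler interval to smoothed RTT, capped at 1.0. The function name and parameters are assumptions for illustration, as is treating the smoothed RTT as a value in microseconds.

```python
def scaled_sock_buf_size_factor(base_factor, sched_interval_ms, srtt_us):
    """Scale base_factor by min(scheduler interval / smoothed RTT, 1.0).

    base_factor:       the configured sock_buf_size_factor.
    sched_interval_ms: scheduler run interval in milliseconds.
    srtt_us:           smoothed RTT in microseconds (assumed unit).
    """
    if srtt_us <= 0:
        # No usable RTT estimate yet: fall back to the unscaled factor.
        return base_factor
    ratio = (sched_interval_ms * 1000.0) / srtt_us
    # Hard maximum at 1.0, so fast low-latency links are never scaled up.
    return base_factor * min(ratio, 1.0)
```

With a 10 ms interval, a link with a 100 ms smoothed RTT would get a tenth of the base factor, while a link with a 5 ms RTT would keep the full factor.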
In the `struct tcp_info` we use in KIST, `tcpi_rtt` gives the smoothed RTT estimate (and `tcpi_rttvar` the RTT variance if you need it), which is probably sufficient to give a better guess here. As a first pass, I would recommend doing something based on the scheduler interval to smoothed RTT ratio, with a hard maximum at 1.0.
Tor: unspecified

https://gitlab.torproject.org/legacy/trac/-/issues/24668
sched: scheduler_compare_channels() function will never pick a channel with no active circuits.
2020-06-13T15:19:13Z
David Goulet (dgoulet@torproject.org)
In the schedulers, scheduler_compare_channels() is used to decide which channel is 'best' to write data from. It delegates to circuitmux_compare_muxes(), which delegates to ewma_cmp_cmux().
But ewma_cmp_cmux() will never prefer a cmux with no active circuits on it! So a channel without active circuits will never be picked by the scheduler to flush from a circuit, which is what triggers flushing from its destroy queue. So the channel will stay around forever, never flushing.
To fix this one, we probably have to fix ewma_cmp_cmux() to look at destroy cells too (somehow). And we still need to make sure that the scheduler's position in the heap changes when the data considered by scheduler_compare_channels() changes [*].
[*] I'm not convinced that we're even doing this right with the current scheduler_compare_channels() code. :(
Tor: unspecified

https://gitlab.torproject.org/legacy/trac/-/issues/24667
OOM needs to consider the DESTROY queued cells
2020-06-13T15:19:13Z
David Goulet (dgoulet@torproject.org)
Our OOM is only looking at the circuit queue cells and HS descriptors to free up memory.
We need to teach it to clean up DESTROY cells in case cleaning up the circuits is not enough.
This isn't that trivial because while cleaning up circuits in the OOM handler, we will also send DESTROY cells for those, thus allocating memory. But not sending those will also affect other relays hanging on to dead circuits.
All in all, this is an interesting challenge but there might be something smart to do even if not perfect.
The idea here is to avoid an attack that takes advantage of a bug in tor that can fill up the DESTROY cell queue and our OOM just can't do anything about it.
Tor: unspecified

https://gitlab.torproject.org/legacy/trac/-/issues/24630
Stop initialising rust git submodules, travis does this automatically
2020-06-13T15:19:00Z
teor
isis told me to do this.
Tor: 0.3.2.x-final

https://gitlab.torproject.org/legacy/trac/-/issues/24629
Activate osx builds on travis, at low priority
2020-06-13T15:29:57Z
teor
We want to activate osx builds on Travis CI.
But they can be very slow, so we need to customise the settings.
If possible, we want osx builds to behave as follows:
* if there is a pending build, and more commits are pushed to the branch, cancel the pending build and re-queue the latest commits
* let travis show builds as "complete" if osx takes a long time, but still show osx failures eventually
Tor: 0.3.5.x-final
teor

https://gitlab.torproject.org/legacy/trac/-/issues/24610
assertion !(status == HS_CLIENT_FETCH_HAVE_DESC) failed
2020-06-13T15:18:51Z
cypherpunks
```
Mmm dd hh:01:30.000 [notice] Your system clock just jumped 2000 seconds forward; assuming established circuits no longer work.
Mmm dd hh:02:00.000 [notice] Tor has successfully opened a circuit. Looks like client functionality is working.
Mmm dd hh:02:00.000 [notice] Tor has successfully opened a circuit. Looks like client functionality is working.
Mmm dd hh:03:07.000 [warn] tor_bug_occurred_(): Bug: src/or/hs_client.c:267: retry_all_socks_conn_waiting_for_desc: Non-fatal assertion !(status == HS_CLIENT_FETCH_HAVE_DESC) failed. (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Mmm dd hh:03:07.000 [warn] Bug: Non-fatal assertion !(status == HS_CLIENT_FETCH_HAVE_DESC) failed in retry_all_socks_conn_waiting_for_desc at src/or/hs_client.c:267. Stack trace: (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Mmm dd hh:03:07.000 [warn] Bug: /usr/bin/tor(log_backtrace+0x52) [0x3d56c44a32] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Mmm dd hh:03:07.000 [warn] Bug: /usr/bin/tor(tor_bug_occurred_+0xc5) [0x3d56c61835] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Mmm dd hh:03:07.000 [warn] Bug: /usr/bin/tor(hs_client_dir_info_changed+0x116) [0x3d56c23fc6] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Mmm dd hh:03:07.000 [warn] Bug: /usr/bin/tor(networkstatus_set_current_consensus+0x3c2) [0x3d56b15f02] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Mmm dd hh:03:07.000 [warn] Bug: /usr/bin/tor(connection_dir_reached_eof+0x14db) [0x3d56bf23fb] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Mmm dd hh:03:07.000 [warn] Bug: /usr/bin/tor(+0x10ff39) [0x3d56bc8f39] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Mmm dd hh:03:07.000 [warn] Bug: /usr/bin/tor(+0x5075d) [0x3d56b0975d] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Mmm dd hh:03:07.000 [warn] Bug: /usr/lib64/libevent-2.1.so.6(+0x229ed) [0x3bd918df9ed] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Mmm dd hh:03:07.000 [warn] Bug: /usr/lib64/libevent-2.1.so.6(event_base_loop+0x4f7) [0x3bd918e07f7] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Mmm dd hh:03:07.000 [warn] Bug: /usr/bin/tor(do_main_loop+0x29d) [0x3d56b0a76d] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Mmm dd hh:03:07.000 [warn] Bug: /usr/bin/tor(tor_main+0xe2d) [0x3d56b0d62d] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Mmm dd hh:03:07.000 [warn] Bug: /usr/bin/tor(main+0x28) [0x3d56b05ad8] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Mmm dd hh:03:07.000 [warn] Bug: /lib64/libc.so.6(__libc_start_main+0xf6) [0x3bd8ff35f66] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Mmm dd hh:03:07.000 [warn] Bug: /usr/bin/tor(_start+0x2a) [0x3d56b05b2a] (on Tor 0.3.2.6-alpha 87012d076ef58bb9)
Mmm dd hh:03:40.000 [notice] Tried for 120 seconds to get a connection to [scrubbed]:port. Giving up. (waiting for rendezvous desc)
```
Don't know if suspend+resume is necessary to trigger the bug.
Tor: unspecified

https://gitlab.torproject.org/legacy/trac/-/issues/24609
consdiff implementation in Rust
2020-06-13T15:18:50Z
Sebastian Hahn
In my public repo in branch rust4, there's a pretty much complete consdiff implementation in Rust (only missing some logging and testing from the C side iirc). I won't have time to pick it up anytime soon I'm afraid but I hope someone finds it useful. Note it looks a bit different compared to the C code as we were trying very hard to come up with something without any unsafe code and no external dependencies, as this was some of the first rust code ever written for tor. It should be straight-forward, though.
Tor: unspecified

https://gitlab.torproject.org/legacy/trac/-/issues/24604
Decorate IPv6 addresses in connection_t->address to avoid ambiguity
2020-06-13T15:18:48Z
teor
Currently, connection_t->address can be in one of three formats:
* hostname: www.example.com
* IPv4: 1.1.1.1
* IPv6: 2003::0001
Tor often uses this address with a port like this:
* hostname: www.example.com:1234
* IPv4: 1.1.1.1:1234
* IPv6: 2003::0001:1234
The IPv6 case is ambiguous, and we should fix it.
One way of fixing it is to provide a flag if address is an IPv6 literal, and a function to format address and port. (Unfortunately, we can't always decorate IPv6 addresses, because that would cause bugs in other code and in controllers.)
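A sketch of what such a formatting function might do, written in Python for brevity rather than C. The names `is_ipv6_literal` and `format_addr_port` are hypothetical, and bracketing IPv6 literals is assumed here as the decoration, following the usual `[addr]:port` convention.

```python
import ipaddress


def is_ipv6_literal(address):
    """Return True only if address parses as an IPv6 literal."""
    try:
        return ipaddress.ip_address(address).version == 6
    except ValueError:
        # Hostnames and malformed input are not IP literals.
        return False


def format_addr_port(address, port):
    """Join address and port, bracketing IPv6 literals to avoid ambiguity."""
    if is_ipv6_literal(address):
        return f"[{address}]:{port}"
    return f"{address}:{port}"
```

Keeping the stored address undecorated and bracketing only at formatting time avoids changing what other code and controllers see in `conn->address`.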
Then we would need to go through every instance of `conn->address` and find the ones that use a port. This may also require a spec update, like #24603.
Tor: unspecified

https://gitlab.torproject.org/legacy/trac/-/issues/24594
Protocol warning: Expiring stuck OR connection to fd...
2020-06-13T15:18:46Z
David Goulet (dgoulet@torproject.org)
So in theory, this is at protocol warning level so it shouldn't be too problematic, but I think it is worth looking at. I've been seeing many of these on a test relay I have (capped at 200KB/s) using the KIST scheduler (redacting the relay addr/port):
```
Expiring stuck OR connection to fd 380 (IP:PORT). (3747888 bytes to flush; 3000 seconds since last write)
```
This is pretty big, 3.7MB stuck in the `outbuf` of a connection. The `3000` seconds since last write means that `connection_handle_write_impl()` hasn't been called which is *very* surprising in the first place.
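For anyone wanting to pull these numbers out of their own logs, here is a small sketch that parses the warning and converts the units. The regular expression is written against the single log line quoted above, so treat the exact format as an assumption; `parse_stuck_line` is a hypothetical helper, not part of Tor.

```python
import re

# Pattern modelled on the one warning quoted in this ticket.
STUCK_RE = re.compile(
    r"Expiring stuck OR connection to fd (?P<fd>\d+) \((?P<peer>[^)]*)\)\. "
    r"\((?P<bytes>\d+) bytes to flush; (?P<secs>\d+) seconds since last write\)"
)


def parse_stuck_line(line):
    """Return fd, peer, pending bytes/megabytes and idle minutes, or None."""
    match = STUCK_RE.search(line)
    if match is None:
        return None
    pending = int(match["bytes"])
    return {
        "fd": int(match["fd"]),
        "peer": match["peer"],
        "bytes": pending,
        "megabytes": pending / 1e6,          # decimal megabytes
        "minutes": int(match["secs"]) / 60,  # seconds since last write
    }


info = parse_stuck_line(
    "Expiring stuck OR connection to fd 380 (IP:PORT). "
    "(3747888 bytes to flush; 3000 seconds since last write)"
)
```

For the line above this gives roughly 3.7 MB pending and 50 minutes since the last write.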
There are currently two ways for the handle write function to be called: either through the libevent `write_event`, which is fired every time the socket is *ready* to write (see this as `POLLOUT` from poll()), or directly from the KIST scheduler when cells are put in the outbuf.
This is worrying because it means that KIST did in fact put 3.7MB of cells on the outbuf thinking the socket had its TCP buffer stable enough to put that data in but somehow none got written on the socket.
One possibility is that KIST flushed cells on the connection then tried to write them to the network, that didn't work, the TCP information of the socket is still intact and, because KIST doesn't check for errors (#24449), nothing happened. Then, somehow, after those 3.7MB were put in the outbuf, the channel was never scheduled again for a write because KIST had no idea that anything was left in the outbuf from the previous flush on the network.
So then it comes down to the `write_event` to write those cells flushed by KIST. Without having a `POLLOUT` event on the socket, nothing will happen, so the question I have is: how could this event never have fired for 50 minutes? I kind of feel that the TCP timeout would have kicked in by then if there was really a problem... ? But also, that is a _long_ time for an idle connection?
Tor: unspecified