Tor issues
https://gitlab.torproject.org/tpo/core/tor/-/issues
2022-02-28T19:41:05Z

https://gitlab.torproject.org/tpo/core/tor/-/issues/33375
Stop advertising an IPv6 exit policy when DNS is broken for IPv6
2022-02-28T19:41:05Z
teor

When `dns_seems_to_be_broken_for_ipv6()` returns true, exits should stop advertising an IPv6 exit policy.
Here's a rough design:
* when `dns_seems_to_be_broken_for_ipv6()` is first set to 1, mark the relay descriptor dirty
* when rebuilding the descriptor, check `dns_seems_to_be_broken_for_ipv6()` before including an IPv6 exit policy
* reset `dns_seems_to_be_broken_for_ipv6()` periodically, maybe every 1-3 days?

https://gitlab.torproject.org/tpo/core/tor/-/issues/33129
Tor node that is not part of the consensus should not be used as rendezvous point with the onion service
2022-10-24T20:53:07Z
cypherpunks

According to this article, an attacker can choose a server that runs Tor but is not part of the Tor network as the rendezvous point with an onion service. This lets the attacker discover which family the onion service's guard node belongs to, and then DDoS the Tor nodes in that family so that the onion service drops that guard node and chooses the attacker's Tor node as its guard instead.
https://www.hackerfactor.com/blog/index.php?/archives/868-Deanonymizing-Tor-Circuits.html

https://gitlab.torproject.org/tpo/core/tor/-/issues/32863
Put an up-to-date web version of the Tor manual somewhere
2022-06-10T19:45:23Z
Roger Dingledine

We used to publish
https://2019.www.torproject.org/docs/tor-manual.html.en
but that version of the Tor manual is now obsolete and for archival purposes.
I guess in the glorious future we will have a dev.torproject.org/tor/ that lists things like the Tor manual. Or maybe it should be on support.torproject.org because it is a resource for users (albeit advanced ones), not developers. Maybe it *is* on support.tpo right now, and I just couldn't find it?
In any case, here is a ticket for making progress on actually getting a web version of the manual onto a web page somewhere, ideally automatically kept up to date.
(Once it exists, there is a good argument for adding another htaccess rule to the 2019.www site, redirecting from the man page that nobody should look at to the one that people should.)
This ticket was motivated by discussion in legacy/trac#28526.
Developer portal

https://gitlab.torproject.org/tpo/core/tor/-/issues/32835
regression: log subsystem not closing files on switch back to syslog
2022-09-01T21:42:49Z
starlight

```
SETCONF Log="debug file logfile001"
SETCONF Log="notice syslog"
$ fuser logfile001
logfile001: 31388
```
It behaved correctly in earlier versions.

https://gitlab.torproject.org/tpo/core/tor/-/issues/32816
Move the relay options into the correct man page sections
2021-09-16T14:11:41Z
teor

Some of the relay options aren't under Server Options or Directory Server Options in the man page. We should move them, so it's clear which options are disabled when relay mode is disabled.
Here is the full list of options that we should check:
* DirPort, DirCache, ORPort
* --list-fingerprint, RelayBandwidth*, MaxAdvertisedBandwidth, PerConnBW*, and ServerTransportPlugin

https://gitlab.torproject.org/tpo/core/tor/-/issues/32729
"hs_circuitmap_init: Assertion !the_hs_circuitmap failed" when repeating failed tests on Android
2023-03-23T09:17:08Z
eighthave

When repeatedly running the Android `TorService` test suite from https://gitlab.com/eighthave/tor-android, I occasionally received this crash:
```
main src/feature/hs/hs_circuitmap.c:598: hs_circuitmap_init: Assertion !the_hs_circuitmap failed; aborting.
TorService Bug: Tor 0.4.2.5 (git-7e55ab23311d00b6): Assertion !the_hs_circuitmap failed in hs_circuitmap_init at src/feature/hs/hs_circuitmap.c:598: . (Stack trace not available) (on Tor 0.4.2.5 7e55ab23311d00b6)
libc Fatal signal 6 (SIGABRT), code -6 in tid 862 (tor), pid 836 (oid.binary.test)
TorService ⇠ run [1425ms]
DEBUG #00 pc 00000af0 [vdso:f319a000] (__kernel_vsyscall+16)
#01 pc 0001edf8 /system/lib/libc.so (syscall+40)
#02 pc 0001f073 /system/lib/libc.so (abort+115)
#03 pc 0025ab6e /data/app/org.torproject.android.binary.test-FWLW5957oXEttHq4Pbyz4w==/lib/x86/libtor.so (tor_raw_abort_+31)
#04 pc 0025673e /data/app/org.torproject.android.binary.test-FWLW5957oXEttHq4Pbyz4w==/lib/x86/libtor.so (tor_abort_+31)
#05 pc 0014df1d /data/app/org.torproject.android.binary.test-FWLW5957oXEttHq4Pbyz4w==/lib/x86/libtor.so (hs_circuitmap_init+146)
#06 pc 00154110 /data/app/org.torproject.android.binary.test-FWLW5957oXEttHq4Pbyz4w==/lib/x86/libtor.so (hs_init+39)
#07 pc 000b17ac /data/app/org.torproject.android.binary.test-FWLW5957oXEttHq4Pbyz4w==/lib/x86/libtor.so (tor_init+132)
#08 pc 000b2033 /data/app/org.torproject.android.binary.test-FWLW5957oXEttHq4Pbyz4w==/lib/x86/libtor.so (tor_run_main+243)
#09 pc 000b0933 /data/app/org.torproject.android.binary.test-FWLW5957oXEttHq4Pbyz4w==/lib/x86/libtor.so (Java_org_torproject_jni_TorService_runMain+59)
#10 pc 000085c8 /data/app/org.torproject.android.binary.test-FWLW5957oXEttHq4Pbyz4w==/oat/x86/base.odex (offset 0x8000)
#11 pc 000091ff [anon:libc_malloc:e6980000]
#12 pc 01114457 /dev/ashmem/dalvik-main space (region space) (deleted)
#13 pc 80e30d65 <unknown>
```
* I think this only happened if Tor quit the test before due to bad command line args
* it could be that the Android Test Orchestrator is reusing state that it shouldn't be https://developer.android.com/training/testing/junit-runner#using-android-test-orchestrator
* "The Dalvik process hosting a typical app is forked off of zygote with all the common android libraries already mapped, so new unique copies don't have to be opened" https://stackoverflow.com/questions/9153166/understanding-android-zygote-and-dalvikvm
nickm said:
> So there are at least two possibilities:
> 1) tor is busted and is calling hs_circuitmap_init() more than it should
> 2) tor is busted and is calling hs_circuitmap_free_all() less than it should
>
> there are probably more, like:
> 3) there is a problem in the android harness, and it is trying to restart tor before it has shut down

Tor: 0.4.4.x-final
Alexander Færøy ahf@torproject.org

https://gitlab.torproject.org/tpo/core/tor/-/issues/32691
Image broken in 'src-ref' documentation
2022-02-07T19:38:32Z
opara

For example, if you visit https://src-ref.docs.torproject.org/tor/dataflow.html, the "structure hierarchy for connection types" image is missing (the img tag has a 404). There are possibly other missing images as well, but I can't find the markdown files to check (there were the original versions, then they were moved to the tor git repo and edited iirc, but now they're gone).
There's also another image missing on the same page, but it has no <img> tag (compare the top of the https://people.torproject.org/~nickm/tor-auto/internal/02-dataflow.html and https://src-ref.docs.torproject.org/tor/dataflow.html pages). But it may have been removed on purpose.

https://gitlab.torproject.org/tpo/core/tor/-/issues/32388
sched: When OR connection opens, always indicate the channel is ready for cells
2022-09-01T21:42:49Z
David Goulet dgoulet@torproject.org

#### Problem
In `channel_tls_handle_state_change_on_orconn()`, which is called when the OR connection becomes open, we have this snippet of code for the case where the new state is OPEN:
```
channel_change_state_open(base_chan);
if (connection_or_num_cells_writeable(conn) > 0) {
  scheduler_channel_wants_writes(base_chan);
}
```
So basically, the above means we only tell the scheduler that the channel wants to write if we already have cells on the `outbuf`.
It basically means that if the channel *scheduler* state is `IDLE` (initial opening state), it then ends up in state `SCHED_CHAN_WAITING_FOR_CELLS`, which then means the scheduler will process the channel when a cell is queued on it. But ONLY if the channel had cells when it was opened.
This on its own, as a design, is problematic: what else can make the channel transition into a state that allows the scheduler to recognize the channel as ready to be processed for cells (`SCHED_CHAN_WAITING_FOR_CELLS`)?
Tor currently has 2 callsites that tell the scheduler that a channel "wants to write" data on the wire (`scheduler_channel_wants_writes()`; remember that this function transitions the scheduler state of the channel to "waiting for cells"):
1. The one mentioned above, when the channel becomes `OPEN`.
2. The other one is when data is _flushed_ from the outbuf, in `connection_or_flushed_some()`.
So once the channel is opened, we only enter the "wants to write" state if we previously had cells in the outbuf, which I can assure you is not always the case. The other way is when we have just written bytes on the network, but then how can we do that in the first place?
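To make the problem concrete, here is a toy model of the per-channel scheduler state. It is a sketch under assumptions, not Tor's code: every `toy_` name is hypothetical, and the transitions out of `IDLE` when a cell is queued are my simplification of how the scheduler reacts. It only illustrates that a channel opened without passing the writeable-cells check never reaches a schedulable state on its own.

```c
/* Toy model of the per-channel scheduler states. The names are
 * illustrative and are not copied from Tor's headers. */
typedef enum {
  TOY_IDLE,              /* not known to be writeable, no cells queued */
  TOY_WAITING_FOR_CELLS, /* wants to write, but no cells queued yet */
  TOY_WAITING_TO_WRITE,  /* cells queued, but never marked as wanting writes */
  TOY_PENDING            /* wants to write and has cells: gets scheduled */
} toy_sched_state_t;

typedef struct {
  toy_sched_state_t state;
} toy_channel_t;

/* Models scheduler_channel_wants_writes() (assumed transitions). */
void toy_wants_writes(toy_channel_t *chan) {
  if (chan->state == TOY_IDLE)
    chan->state = TOY_WAITING_FOR_CELLS;
  else if (chan->state == TOY_WAITING_TO_WRITE)
    chan->state = TOY_PENDING;
}

/* Models a cell being queued on the channel (assumed transitions). */
void toy_has_waiting_cells(toy_channel_t *chan) {
  if (chan->state == TOY_IDLE)
    chan->state = TOY_WAITING_TO_WRITE;
  else if (chan->state == TOY_WAITING_FOR_CELLS)
    chan->state = TOY_PENDING;
}

/* Models the OPEN transition quoted above: the scheduler is only told
 * about the channel when the writeable-cells check passes. */
void toy_on_open(toy_channel_t *chan, int cells_writeable) {
  if (cells_writeable > 0)
    toy_wants_writes(chan);
}

/* Returns the state after opening a channel and then queueing one cell. */
toy_sched_state_t toy_open_then_queue(int cells_writeable) {
  toy_channel_t chan = { TOY_IDLE };
  toy_on_open(&chan, cells_writeable);
  toy_has_waiting_cells(&chan);
  return chan.state;
}
```

In this model, `toy_open_then_queue(1)` ends in `TOY_PENDING`, while `toy_open_then_queue(0)` never becomes pending until something else calls `toy_wants_writes()`.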
#### What Is Saving Us
One question is then: Maybe tor code flow makes it that we always have a cell in the outbuf when the channel opens?
I conducted an experiment in which the Entry node of a client sent only 1 cell at a time per mainloop round. As a result, the `VERSIONS`, `CERTS` and `NETINFO` cells were sent in 3 different mainloop rounds, and thus the client received them with a "delay".
That was enough for the client's channel to have _nothing_ in the outbuf when the channel became `OPEN`, which happens when the `NETINFO` cell is received from the relay. The scheduler state change was therefore skipped, and the client channel was stuck in scheduler `IDLE` mode even though the channel was in the `OPEN` state.
What is saving us is that the very first thing we write on the wire, the `VERSIONS` cell, triggers callsite (2), which tells the scheduler to go into the "waiting for cells" state. From there, the channel stays in that state.
For now, this seems to be "fine" but any future changes like for instance where we wanted to have everything that writes on the `outbuf` to go through the scheduler so we can have proper KIST prioritization, will fail due to this design.
#### Short-Term Solution
When the channel opens, we should simply put it in `wants to write` state all the time. So even bouncing from `MAINT` to `OPEN` state and vice versa will never make some cells stuck in the channel until something is explicitly written on the outbuf.
Furthermore, it should be done in `channel_change_state_()` since this affects `channel_t`; it is NOT specific to `channeltls_t`. So in a world where we end up with multiple channel types, this whole situation explodes.

https://gitlab.torproject.org/tpo/core/tor/-/issues/32340
[Android] FAIL src/test/test_process_slow.c:240: assert(smartlist_len(process_data->stdout_data) OP_EQ 12): 2 vs 12
2023-01-31T15:37:26Z
eighthave

Running the _test-slow_ suite fails every time with this error on 0.4.1.x, 0.4.2.x, and master:
```
slow/process/callbacks:
FAIL src/test/test_process_slow.c:240: assert(smartlist_len(process_data->stdout_data) OP_EQ 12): 2 vs 12
[callbacks FAILED]
```
The full build/test log is here:
https://gitlab.com/eighthave/tor/-/jobs/336701282
To try this yourself, it'll be easiest using the Guardian Project fork, which is in sync on GitHub and GitLab:
* https://github.com/guardianproject/tor
* https://gitlab.com/guardianproject/tor
I looked into it a bit, but couldn't quite figure out what was being tested there. Is it Tor's "smartlist" functions? stdin/stdout/stderr have annoying, arbitrary restrictions in Android, so it could be related to that. For example, "native" code loaded from a shared library will have stdin/stdout/stderr redirected to /dev/null, while in Java code, it is reachable. There is a related example on legacy/trac#32036.

https://gitlab.torproject.org/tpo/core/tor/-/issues/32334
do not throw enable_tool_name_check error when compiling for Android and iOS
2022-09-01T21:42:49Z
eighthave

This check only fails when building for Android and iOS, and it does not do anything useful there. So when compiling for Android or iOS, it should be disabled by default. Or perhaps even better: it should only be enabled on platforms where it is useful. The Guardian Project, Thali, and Briar Android builds have always used --disable-tool-name-check. The iCepa Tor.framework also does.
Some examples:
* LarryBitcoin: https://trac.torproject.org/projects/tor/ticket/31552#comment:4
* n8fr8: https://trac.torproject.org/projects/tor/ticket/31301#comment:2
* iCepa Tor.framework: https://github.com/iCepa/Tor.framework/blob/30480101b3bd2166baced913347fc8bb7392b3da/Tor/tor.sh#L139
* tor-android: https://github.com/guardianproject/tor-android/blob/0830c87a29ea197ecd9a0b5aee7d9a4fb88dce25/external/Makefile#L271

https://gitlab.torproject.org/tpo/core/tor/-/issues/32165
On first boot, Tor mistakenly tells me "The current consensus has no exit nodes"
2021-02-17T09:23:04Z
Roger Dingledine

Starting up 0.4.3.0-alpha-dev (git-71daad1692bc3f24) without any cached-* files in my DataDirectory, I get:
```
Oct 20 04:44:56.026 [notice] Bootstrapped 30% (loading_status): Loading networkstatus consensus
Oct 20 04:44:56.636 [notice] I learned some more directory information, but not enough to build a circuit: We have no usable consensus.
Oct 20 04:44:56.758 [notice] Bootstrapped 40% (loading_keys): Loading authority key certs
Oct 20 04:44:56.936 [notice] The current consensus has no exit nodes. Tor can only build internal paths, such as paths to onion services.
Oct 20 04:44:56.936 [notice] Bootstrapped 45% (requesting_descriptors): Asking for relay descriptors
Oct 20 04:44:56.936 [notice] I learned some more directory information, but not enough to build a circuit: We need more microdescriptors: we have 0/5841, and can only build 0% of likely paths. (We have 0% of guards bw, 0% of midpoint bw, and 0% of end bw (no exits in consensus, using mid) = 0% of path bw.)
Oct 20 04:44:57.337 [notice] Bootstrapped 50% (loading_descriptors): Loading relay descriptors
Oct 20 04:44:57.592 [notice] The current consensus contains exit nodes. Tor can build exit and internal paths.
Oct 20 04:44:58.178 [notice] Bootstrapped 58% (loading_descriptors): Loading relay descriptors
```
It's that "The current consensus has no exit nodes." line that is out of place.

https://gitlab.torproject.org/tpo/core/tor/-/issues/32103
Subsystem "thread_cleanup" is never called
2021-11-06T13:03:43Z
opara

Subsystems implement the interface of `struct subsys_fns_t`, with one of the optional function pointers being `void (*thread_cleanup)(void)`. This `thread_cleanup` function is called for all subsystems by the subsystem manager function `void subsystems_thread_cleanup(void)`, but the `subsystems_thread_cleanup` function is never called anywhere in the code.
At the moment, the only subsystem to implement the `thread_cleanup` interface is the crypto subsystem, which uses `thread_cleanup` for freeing the threadlocal `crypto_fast_rng_t`, as well as freeing the threadlocal error queue on old versions of OpenSSL. As far as I can tell, this is never run.
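As a rough illustration of the interface described above, the sketch below models an optional per-subsystem `thread_cleanup` hook and a manager function that invokes every hook that is set. All `toy_` names are hypothetical stand-ins and this is not the real `subsys_fns_t`; the point is only that the hooks do nothing unless some thread actually calls the manager function before it exits.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for the subsystem function table.
 * Only the optional thread_cleanup hook is modeled here. */
typedef struct {
  const char *name;
  void (*thread_cleanup)(void); /* NULL if the subsystem has no hook */
} toy_subsys_fns_t;

/* Counts how often the crypto-like hook ran, for demonstration only. */
int toy_crypto_cleanups = 0;

/* Stand-in for a crypto-like subsystem freeing its thread-local state. */
void toy_crypto_thread_cleanup(void) {
  toy_crypto_cleanups++;
}

const toy_subsys_fns_t toy_subsystems[] = {
  { "logging", NULL },
  { "crypto", toy_crypto_thread_cleanup },
};

/* Calls every non-NULL thread_cleanup hook and returns how many ran.
 * Nothing happens unless a thread calls this before it exits. */
int toy_subsystems_thread_cleanup(void) {
  int ran = 0;
  size_t n = sizeof(toy_subsystems) / sizeof(toy_subsystems[0]);
  for (size_t i = 0; i < n; i++) {
    if (toy_subsystems[i].thread_cleanup != NULL) {
      toy_subsystems[i].thread_cleanup();
      ran++;
    }
  }
  return ran;
}
```

A matching `thread_init` hook could be added to the same table in the same way, which is the symmetry the ticket asks about.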
I think that the `subsystems_thread_cleanup` function should be run somewhere in the code, but it's not clear to me how this `subsystems_thread_cleanup` is expected to be used. It seems like there should also be `subsystems_thread_init` and `thread_init` functions for initializing threadlocal variables. Right now the crypto subsystem does an ["initialize on first use" singleton pattern](https://github.com/torproject/tor/blob/dfe7f004df46edaab0b23e260218d3c6d422d5fe/src/lib/crypt_ops/crypto_rand_fast.c#L382), but it might be useful to add this initialization interface function so that subsystems have the option of initializing all of their threadlocals in one place.

https://gitlab.torproject.org/tpo/core/tor/-/issues/31653
Padding cells sent with 0ms delay cause circuit failures
2023-06-09T13:27:09Z
pulls

Below appears to be a bug that results in a closed circuit due to a relay sending a padding cell (RELAY_COMMAND_DROP) to the client.
At links below you find code for two circuit padding machines:
1. circpad_machine_client_close_circuit_minimal
2. circpad_machine_relay_close_circuit_minimal
Machine 1 runs on a client on CIRCUIT_PURPOSE_C_GENERAL circuits with 3 hops as soon as the CIRCPAD_CIRC_OPENED condition holds: the only thing it does is set a timer 1000s in the future and wait for the timer to expire. The purpose of the machine is to negotiate padding with the relay to activate Machine 2 on the middle relay.
Machine 2 runs at the middle relay and repeatedly sends a padding cell to the client 1 usec after the event CIRCPAD_EVENT_NONPADDING_SENT. In other words, for every non-padding cell we directly add a padding cell. At the client, this causes `relay_decrypt_cell(): Incoming cell at client not recognized. Closing.` for all circuits triggering Machine 2 at the relay. Closing a circuit by injecting padding cells seems unintended.
Here is part of a log from a client on info level showing how the machine registers, negotiates with the relay (starting Machine 2 at the relay), eventually fails to decrypt, and the circuit closes.
```
Sep 05 10:32:19.000 [info] circpad_setup_machine_on_circ(): Registering machine client_close_circuit_minimal to origin circ 3 (5)
Sep 05 10:32:19.000 [info] circpad_node_supports_padding(): Checking padding: supported
Sep 05 10:32:19.000 [info] circpad_negotiate_padding(): Negotiating padding on circuit 3 (5), command 2
Sep 05 10:32:19.000 [info] link_apconn_to_circ(): Looks like completed circuit to [scrubbed] does allow optimistic data for connection to [scrubbed]
Sep 05 10:32:19.000 [info] connection_ap_handshake_send_begin(): Sending relay cell 0 on circ 3296464733 to begin stream 20575.
Sep 05 10:32:19.000 [info] connection_ap_handshake_send_begin(): Address/port sent, ap socket 13, n_circ_id 3296464733
Sep 05 10:32:19.000 [info] rep_hist_note_used_port(): New port prediction added. Will continue predictive circ building for 2355 more seconds.
Sep 05 10:32:19.000 [info] connection_edge_process_inbuf(): data from edge while in 'waiting for circuit' state. Leaving it on buffer.
Sep 05 10:32:19.000 [info] exit circ (length 3): $[manually-scrubbed](open) $[manually-scrubbed](open) $[manually-scrubbed](open)
Sep 05 10:32:19.000 [info] pathbias_count_use_attempt(): Used circuit 3 is already in path state use attempted. Circuit is a General-purpose client currently open.
Sep 05 10:32:19.000 [info] link_apconn_to_circ(): Looks like completed circuit to [scrubbed] does allow optimistic data for connection to [scrubbed]
Sep 05 10:32:19.000 [info] connection_ap_handshake_send_begin(): Sending relay cell 0 on circ 3296464733 to begin stream 20576.
Sep 05 10:32:19.000 [info] connection_ap_handshake_send_begin(): Address/port sent, ap socket 14, n_circ_id 3296464733
Sep 05 10:32:19.000 [info] circpad_deliver_recognized_relay_cell_events(): Got padding cell on origin circuit 3.
Sep 05 10:32:20.000 [info] relay_decrypt_cell(): Incoming cell at client not recognized. Closing.
Sep 05 10:32:20.000 [info] circuit_receive_relay_cell(): relay crypt failed. Dropping connection.
Sep 05 10:32:20.000 [info] command_process_relay_cell(): circuit_receive_relay_cell (backward) failed. Closing.
Sep 05 10:32:20.000 [info] circpad_circuit_machineinfo_free_idx(): Freeing padding info idx 0 on circuit 3 (23)
Sep 05 10:32:20.000 [info] circpad_node_supports_padding(): Checking padding: supported
Sep 05 10:32:20.000 [info] circpad_negotiate_padding(): Negotiating padding on circuit 3 (23), command 1
Sep 05 10:32:20.000 [info] pathbias_send_usable_probe(): Sending pathbias testing cell to 0.233.9.53:25 on stream 20577 for circ 3.
Sep 05 10:32:20.000 [info] relay_decrypt_cell(): Incoming cell at client not recognized. Closing.
Sep 05 10:32:20.000 [info] circuit_receive_relay_cell(): relay crypt failed. Dropping connection.
Sep 05 10:32:20.000 [info] command_process_relay_cell(): circuit_receive_relay_cell (backward) failed. Closing.
Sep 05 10:32:20.000 [info] pathbias_send_usable_probe(): Got pathbias probe request for circuit 3 with outstanding probe
Sep 05 10:32:20.000 [info] pathbias_check_close(): Circuit 3 closed without successful use for reason 2. Circuit purpose 23 currently 1,open. Len 3.
Sep 05 10:32:20.000 [info] circuit_mark_for_close_(): Circuit 3296464733 (id: 3) marked for close at src/core/or/command.c:582 (orig reason: 2, new reason: 0)
Sep 05 10:32:20.000 [info] connection_edge_destroy(): CircID 0: At an edge. Marking connection for close.
Sep 05 10:32:20.000 [info] connection_edge_destroy(): CircID 0: At an edge. Marking connection for close.
Sep 05 10:32:20.000 [info] circuit_free_(): Circuit 0 (id: 3) has been freed.
```
If we delay sending the padding from the relay, I cannot trigger the bug (see the commented-out fix in the Machine 2 function). With the code below, the bug triggers on every circuit that has the machine negotiated.
Code: https://github.com/pylls/tor/tree/circuit-padding-relay-padding-bug (https://github.com/pylls/tor/blob/circuit-padding-relay-padding-bug/src/core/or/circuitpadding_machines.c#L460, as well as circuitpadding_machines.h and registration in circpad_machines_init() of circuitpadding.c).
Mike Perry

https://gitlab.torproject.org/tpo/core/tor/-/issues/31632
hs-v3: Service doesn't re-upload descriptor on circuit failure
2021-06-23T17:19:23Z
David Goulet dgoulet@torproject.org

I'm observing, quite often actually, a service posting its descriptor to an HSDir but the circuit collapses due to remote reason `CHANNEL_CLOSED`.
This is possible for many reasons where a link between two relays failed/disconnected/closed/...
However, we do not retry the upload after that which means that we can end up with HSDir(s) without our descriptor even though we think they are there.
Solution is unclear but it appears that we probably want to hook this case into `hs_circ_cleanup()` which is called from the mark for close function.

https://gitlab.torproject.org/tpo/core/tor/-/issues/31364
tor_bug_occurred_(): Bug: ../src/feature/nodelist/microdesc.c:494: warn_if_nul_found: Non-fatal assertion !(nul_found) failed. (on Tor 0.4.0.5) [also on 0.4.5.8]
2023-07-16T12:12:44Z
Trac

```
Aug 07 11:00:15.000 [notice] Tor 0.4.0.5 opening log file.
Aug 07 11:00:15.393 [warn] OpenSSL version from headers does not match the version we're running with. If you get weird crashes, that might be why. (Compiled with 101000af: OpenSSL 1.1.0j 20 Nov 2018; running with 101000bf: OpenSSL 1.1.0k 28 May 2019).
Aug 07 11:00:15.395 [notice] Can't get entropy from getrandom(). You are running a version of Tor built to support getrandom(), but the kernel doesn't implement this function--probably because it is too old? Trying fallback method instead.
Aug 07 11:00:15.398 [notice] Tor 0.4.0.5 running on Linux with Libevent 2.0.21-stable, OpenSSL 1.1.0k, Zlib 1.2.11, Liblzma 5.2.4, and Libzstd 1.3.8.
Aug 07 11:00:15.398 [notice] Tor can't help you if you use it wrong! Learn how to be safe at https://www.torproject.org/download/download
Aug 07 11:00:15.398 [warn] Tor was compiled with zstd 1.1.2, but is running with zstd 1.3.8. For safety, we'll avoid using advanced zstd functionality.
Aug 07 11:00:15.399 [notice] Read configuration file "/usr/share/tor/tor-service-defaults-torrc".
Aug 07 11:00:15.399 [notice] Read configuration file "/etc/tor/torrc".
Aug 07 11:00:15.407 [notice] Based on detected system memory, MaxMemInQueues is set to 384 MB. You can override this by setting MaxMemInQueues by hand.
Aug 07 11:00:15.410 [notice] I think we have 40 CPUS, but only 1 of them are available. Telling Tor to only use 1. You can override this with the NumCPUs option
Aug 07 11:00:15.411 [notice] Opening OR listener on 0.0.0.0:443
Aug 07 11:00:15.411 [notice] Opened OR listener on 0.0.0.0:443
Aug 07 11:00:15.411 [notice] Opening OR listener on [***]:443
Aug 07 11:00:15.412 [notice] Opened OR listener on [***]:443
Aug 07 11:00:15.412 [notice] Opening Directory listener on 0.0.0.0:9030
Aug 07 11:00:15.412 [notice] Opened Directory listener on 0.0.0.0:9030
Aug 07 11:00:15.000 [notice] Not disabling debugger attaching for unprivileged users.
Aug 07 11:00:15.000 [warn] Found empty file "1037" in consensus cache; removing it.
Aug 07 11:00:15.000 [warn] Unable to map file (null) from consensus cache: No such file or directory
Aug 07 11:00:16.000 [notice] Parsing GEOIP IPv4 file /usr/share/tor/geoip.
Aug 07 11:00:17.000 [notice] Parsing GEOIP IPv6 file /usr/share/tor/geoip6.
Aug 07 11:00:17.000 [notice] Configured to measure statistics. Look for the *-stats files that will first be written to the data directory in 24 hours from now.
Aug 07 11:00:17.000 [notice] Your Tor server's identity key fingerprint is '***'
Aug 07 11:00:17.000 [notice] Bootstrapped 0% (starting): Starting
Aug 07 11:00:17.000 [warn] tor_bug_occurred_(): Bug: ../src/feature/nodelist/microdesc.c:494: warn_if_nul_found: Non-fatal assertion !(nul_found) failed. (on Tor 0.4.0.5)
Aug 07 11:00:17.000 [warn] Bug: Non-fatal assertion !(nul_found) failed in warn_if_nul_found at ../src/feature/nodelist/microdesc.c:494. Stack trace: (on Tor 0.4.0.5 )
Aug 07 11:00:17.000 [warn] Bug: /usr/bin/tor(log_backtrace_impl+0x47) [0x55e3b15b98e7] (on Tor 0.4.0.5 )
Aug 07 11:00:17.000 [warn] Bug: /usr/bin/tor(tor_bug_occurred_+0xc0) [0x55e3b15b4db0] (on Tor 0.4.0.5 )
Aug 07 11:00:17.000 [warn] Bug: /usr/bin/tor(+0x11e43f) [0x55e3b14d643f] (on Tor 0.4.0.5 )
Aug 07 11:00:17.000 [warn] Bug: /usr/bin/tor(microdesc_cache_reload+0xce) [0x55e3b14d89ee] (on Tor 0.4.0.5 )
Aug 07 11:00:17.000 [warn] Bug: /usr/bin/tor(get_microdesc_cache+0x48) [0x55e3b14d8ad8] (on Tor 0.4.0.5 )
Aug 07 11:00:17.000 [warn] Bug: /usr/bin/tor(nodelist_set_consensus+0x3fd) [0x55e3b14e4e3d] (on Tor 0.4.0.5 )
Aug 07 11:00:17.000 [warn] Bug: /usr/bin/tor(networkstatus_set_current_consensus+0x927) [0x55e3b14dd9d7] (on Tor 0.4.0.5 )
Aug 07 11:00:17.000 [warn] Bug: /usr/bin/tor(+0x125dfe) [0x55e3b14dddfe] (on Tor 0.4.0.5 )
Aug 07 11:00:17.000 [warn] Bug: /usr/bin/tor(router_reload_consensus_networkstatus+0x45) [0x55e3b14ddeb5] (on Tor 0.4.0.5 )
Aug 07 11:00:17.000 [warn] Bug: /usr/bin/tor(run_tor_main_loop+0xec) [0x55e3b141564c] (on Tor 0.4.0.5 )
Aug 07 11:00:17.000 [warn] Bug: /usr/bin/tor(tor_run_main+0x11e5) [0x55e3b1416b05] (on Tor 0.4.0.5 )
Aug 07 11:00:17.000 [warn] Bug: /usr/bin/tor(tor_main+0x3a) [0x55e3b1413c8a] (on Tor 0.4.0.5 )
Aug 07 11:00:17.000 [warn] Bug: /usr/bin/tor(main+0x19) [0x55e3b1413809] (on Tor 0.4.0.5 )
Aug 07 11:00:17.000 [warn] Bug: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1) [0x7f10090512e1] (on Tor 0.4.0.5 )
Aug 07 11:00:17.000 [warn] Bug: /usr/bin/tor(_start+0x2a) [0x55e3b141385a] (on Tor 0.4.0.5 )
Aug 07 11:00:17.000 [warn] warn_if_nul_found(): Bug: Found unexpected NUL while reading microdesc journal, offset 0at position 295945/331210. (on Tor 0.4.0.5 )
Aug 07 11:00:17.000 [warn] warn_if_nul_found(): Bug: surrounding string: 595630544A522B5159546D49496A300A00000000000000000000000000000000 (on Tor 0.4.0.5 )
Aug 07 11:00:17.000 [warn] parse error: internal NUL character.
Aug 07 11:00:17.000 [warn] Unparseable microdescriptor found in journal
Aug 07 11:00:27.000 [notice] Starting with guard context "default"
Aug 07 11:00:27.000 [notice] Signaled readiness to systemd
Aug 07 11:00:27.000 [notice] Bootstrapped 5% (conn): Connecting to a relay
```
Apart from that error message Tor seems to work fine.
The relay was running on a VM, and when I saw that error the whole VM behaved strangely. I think the ISP corrupted something, so maybe that error isn't Tor's fault.
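For readers who want to see what the warning in the log is reporting, here is a minimal sketch of a NUL scan over a buffer. It is not Tor's `warn_if_nul_found` implementation; `toy_find_nul` and `toy_sample` are hypothetical names. The sample bytes are the 32 bytes from the "surrounding string" hex dump in the log: 16 bytes of text ending in a newline, followed by 16 NUL bytes.

```c
#include <stddef.h>
#include <string.h>

/* Returns the offset of the first NUL byte in buf[0..len), or -1 if
 * the buffer contains no NUL byte. */
long toy_find_nul(const char *buf, size_t len) {
  const char *p = memchr(buf, 0, len);
  return p ? (long)(p - buf) : -1;
}

/* The 32 bytes from the hex dump in the log above: the 16 characters
 * "YV0TJR+QYTmIIj0\n" followed by zero padding up to the array size. */
const char toy_sample[32] = "YV0TJR+QYTmIIj0\n";
```

Here `toy_find_nul(toy_sample, sizeof(toy_sample))` returns 16, the first byte after the newline, which fits a journal that was padded or truncated with zeros rather than one containing a malformed descriptor.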
**Trac**:
**Username**: computer_freak
Tor: 0.4.7.x-freeze

https://gitlab.torproject.org/tpo/core/tor/-/issues/31340
Tor sometimes stalls traffic at 2s intervals (6s most common)
2022-06-24T16:03:24Z
Mike Perry

Dennis Jackson found some very suspicious performance issues while scanning the Tor network:
https://lists.torproject.org/pipermail/tor-scaling/2019-July/000063.html
The "6s delay" mystery there looks like some kind of relay bug. There are smaller bumps for delays at 2s multiples. The cause is still unknown. We should track this one down and fix it.
https://gitlab.torproject.org/tpo/core/tor/-/issues/31183
Situational symlink attacks on ControlPortWriteToFile etc.
2022-09-01T21:39:47Z
George Kadianakis
Here is a bug report from paldium on HackerOne. It's basically a very situational and restricted local privilege escalation against certain setups and threat models:
```
# Summary
It is possible to change permissions of files through tor or
gain write access to newly created ones. The target file can be
chosen by a local attacker if an adjusted configuration is used.
#How to reproduce
- Given is a FreeBSD or Mac OS X system.
- Tor is configured with a torrc file containing this line:
ControlPortWriteToFile /tmp/control.txt
- Optionally (race condition) this line is included as well:
ControlPortFileGroupReadable 1
- /tmp is a directory with sticky bit, i.e. drwxrwxrwt
##or
- Given is a Unix-like (Linux) system.
- Tor is configured with a torrc file containing this line:
ControlPortWriteToFile /tmp/proof/control.txt
- Optionally (race condition) this line is included as well:
ControlPortFileGroupReadable 1
- /tmp/proof is writable by the local attacker
When tor starts, then it will eventually write all available control
ports into file configured by "ControlPortWriteToFile". This file is not
written directly into, but a temporary file is used (the extension
".tmp" is added to file name). If this file already exists, then it is
simply truncated.
See src/lib/fs/files.c start_writing_to_file for details, especially
line 321 and following.
The problem is that an attacker can simply create the temporary file
before tor gets the chance to. On Mac this attack works by default
against /tmp, but Linux has a protection against symlink attacks on
directories with sticky bit like /tmp or /var/tmp. Therefore it takes an
unusual configuration on Linux or a different (regular) directory.
Let the attacker create a file which tor, even with dropped privileges,
can write to:
`attacker$ install -m 666 /dev/null /tmp/control.txt.tmp`
The tor process will use this file without adjusting the permissions,
because O_CREAT was not actually performed. Instead, O_TRUNC simply
truncated the file to remove existing content.
Afterwards tor will write content to this file, which the attacker could
simply override at this point. But there is no real need to, because
the target file will still keep the permissions of this temporary file
after rename.
See src/lib/fs/files.c replace_file for details. Basically it's a
simple rename() call which therefore removes the target inode and
renames the temporary file to the target one (in this setup).
At this point, the first attack is finished. The file /tmp/control.txt
is under control of the attacker. If "ControlPortFileGroupReadable 1"
is not given, this exploit code works on Linux systems as well (you
can skip the proof/ part on Mac due to lack of /tmp protection):
```In torrc: ControlPortWriteToFile /tmp/proof/control.txt
attacker$ install -Dm 666 /dev/null /tmp/proof/control.txt.tmp
root$ tor -f torrc
attacker$ ls -l /tmp/proof/control.txt
-rw-rw-rw- 1 attacker attacker 0 Jun 24 22:59 /tmp/proof/control.txt```
##Second attack
The second attack uses a race condition. Because /tmp/control.txt is
under control of the attacker, the file can be deleted and replaced with
a symbolic link to a target which we want to get group read permissions
for. This is a possible scenario if "ControlPortFileGroupReadable 1" is
given.
In this case, chmod() is called in tor process. See
src/feature/control/control.c control_ports_write_to_file for more
details, especially line 149 for more details (chmod).
The system call chmod() uses a file path as an argument. There is no
guarantee that the file path still refers to the file which has been
created by the tor process. Furthermore, chmod() follows symbolic
links, therefore the referenced target file is adjusted by chmod.
This is obviously a race condition but can be used to gain read access
to files which -- even by configuration -- should be private to the
tor user.
The attacker still needs group-read permissions, otherwise the newly
revealed files still cannot be accessed.
##Example
`tor just renamed controlled /tmp/control.txt.tmp to /tmp/control.txt`
`attacker$ rm /tmp/control.txt`
`attacker$ ln -s /var/lib/tor /tmp/control.txt`
`tor calls chmod on /tmp/control.txt and therefore on /var/lib/tor`
The attacker, if part of tor's group, can access /var/lib/tor now as well
(only read, but hierarchy is known through other tor installations)
#Solution
Use fchmod() on a file which has been opened with open() and O_NOFOLLOW
to prevent changing any files which have been reached through symbolic
links. Preventing symbolic links is already a great deal in file
src/lib/fs/dir.c check_private_dir, line 85. Could be implemented as
tor_chmod() to fix other chmod() calls in the code as well.
Use mkstemp() to atomically create a temporary file which has the
guaranteed permission 600 and owned by the tor user.
## Impact
A local attacker can modify the content of files which are considered trusted by dependent tools, e.g. the control port file.
Also a local attacker can extend privileges of files which are supposed to be private to the tor user and not readable by the group.
Timeline:
2019-06-25 10:24:59 +0000: @paldium (comment)
To clarify one aspect about attack 1: Of course it is not a tor issue if someone creates a world writable directory and lets tor create files in there. Everyone could simply override the resulting file and tor will never be able to prevent that. Therefore, the rename()-issue is a non-issue on Linux. But as a user I would expect that temporary files within /tmp with sticky bits are properly handled.
But attack 2 (race condition) should even be prevented in a world writable directory (or writable by a possibly malicious user). Therefore I consider attack 2 on Linux to be unlikely but plausible.
I would like to add a patch to this, but I am not sure how you want to handle file permissions.
The nicest approach would be to supply the permissions to the function which creates the file, but that would be a no-op on Windows.
Otherwise tor_chmod could open() the file and at least prevent symbolic link attacks. Yet, files could still be modified which are not the ones we created.
As this is a design decision, I haven't started writing a patch because I do not know which one you would prefer.
```
Here is some further information:
```
ControlPortWriteToFile is one example. The same attack scenarios are true for:
- ExtORPortCookieAuthFileGroupReadable 1
- CookieAuthFileGroupReadable 1
- DataDirectoryGroupReadable 1
- CacheDirectoryGroupReadable 1
- KeyDirectoryGroupReadable 1
```
Here is some of my analysis for attack scenarios:
```
Hello @paldium,
thanks for submitting this report.
Here are the best attacks we could find given the bugs you gave us:
Attack 1) The most realistic attack scenario I could think of is a system where the attacker is a local user who cannot establish outgoing connections, but is able to overwrite the ControlPortWriteToFile file and replace its contents with an attacker-controlled IP:PORT; a controller program then connects to the evil IP:PORT, thereby deanonymizing the user. This seems to be a very artificial scenario, which assumes a particular threat model, a tor with specific configuration parameters, and a very specific system.
Attack 2) The most realistic attack scenario here would be the attacker using this "read anything controlled by Tor" race condition attack to learn the private keys of an onion service that is on the same system but whose keys the attacker is otherwise not able to read... I'm actually not sure if that would work, but I think it's possible. This also assumes a particular threat model, a multi-user system, and specific configuration parameters.
@paldium, would it be possible to outline the various solutions we have for fixing this issue? I don't think specifying the permissions in the torrc is a nice thing from a UX perspective. Perhaps not following symlinks is a start but not the whole thing? Maybe we should just abort if there is a dangerous configuration? I wonder how prevalent this is.
```
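One of the options raised in the report, creating the temporary file with `mkstemp()` instead of a predictable `.tmp` name, could look roughly like the sketch below. The function name `write_file_via_mkstemp` and its `group_readable` flag are hypothetical and do not reflect Tor's `start_writing_to_file`; POSIX is assumed, so this says nothing about Windows.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not Tor's actual API: write `len` bytes to `path`
 * through a uniquely named temporary file in the same directory, then
 * rename it into place. The result has mode 0640 if `group_readable` is
 * nonzero, otherwise the 0600 that mkstemp() creates.
 * Returns 0 on success, -1 on failure with errno set. */
int
write_file_via_mkstemp(const char *path, const char *data, size_t len,
                       int group_readable)
{
  char tmp[4096];
  int saved = 0, ok = 1;
  size_t off = 0;
  int n = snprintf(tmp, sizeof(tmp), "%s.XXXXXX", path);
  if (n < 0 || (size_t)n >= sizeof(tmp)) {
    errno = ENAMETOOLONG;
    return -1;
  }
  /* mkstemp() creates the file exclusively, so a file or symlink that
   * was planted in advance is never reused. */
  int fd = mkstemp(tmp);
  if (fd < 0)
    return -1;

  while (ok && off < len) {
    ssize_t w = write(fd, data + off, len - off);
    if (w < 0) {
      if (errno == EINTR)
        continue;
      saved = errno;
      ok = 0;
    } else {
      off += (size_t)w;
    }
  }
  /* Set the final mode on the descriptor, before the file is visible
   * under its real name. */
  if (ok && group_readable && fchmod(fd, 0640) < 0) {
    saved = errno;
    ok = 0;
  }
  if (close(fd) < 0 && ok) {
    saved = errno;
    ok = 0;
  }
  if (ok && rename(tmp, path) < 0) {
    saved = errno;
    ok = 0;
  }
  if (!ok) {
    unlink(tmp);
    errno = saved;
    return -1;
  }
  return 0;
}
```

Because the mode is applied through the descriptor before the rename, no later path-based `chmod()` is needed, which is the property the second attack depends on. A directory that the attacker can write to remains unsafe for the final file in any case.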
And here are suggestions on patching:
```
Hi @asn,
I agree here, the attack is impossible against default setups and takes quite specific steps to be exploitable.
To fix this vulnerability at its root, I recommend to adjust the function `start_writing_to_file` in `src/lib/fs/files.c`. The system call `mkstemp` (included in POSIX) guarantees to create a unique file name for the (optional) temporary file. This way, an attacker cannot prepare a file before tor tries to create it. In case of conflict, mkstemp iterates through a huge pool of possible names and if all fail, it returns -1.
It must be checked if mkstemp is a viable option on all target systems, especially Windows. But it's POSIX, so it should work.
The attached patch performs these changes, but breaks a test which would be redundant then (it fails because it tries to create a temporary file beforehand, which `mkstemp` successfully prevents).
###Next improvement to consider (for attack 2):
As far as I understand the code, possible modes for files are:
- 0600 (default)
- 0640 (if configuration requests group-readable files, non-Windows systems only)
- 0400 (tor-gencert, which is not expected to run on Windows according to manual page)
If the special case 0400 is handled in tor-gencert directly, the functions in `src/lib/fs/files.c` can be further reduced in their feature set: Just add a "group readable" attribute to these functions and remove the explicit `mode` (if present):
- start_writing_to_stdio_file
- start_writing_to_file
- write_str_to_file
- write_bytes_to_file
All these functions would use 0600 by default and only support 0640 if the boolean "group readable" flag is set to true -- and that will only happen on non-Windows systems.
With these changes in place, the remaining `chmod` calls (except for the control socket) can be removed and that will also fix the second attack (and gives full atomic control to the newly introduced `fchmod` call):
###Before:
  if (write_str_to_file(options->ControlPortWriteToFile, joined, 0) < 0) {
    log_warn(LD_CONTROL, "Writing %s failed: %s",
             options->ControlPortWriteToFile, strerror(errno));
  }
#ifndef _WIN32
  if (options->ControlPortFileGroupReadable) {
    if (chmod(options->ControlPortWriteToFile, 0640)) {
      log_warn(LD_FS,"Unable to make %s group-readable.",
               options->ControlPortWriteToFile);
    }
  }
#endif /* !defined(_WIN32) */
###After:
  if (write_str_to_file(options->ControlPortWriteToFile,
                        options->ControlPortFileGroupReadable, joined, 0) < 0) {
    log_warn(LD_CONTROL, "Writing %s failed: %s",
             options->ControlPortWriteToFile, strerror(errno));
  }
```
https://gitlab.torproject.org/tpo/core/tor/-/issues/31022
Tor's windows "--service install" should warn if it installs on a global writeable path
2023-09-13T17:24:46Z
George Kadianakis
Seems like there is a platform-specific (Windows), configuration-specific (requires a multi-user setup and a specific install procedure) local root exploit on Windows, if "--service install" is used on the wrong directory level.
In the future we should warn if "--service install" is used insecurely, and we should provide installer wizards to do this right.
IMO this is a very unlikely issue so I assigned it to 042, but feel free to move it if you think otherwise.
Report inlined:
```
Title: When tor.exe is running as a Windows service, it may be subject to privilege escalation
Scope: None
Weakness: Privilege Escalation
Severity: Low
Link: https://hackerone.com/reports/602533
Date: 2019-06-06 18:17:39 +0000
By: @xiaoyinl
Details:
According to https://2019.www.torproject.org/docs/faq#NTService, you can run Tor as a Windows service. To install Tor as a service, you run `tor --service install`. However, the installed Tor service uses the same tor.exe image path as the service path. The Tor service runs under `NT authority\local service` account, so if an admin unzips tor.exe into a folder that is writable by non-admin users (e.g. C:\tor), then a malicious standard user can gain LocalService privilege by planting a malicious DLL into the folder where tor.exe is located.
To make things worse, it's common that admins unzip tor.exe into a nonadmin-writable directory, because if it's unzipped into one of the admins' user directories (like Downloads, Documents, etc.), then the service won't even run, because LocalService account has no access to admin's directories. Actually, the OP of https://trac.torproject.org/projects/tor/ticket/29345 "fixed" his problem by unzipping tor into C:\\:
> In fact, if you extract tor files in a Tor folder located in C:\ you probably won't have this problem of permissions
This unfortunately made him vulnerable to privilege escalation.
**Reproduce**:
1. download Tor from https://www.torproject.org/dist/torbrowser/8.5.1/tor-win32-0.3.5.8.zip
2. unzip it into C:\\tor-win32-0.3.5.8.
3. Open an admin command prompt, run C:\\tor-win32-0.3.5.8\\Tor\\tor.exe --service install
4. Log in as a standard Windows user, create a malicious iphlpapi.dll, and copy this file into C:\\tor-win32-0.3.5.8\\Tor\\
5. Restart your system. The malicious iphlpapi.dll should run.
**Fix**:
To fix this bug, when installed as a service, copy Tor's executable folder into a protected directory, like C:\\Program Files, or C:\\Windows. Then use the protected tor.exe as the service path.
## Impact
A malicious Windows local standard user can gain LocalService privilege. He can then deanonymize Tor traffic, and can interfere with other Windows services running under the LocalService account.
2019-06-07 10:04:29 +0000: @xiaoyinl (comment)
This report is about local privilege escalation. There is no social engineering involved. The attacker is a **local** non-administrator user, so the attacker can copy the malicious dll file to `C:\tor-win32-0.3.5.8\Tor\` himself. Then the attacker can have access to LocalService data files and Registry hives.
```
https://gitlab.torproject.org/tpo/core/tor/-/issues/30578
The circuitpadding_circuitsetup_machine test: Re-enable, remove, re-document, or ___?
2022-06-24T14:58:36Z
Nick Mathewson
Our code in `test_circuitpadding.c` says:
```
/** Disabled unstable test until #29298 is implemented (see #29122) */
// TEST_CIRCUITPADDING(circuitpadding_circuitsetup_machine, TT_FORK),
```
But both legacy/trac#29298 and legacy/trac#29122 are closed now.
If this test will work now, let's enable it. If it is no longer useful, let's remove it. If it is disabled for some reason other than the one that's described in the comment, let's adjust the comment.
https://gitlab.torproject.org/tpo/core/tor/-/issues/30545
must not *skip* obsolete options.
2022-03-01T16:08:40Z
weasel (Peter Palfrader)
Tor version 0.3.5.8 ignores DNSListenAddress.
While it logs
May 21 08:59:58 raven Tor-transparent[32235]: Skipping obsolete configuration option 'TransListenAddress'
May 21 08:59:58 raven Tor-transparent[32235]: Skipping obsolete configuration option 'DNSListenAddress'
it still starts, and then *binds* to 127.0.0.1:53, breaking all sorts of things.
If it finds obsolete configuration options, it really, really, really must not ignore them and do unwanted stuff.
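To illustrate the requested behaviour, here is a rough sketch of a check that fails hard on obsolete options instead of skipping them. Everything in it is hypothetical (`obsolete_options`, `option_is_obsolete`, `check_config_keys`); it is not how Tor's config parser is structured, and the list contains only the two option names from the log lines above.

```c
#include <stdio.h>
#include <strings.h>  /* strcasecmp */

/* Illustrative list only: the two names taken from the log lines above. */
static const char *const obsolete_options[] = {
  "TransListenAddress",
  "DNSListenAddress",
  NULL
};

/* Return 1 if `key` names an obsolete option (case-insensitive), else 0. */
int
option_is_obsolete(const char *key)
{
  for (size_t i = 0; obsolete_options[i] != NULL; ++i) {
    if (strcasecmp(key, obsolete_options[i]) == 0)
      return 1;
  }
  return 0;
}

/* Check all configured keys. Returns 0 if none is obsolete, or -1 if at
 * least one is, in which case the caller is expected to refuse to start
 * rather than continue with default listeners. */
int
check_config_keys(const char *const *keys, size_t n_keys)
{
  int bad = 0;
  for (size_t i = 0; i < n_keys; ++i) {
    if (option_is_obsolete(keys[i])) {
      fprintf(stderr,
              "Obsolete configuration option '%s' is set; refusing to start.\n",
              keys[i]);
      bad = 1;
    }
  }
  return bad ? -1 : 0;
}
```

Reporting every obsolete option before failing, rather than stopping at the first one, lets the operator fix the whole torrc in one pass.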