Tor issues
https://gitlab.torproject.org/tpo/core/tor/-/issues
2023-06-15T10:31:28Z

https://gitlab.torproject.org/tpo/core/tor/-/issues/40809
Bring padding machines into sync with Tobias's latest changes
2023-06-15T10:31:28Z
Mike Perry

Tobias has added some changes to the padding machines in his latest research: in particular, the padding machine can respond to queue length/congestion signals. I believe there are now also probabilistic transitions (i.e., https://gitlab.torproject.org/tpo/core/tor/-/issues/31636 or https://gitlab.torproject.org/tpo/core/tor/-/issues/31787).
I need to read his latest paper, sync with him, and discuss these things. This will generate new tickets.
Mike Perry

https://gitlab.torproject.org/tpo/core/tor/-/issues/40422
[CircuitPadding] circpad_add_matching_machines() should be called when a circuit has opened.
2023-06-09T13:26:45Z
Jaym

### Summary
The circuit padding framework supports negotiating padding upon various events. Among them, CIRCPAD_CIRC_OPENED states that a given padding machine should be applied to a circuit when a circuit has opened.
However, no code seems to trigger this mechanism. When a circuit has been built, the function circpad_machine_event_circ_built() is called and checks whether any machine should be removed from or added to the circuit. At this stage of the circuit building process, however, the circuit has been built but is not yet marked as open.
### Bug
If a machine uses `client_machine->conditions.apply_state_mask = CIRCPAD_CIRC_OPENED;`, the machine is only applied when an event other than circuit building/opening triggers the function circpad_add_matching_machines() (e.g., an AP connection links a stream, or the circuit purpose changes from general to something else).
### What is the expected behavior?
When circuituse.c calls circuit_has_opened(), it should also call into the circpad module, e.g., through a new function circpad_machine_event_circ_opened() that checks whether machines should be added to the circuit.
### Environment
Running a version forked from 0.4.5.7
### Relevant logs and/or screenshots
The logs below show a call to circpad_machine_event_circ_built() while the circuit is still marked as building. They also contain custom log lines:
```
Jun 30 11:23:50.000 [info] circuit_finish_handshake(): Finished building circuit hop:
Jun 30 11:23:50.000 [info] internal (high-uptime) circ (length 3, last hop test000a): $22BA781A60C0CBB7FFAEA8858128427F67F60038(open) $7684DE04DCBB44538554E2CD1D14CDF836D5AF4D(open) $C7ADB1DBCE99F0B2ED2812B1953E4986EE9846DB(open)
Jun 30 11:23:50.000 [debug] dispatch_send_msg_unchecked(): Queued: ocirc_cevent (<gid=7 evtype=2 reason=0 onehop=0>) from or, on ocirc.
Jun 30 11:23:50.000 [debug] dispatcher_run_msg_cbs(): Delivering: ocirc_cevent (<gid=7 evtype=2 reason=0 onehop=0>) from or, on ocirc:
Jun 30 11:23:50.000 [debug] dispatcher_run_msg_cbs(): Delivering to btrack.
Jun 30 11:23:50.000 [debug] btc_cevent_rcvr(): CIRC gid=7 evtype=2 reason=0 onehop=0
Jun 30 11:23:50.000 [debug] circuit_build_times_add_time(): Adding circuit build time 43
Jun 30 11:23:50.000 [debug] circpad_machine_conditions_apply(): Checking circuit purpose, 5
Jun 30 11:23:50.000 [debug] circpad_machine_conditions_apply(): Checking condition state mask 21 vs condition: 2
Jun 30 11:23:50.000 [debug] circpad_machine_conditions_apply(): Checking circuit purpose, 5
Jun 30 11:23:50.000 [debug] circpad_machine_conditions_apply(): Checking circuit purpose, 5
Jun 30 11:23:50.000 [debug] circpad_machine_event_circ_built(): Circpad module event circ built -- circ state: 0
Jun 30 11:23:50.000 [debug] circpad_machine_conditions_apply(): Checking circuit purpose, 5
Jun 30 11:23:50.000 [debug] circpad_machine_conditions_apply(): Checking condition state mask 21 vs condition: 2
Jun 30 11:23:50.000 [debug] circpad_machine_conditions_apply(): Checking circuit purpose, 5
Jun 30 11:23:50.000 [debug] circpad_machine_conditions_apply(): Checking circuit purpose, 5
Jun 30 11:23:50.000 [debug] invoke_plugin_operation_or_default(): Plugin found for caller calling a plugin in the circpad module when a circuit has built
Jun 30 11:23:50.000 [info] circpad_dropmark_activate_when_built(): Looks like the client_dropmark_def machine does not exist over this circuit
Jun 30 11:23:50.000 [debug] plugin_run(): Plugin execution returned -2147483648
Jun 30 11:23:50.000 [debug] plugin_run(): vm error message: (null)
Jun 30 11:23:50.000 [info] entry_guards_note_guard_success(): Recorded success for primary confirmed guard test002r ($22BA781A60C0CBB7FFAEA8858128427F67F60038)
Jun 30 11:23:50.000 [debug] dispatch_send_msg_unchecked(): Queued: ocirc_state (<gid=7 state=4 onehop=0>) from or, on ocirc.
Jun 30 11:23:50.000 [debug] dispatcher_run_msg_cbs(): Delivering: ocirc_state (<gid=7 state=4 onehop=0>) from or, on ocirc:
Jun 30 11:23:50.000 [debug] dispatcher_run_msg_cbs(): Delivering to btrack.
Jun 30 11:23:50.000 [debug] btc_state_rcvr(): CIRC gid=7 state=4 onehop=0
Jun 30 11:23:50.000 [info] circuit_build_no_more_hops(): circuit built!
Jun 30 11:23:50.000 [info] pathbias_count_build_success(): Got success count 3.000000/3.000000 for guard test002r ($22BA781A60C0CBB7FFAEA8858128427F67F60038)
Jun 30 11:23:50.000 [debug] circuit_has_opened(): calling circuit_has_opened()
```
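The repeated log line `Checking condition state mask 21 vs condition: 2` can be read as a bitmask test. The sketch below is a simplified model, not the code in circuitpadding.c: the `SKETCH_*` names are hypothetical, and the bit values are an assumption about the layout of the `CIRCPAD_CIRC_*` flags that should be checked against the tree being run. Under that assumption, 21 decodes to "building, no streams, has RELAY_EARLY" and 2 is the "opened" bit, so the test fails at the moment circpad_machine_event_circ_built() runs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the CIRCPAD_CIRC_* state flags.
 * The bit positions are an assumption made for illustration. */
enum {
  SKETCH_CIRC_BUILDING           = 1 << 0,
  SKETCH_CIRC_OPENED             = 1 << 1,
  SKETCH_CIRC_NO_STREAMS         = 1 << 2,
  SKETCH_CIRC_STREAMS            = 1 << 3,
  SKETCH_CIRC_HAS_RELAY_EARLY    = 1 << 4,
  SKETCH_CIRC_HAS_NO_RELAY_EARLY = 1 << 5
};

/* Return 1 if the circuit state shares at least one bit with the
 * machine's apply_state_mask, 0 otherwise. */
int
sketch_state_mask_applies(uint32_t circ_state, uint32_t apply_state_mask)
{
  /* A machine with an empty mask would never match; treat that as a bug. */
  assert(apply_state_mask != 0);
  return (circ_state & apply_state_mask) != 0;
}
```

With these values, `sketch_state_mask_applies(21, SKETCH_CIRC_OPENED)` is 0, which matches the report: the machine is skipped at build time, and nothing re-runs the check once the circuit is marked open.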
### Possible fixes
Add a new function circpad_machine_event_circ_opened() called from circuituse.c when the circuit has opened.
Tor: 0.4.8.x-freeze
Mike Perry

https://gitlab.torproject.org/tpo/core/tor/-/issues/40277
Circuit Padding attack scenario through circpad_global_max_padding_pct
2023-06-09T13:26:55Z
Ghost User

Hello *,
I am from [RadicallyOpenSecurity](https://www.radicallyopensecurity.com/) and currently performing a short review of the [Padding Machines for Tor](https://nlnet.nl/project/TorPaddingMachines/) NGI Zero PET project.
During discussion with the padding machine author Tobias Pulls (@pulls), I noticed a general attack scenario that could be of relevance for the padding framework implementation.
The general idea of the attack is that a malicious Tor client can, under some circumstances, repeatedly force the circuit padding logic of middle relays to reply with more padding bytes than non-padding bytes. After some time of sustained attacks, the target middle relay will develop a statistical circuit padding overhead percentage that is above the network-wide `circpad_global_max_padding_pct` limit (see [section 3.5](https://gitweb.torproject.org/tor.git/tree/doc/HACKING/CircuitPaddingDevelopment.md?id=33380f6b2770a3c4d9e2a9cc8a4b18c71a40571b#n614) of the documentation), at which point it will stop sending padding on any of its circuits.
Once the attack stops, the ratio between sent padding and sent non-padding of the relay goes back to normal over time while the relay serves non-padded cell data to legitimate clients, so it reverts automatically to a safe state again where it has working padding machines.
However, there are several notable details:
* To our knowledge, the currently rolled out padding machines cannot be used to trigger this problem. This is likely only a problem for future machines with high volume that defend against website fingerprinting attacks.
* Due to the event-based nature of client padding machines, the suppression of padding generation at the middle relay means that the `CIRCPAD_EVENT_PADDING_RECV` transition on the client is never taken. This results in the client not sending any padding, or performing only a very limited subset of the regular padding behavior (depending on the machine definition). This amplifies the practical impact of the missing padding from the relay.
* Since the relay behavior change affects all of its circuits at once and may happen several times over a short time-span, the resulting traffic anomalies may be a fingerprinting opportunity for adversaries that are performing traffic analysis on the network.
* According to @pulls, the consequences of this attack on the relay server would only be logged at `info` level and may go unnoticed by relay operators.
* There is a threshold mechanism at the individual circuit machine level through the `relay->max_padding_percent` setting. At first glance, this appears to mitigate the described attack scenario. However, note that this limit is not applied to the first `relay->allowed_padding_count` number of cells, so an attacker can circumvent this limitation by repeatedly triggering new padding machines. Since it appears to be beneficial for padding machines to have this initial level of freedom over padding for their effectiveness against fingerprinting, the attack is unlikely to be mitigated with restrictive machine limits without also impacting the usefulness of the machines themselves.
* The attack can also be performed in the other direction by a malicious middle relay against clients. In this case, it would disable/degrade padding on all of the client circuits. However, we regard this variant as less impactful than client->relay attacks, in part because this attack is harder to scale.
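The interaction between the per-machine grace count and the shared global counters can be sketched as a toy model. Everything below is an assumption made for illustration: the `sketch_*` names are hypothetical, the percentage test is a simplified reading of the limits described above, and the traffic pattern (each machine sending padding after a single non-padding cell) is invented. It is not the logic in circuitpadding.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy counters; used both for a single machine and for the shared
 * relay-wide state. */
typedef struct {
  uint64_t padding_sent;
  uint64_t nonpadding_sent;
} sketch_counters_t;

/* Return 1 when padding would be suppressed: the grace count has been
 * used up and the padding share exceeds max_pct. A max_pct of 0 is
 * treated as "limit disabled" (an assumption of this sketch). */
int
sketch_reached_limit(const sketch_counters_t *c, uint64_t allowed_cells,
                     uint8_t max_pct)
{
  assert(c != NULL);
  uint64_t total = c->padding_sent + c->nonpadding_sent;
  if (max_pct == 0 || total == 0)
    return 0;
  if (c->padding_sent < allowed_cells)
    return 0; /* still inside the grace window */
  return (c->padding_sent * 100) / total > max_pct;
}

/* Simulate `machines` fresh machines that each stay one cell below their
 * own grace count. Returns how many of them tripped their own limit,
 * and adds all of their traffic to the shared counters. */
int
sketch_run_attack(sketch_counters_t *global, unsigned machines,
                  uint64_t allowed_per_machine, uint8_t machine_max_pct)
{
  assert(global != NULL && allowed_per_machine >= 1);
  int tripped = 0;
  for (unsigned i = 0; i < machines; i++) {
    sketch_counters_t m = { allowed_per_machine - 1, 1 };
    tripped += sketch_reached_limit(&m, allowed_per_machine, machine_max_pct);
    global->padding_sent += m.padding_sent;
    global->nonpadding_sent += m.nonpadding_sent;
  }
  return tripped;
}
```

In this model, 100 machines with a grace count of 100 never trip their own limit, yet they push shared counters that started at 0 padding and 1000 non-padding cells to a 90% padding share. A global limit of 50% would then suppress padding for every circuit, while the same check applied to one honest circuit's own counters would not.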
Some initial recommendations:
* Higher severity logging when exceeding `circpad_global_max_padding_pct`.
* The consensus-provided global limits could be enforced *per circuit* instead of truly globally. The root of the problem is shared state across circuits. This would still make the global padding limits useful, since in the future there might be multiple machines active throughout the lifetime of a circuit.
* Based on the information from @pulls, the `CircuitPaddingDisabled` switch (see issue [28693](https://gitlab.torproject.org/legacy/trac/-/issues/28693)) might be sufficient as an emergency fail-safe in case of network-wide padding problems.
I would like to thank @pulls for walking me through the various aspects of related network behavior and providing other helpful suggestions.
Mike Perry

https://gitlab.torproject.org/tpo/core/tor/-/issues/40230
If relays have the Exit flag, don't give them the Guard flag
2023-04-12T15:23:08Z
Roger Dingledine

Motivated by this tor-relays@ thread:
https://lists.torproject.org/pipermail/tor-relays/2020-October/019014.html
In short, there's an edge case in our "Bandwidth-weight Case 3a (E scarce)" where exit capacity is scarce but not that scarce, so Wgd ("the weight for choosing a Guard+Exit for the guard position") can become non-zero by a little bit.
What this means in practice is that when we hit that edge case, relays that have both the Guard flag and the Exit flag will be used just a tiny bit as guards by clients. But because of how clients actually use guards (all or nothing, rather than spreading out the load over all the guards), the reality for a relay with a tiny guard weight is that it will accumulate only a handful of clients.
And that situation is extra-scary for those few unlucky clients -- their "guard fingerprint" is a much more distinguishable feature for them because so few other people have it, and also when a guard has one-ish active user, middle relays can know that circuits coming from that guard are likely to be that user.
All in all, it seems much safer to eliminate this situation: dedicate Exits to being exits (or they can be middles too if there's enough exit capacity), and dedicate Guards to being entries (or they can be middles too if there's enough guard capacity).
I hear that @mikeperry wants this feature too, for padding efficiency, because having low-use guards but still needing to pad at them is a suboptimal use of our network bandwidth.
Now, how exactly we should implement this feature remains to be chosen. I think all of the good ways involve doing it at the dir auths (i.e. clients and relays shouldn't have to care how we do it or even *that* we did it). One option (super easy to do) is that if we're giving it the Exit flag, we withhold the Guard flag. Another option (requires a new consensus method, and also a Mike) is that we redo the consensus weighting to handle case 3a ("exit scarce") differently. Are there other approaches that would accomplish the goal?
There are some edge cases to consider here, for example, in a test network that is all Exits, if we therefore never give out the Guard flag, things might get bad. We could handle that either by having this setting be a torrc option that gets overwritten when TestingTorNetwork, or ...maybe Mike's plan of rewriting the linear equations will handle that edge case automatically?

https://gitlab.torproject.org/tpo/core/tor/-/issues/31788
Circuit padding trace simulator
2022-09-01T21:42:49Z
Mike Perry

This is the parent ticket for the pieces of work required to make it possible to use circuitpadding.c in a trace simulator outside of Tor, so that defenses could be re-applied to crawl traces quickly without needing to re-crawl a set of sites.
An alternate way to do this, instead of extracting this code, is to make use of our unit testing framework and build the tracer as a unit test. We have mechanisms to mock the networking functions so that they output new traces, and then we can use the unit-test style to read in a trace file and output a new one, instead of performing a test.

https://gitlab.torproject.org/tpo/core/tor/-/issues/31787
Full HMM support for circuit padding state transition
2023-06-08T18:27:55Z
Mike Perry

The circuit padding state transition system is quasi-deterministic. A non-deterministic state machine can be built from this using many states through clever use of the infinity bin, but this will lead to excessively complicated machines.
One can imagine a non-deterministic state machine/HMM that simulates an entire web browsing session on a circuit. If this kind of machine seems useful, it may be better simply to transform the circpad_state_t.next_state transition field into an array of structs that specifies the next state as well as a probability for transitioning to that state for that event.
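As a rough illustration of the "array of structs" idea, the sketch below models the transitions for a single event as a short list of (next state, probability) pairs and picks one from a uniform draw. All names are hypothetical and this is not a proposed patch: a real change would have to fit the existing circpad_state_t layout and Tor's random number helpers. Here the caller supplies the uniform value `u`, and a return of -1 stands for "no transition taken".

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_TRANSITIONS 4

/* One candidate transition: target state plus its probability. */
typedef struct {
  int next_state;
  double probability;
} sketch_transition_t;

/* All candidate transitions for one event in one state. */
typedef struct {
  size_t n;
  sketch_transition_t t[SKETCH_MAX_TRANSITIONS];
} sketch_event_transitions_t;

/* Pick the next state for a uniform draw u in [0, 1) by walking the
 * cumulative distribution. Returns -1 if u falls in the leftover
 * probability mass, i.e. the machine stays where it is. */
int
sketch_pick_next_state(const sketch_event_transitions_t *ev, double u)
{
  assert(ev != NULL && ev->n <= SKETCH_MAX_TRANSITIONS);
  assert(u >= 0.0 && u < 1.0);
  double cumulative = 0.0;
  for (size_t i = 0; i < ev->n; i++) {
    cumulative += ev->t[i].probability;
    if (u < cumulative)
      return ev->t[i].next_state;
  }
  return -1;
}
```

Leaving the probabilities free to sum to less than 1 keeps the current deterministic behavior expressible as a single entry with probability 1, which is one possible design choice among several.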
We would accept patches for this modification if it seems useful.

https://gitlab.torproject.org/tpo/core/tor/-/issues/31653
Padding cells sent with 0ms delay cause circuit failures
2023-06-09T13:27:09Z
pulls

Below appears to be a bug that results in a closed circuit due to a relay sending a padding cell (RELAY_COMMAND_DROP) to the client.
At the links below you will find code for two circuit padding machines:
1. circpad_machine_client_close_circuit_minimal
2. circpad_machine_relay_close_circuit_minimal
Machine 1 runs on a client on CIRCUIT_PURPOSE_C_GENERAL circuits with 3 hops as soon as CIRCPAD_CIRC_OPENED: the only thing it does is set a timer 1000s in the future and waits for the timer to expire. The purpose of the machine is to negotiate padding with the relay to activate Machine 2 on the middle relay.
Machine 2 runs at the middle relay and repeatedly sends a padding cell to the client 1 usec after the event CIRCPAD_EVENT_NONPADDING_SENT. In other words, for every non-padding cell we directly add a padding cell. At the client, this causes `relay_decrypt_cell(): Incoming cell at client not recognized. Closing.` for all circuits triggering Machine 2 at the relay. Closing a circuit by injecting padding cells seems unintended.
Here is part of a log from a client on info level showing how the machine registers, negotiates with the relay (starting Machine 2 at the relay), eventually fails to decrypt, and the circuit closes.
```
Sep 05 10:32:19.000 [info] circpad_setup_machine_on_circ(): Registering machine client_close_circuit_minimal to origin circ 3 (5)
Sep 05 10:32:19.000 [info] circpad_node_supports_padding(): Checking padding: supported
Sep 05 10:32:19.000 [info] circpad_negotiate_padding(): Negotiating padding on circuit 3 (5), command 2
Sep 05 10:32:19.000 [info] link_apconn_to_circ(): Looks like completed circuit to [scrubbed] does allow optimistic data for connection to [scrubbed]
Sep 05 10:32:19.000 [info] connection_ap_handshake_send_begin(): Sending relay cell 0 on circ 3296464733 to begin stream 20575.
Sep 05 10:32:19.000 [info] connection_ap_handshake_send_begin(): Address/port sent, ap socket 13, n_circ_id 3296464733
Sep 05 10:32:19.000 [info] rep_hist_note_used_port(): New port prediction added. Will continue predictive circ building for 2355 more seconds.
Sep 05 10:32:19.000 [info] connection_edge_process_inbuf(): data from edge while in 'waiting for circuit' state. Leaving it on buffer.
Sep 05 10:32:19.000 [info] exit circ (length 3): $[manually-scrubbed](open) $[manually-scrubbed](open) $[manually-scrubbed](open)
Sep 05 10:32:19.000 [info] pathbias_count_use_attempt(): Used circuit 3 is already in path state use attempted. Circuit is a General-purpose client currently open.
Sep 05 10:32:19.000 [info] link_apconn_to_circ(): Looks like completed circuit to [scrubbed] does allow optimistic data for connection to [scrubbed]
Sep 05 10:32:19.000 [info] connection_ap_handshake_send_begin(): Sending relay cell 0 on circ 3296464733 to begin stream 20576.
Sep 05 10:32:19.000 [info] connection_ap_handshake_send_begin(): Address/port sent, ap socket 14, n_circ_id 3296464733
Sep 05 10:32:19.000 [info] circpad_deliver_recognized_relay_cell_events(): Got padding cell on origin circuit 3.
Sep 05 10:32:20.000 [info] relay_decrypt_cell(): Incoming cell at client not recognized. Closing.
Sep 05 10:32:20.000 [info] circuit_receive_relay_cell(): relay crypt failed. Dropping connection.
Sep 05 10:32:20.000 [info] command_process_relay_cell(): circuit_receive_relay_cell (backward) failed. Closing.
Sep 05 10:32:20.000 [info] circpad_circuit_machineinfo_free_idx(): Freeing padding info idx 0 on circuit 3 (23)
Sep 05 10:32:20.000 [info] circpad_node_supports_padding(): Checking padding: supported
Sep 05 10:32:20.000 [info] circpad_negotiate_padding(): Negotiating padding on circuit 3 (23), command 1
Sep 05 10:32:20.000 [info] pathbias_send_usable_probe(): Sending pathbias testing cell to 0.233.9.53:25 on stream 20577 for circ 3.
Sep 05 10:32:20.000 [info] relay_decrypt_cell(): Incoming cell at client not recognized. Closing.
Sep 05 10:32:20.000 [info] circuit_receive_relay_cell(): relay crypt failed. Dropping connection.
Sep 05 10:32:20.000 [info] command_process_relay_cell(): circuit_receive_relay_cell (backward) failed. Closing.
Sep 05 10:32:20.000 [info] pathbias_send_usable_probe(): Got pathbias probe request for circuit 3 with outstanding probe
Sep 05 10:32:20.000 [info] pathbias_check_close(): Circuit 3 closed without successful use for reason 2. Circuit purpose 23 currently 1,open. Len 3.
Sep 05 10:32:20.000 [info] circuit_mark_for_close_(): Circuit 3296464733 (id: 3) marked for close at src/core/or/command.c:582 (orig reason: 2, new reason: 0)
Sep 05 10:32:20.000 [info] connection_edge_destroy(): CircID 0: At an edge. Marking connection for close.
Sep 05 10:32:20.000 [info] connection_edge_destroy(): CircID 0: At an edge. Marking connection for close.
Sep 05 10:32:20.000 [info] circuit_free_(): Circuit 0 (id: 3) has been freed.
```
If we delay sending the padding from the relay, I cannot trigger the bug (see the commented-out fix in the Machine 2 function). With the code below, the bug triggers on every circuit that has the machine negotiated.
Code: https://github.com/pylls/tor/tree/circuit-padding-relay-padding-bug (https://github.com/pylls/tor/blob/circuit-padding-relay-padding-bug/src/core/or/circuitpadding_machines.c#L460, as well as circuitpadding_machines.h and registration in circpad_machines_init() of circuitpadding.c).
Mike Perry

https://gitlab.torproject.org/tpo/core/tor/-/issues/31636
Circuit padding: Add meta probability distribution type
2023-06-08T18:27:55Z
Mike Perry

Tobias Pulls pointed out that his APE system actually randomized the *parameters* of the probability distributions that he used, for each circuit.
If this is indeed helpful (it probably is), we should find out which form or forms of per-circuit parameter randomization are best to use, and get them supported by the circuit padding framework.

https://gitlab.torproject.org/tpo/core/tor/-/issues/30578
The circuitpadding_circuitsetup_machine test: Re-enable, remove, re-document, or ___?
2022-06-24T14:58:36Z
Nick Mathewson

Our code in `test_circuitpadding.c` says:
```
/** Disabled unstable test until #29298 is implemented (see #29122) */
// TEST_CIRCUITPADDING(circuitpadding_circuitsetup_machine, TT_FORK),
```
But both legacy/trac#29298 and legacy/trac#29122 are closed now.
If this test will work now, let's enable it. If it is no longer useful, let's remove it. If it is disabled for some reason other than the one that's described in the comment, let's adjust the comment.

https://gitlab.torproject.org/tpo/core/tor/-/issues/29098
Load balance properly in the presence of padding overhead
2022-06-24T14:19:31Z
Mike Perry

WTF-PAD needs new load balancing equations that take our expected overhead from padding into account:
https://gitweb.torproject.org/torspec.git/tree/proposals/265-load-balancing-with-overhead.txt

https://gitlab.torproject.org/tpo/core/tor/-/issues/29084
Ensure circuit padding RTT estimate handles var cells/wide creates
2023-06-15T10:31:27Z
George Kadianakis

The use_rtt_estimate field in the circuit padding machines lets machines offset the inter-packet delays by a middle-node estimated RTT value of packets that go from the middle to the exit/website.
We abort this measurement if we get two or more cells back-to-back in either direction, as this indicates that the half-duplex request/response circuit setup and RELAY_BEGIN sequence has finished.
However, if we switch to a multi-cell circuit handshake, then we will need to take that into account when measuring RTT.
If RELAY_EARLY is used only for the first cell of a multi-cell EXTEND2 payload,
then we can just count the time between RELAY_EARLY cells. However, the proposal currently says MAY, not MUST, for this behavior.
Mike Perry

https://gitlab.torproject.org/tpo/core/tor/-/issues/29083
WTF-PAD: Specify exit policy for machine conditions
2022-09-01T21:29:27Z
George Kadianakis

From the TODO file:
```
- Specify exit policy for machine conditions?
- short_policy_t looks good, except for its flexible array member :/
- Can we make our own struct with a small, fixed number of policy
entries? Say 3-4? Or is that a bad idea to lose this flexibility?
- Check conditions based on attached streams on the circuit
- Accept should mean "only apply if matched"
- Reject should mean "don't apply if matched"
- If a policy is specified, Reject *:* is implicit default (so reject
policies need an Accept entry).
- With no policy, Accept *:* is implicit default.
```
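The accept/reject semantics in the TODO notes can be sketched with a small fixed-size policy struct, as the notes suggest. This is only an illustration under stated assumptions: all names are hypothetical, entries match on port ranges only (addresses are left out), and "first matching entry wins" is assumed rather than taken from the notes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_POLICY_ENTRIES 4

/* One policy entry covering an inclusive port range. */
typedef struct {
  int is_accept;      /* 1 = accept, 0 = reject */
  uint16_t min_port;
  uint16_t max_port;
} sketch_policy_entry_t;

/* Fixed-size policy; n_entries == 0 means "no policy specified". */
typedef struct {
  size_t n_entries;
  sketch_policy_entry_t entries[SKETCH_MAX_POLICY_ENTRIES];
} sketch_machine_policy_t;

/* Return 1 if the machine may apply to a stream going to `port`. */
int
sketch_policy_allows(const sketch_machine_policy_t *p, uint16_t port)
{
  assert(p != NULL && p->n_entries <= SKETCH_MAX_POLICY_ENTRIES);
  if (p->n_entries == 0)
    return 1; /* no policy: implicit accept *:* */
  for (size_t i = 0; i < p->n_entries; i++) {
    const sketch_policy_entry_t *e = &p->entries[i];
    if (port >= e->min_port && port <= e->max_port)
      return e->is_accept; /* first match wins (assumption) */
  }
  return 0; /* policy present but nothing matched: implicit reject *:* */
}
```

Under this model a policy holding only a reject entry for port 80 rejects every port, which is why the notes say reject policies need an accept entry.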