We have two conflicting notions of channel_is_client()

changed milestone to %Tor: 0.2.9.x-final

added 029-backport-maybe-with-21406 030-backport-maybe-with-21406 031-backport 032-backport component::core tor/tor milestone::Tor: 0.2.9.x-final parent::24902 points::0.5 priority::medium resolution::fixed reviewer::teor severity::normal status::closed tor-relay type::defect version::tor 0.3.1.1-alpha labels

Replying to arma:

full of false positives

I discovered this bug while working on the current circuit / conn overload patches, considering blacklisting client IP addresses that misbehave in certain ways. I found that my fast relay has a lot of conns that are marked as channel_is_client() that are very clearly connections to/from relays.

I assume the conns started off not marked as clients, and then the clause in command.c declared them as clients.

Right now, in channel_tls_process_netinfo_cell(), for the OR_CONN_STATE_OR_HANDSHAKING_V3 case, we call channel_mark_client() properly.

I propose that we call everything a client that handshakes with tor_tls_used_v1_handshake() or OR_CONN_STATE_OR_HANDSHAKING_V2, since relays should never use those legacy handshakes with other relays at this point.

And then we can happily remove the dangerous clause from command.c.

Sounds good to me.

(I wrote the first definition to fix an earlier issue where we used CREATE_FAST to mark clients. I wasn't aware of the second.)

This works fine for me in the latest chutney, that comment it outdated.

But please don't remove these lines, they are used by rep_hist:

  if (connection_or_digest_is_known_relay(chan->identity_digest)) {
    rep_hist_note_circuit_handshake_requested(create_cell->handshake_type);

I restored this code in my branch bug24898 on https://github.com/teor2345/tor.git

Now we just have to decide how far to backport it. If we want it to go to 0.3.0 or 0.2.9, we need to backport #21406 (moved) as well.

Trac:
Status: new to merge_ready
Points: N/A to 0.5
Version: N/A to Tor: 0.3.1.1-alpha
Reviewer: N/A to teor
Keywords: N/A deleted, 029-backport-maybe-with-21406, 031-backport, tor-relay, 030-backport-maybe-with-21406, 032-backport added

Comment written in parallel to teor's above comment:

I pushed a bug24898 branch that I think fixes everything.

It's on master though, and we might want to backport this. I'm not sure how far back to backport it -- we could go back to maint-0.3.1 pretty easily (there's a conflict but it's just because of a comment), but going back to 0.2.9 would be quite tricky because #21406 (moved) went into 0.3.1.

The question of whether to backport more depends on how much we want these fixes in 0.2.9 -- this one is a cornerstone of some of the DoS mitigations that might come next.

Great, I made a bug24898-2 branch that goes on master and includes teor's fix.

I also just made a bug24898-031 branch that goes on maint-0.3.1 if we want to merge it to there.

I am happy with the latest branch, let's merge it to master and consider back to 0.3.1.

I added #22805 (moved) as a child, because those changes are related as well. But 0.2.9 and 0.3.0 have been fine without any of this other code.

If we want to go back earlier than 0.3.1, let's only backport the parts of channel_tls_process_netinfo_cell() in #21406 (moved) that we need to get this working, and leave out everything else.

(This code sets chan->is_client = 0; directly in 0.3.0, and it does the rep_hist thing in another place, so we'll need another branch if we want to backport there. The good news is that the create cell bug was introduced in 0.3.1 with the netflow padding code, so we only need to backport the v1/2 fix to 0.3.0 and earlier.)

Commit f5ff9f23 in my bug24898-more branch has a bunch of fixes to replace connection_or_digest_is_known_relay() with channel_is_client().

Feel free to merge the commit straight-up, or feel free to strip it down for parts and take pieces of it for your own commits for all these related tickets.

Replying to arma:

Commit f5ff9f23 in my bug24898-more branch has a bunch of fixes to replace connection_or_digest_is_known_relay() with channel_is_client().

Feel free to merge the commit straight-up, or feel free to strip it down for parts and take pieces of it for your own commits for all these related tickets.

Let's deal with that in #23423 (moved), because all the details are there. It's not something we're going to backport.

If we want to backport arma's bug24898-031 to 0.2.9 and 0.3.0, we need to backport:

the channel_mark_client() in connection_or_set_state_open() from commit af8cadf3a9 in branch bug24898-031 in this ticket
the channel_mark_client() in channel_tls_process_netinfo_cell() from commit 46fe353f25 in master in #21406 (moved)

I am still happy with merging arma's bug24898-031 to 0.3.1 and arma's bug24898-2 to master.

This patch sometimes causes mixed chutney networks to ~~fail~~ log warnings. That's a bug in the unpatched version, not the patch.

Will this cause warnings on the real network too?

Why (did we introduce the create-cell check in the first place? ISTR that we thought we had a reason to do so. It looks like Mike introduced it in b0e92634d85a3bf7612a6ce0339b96e4aad1e0bb, and moved it in 02a5835c27780e45f705fc1c044b9c471b929dbe. Adding mike to cc.

We could skip the 0.3.0 backport, I think, since 0.3.0 is EOL at the end of this month. But we should do an 0.2.9 backport if we think this is important: 0.2.9 is supported till 2020.

Trac:
Cc: N/A to mikeperry

Trac:
Keywords: N/A deleted, review-group-30 added

Replying to nickm:

Will this cause warnings on the real network too?

Why (did we introduce the create-cell check in the first place? ISTR that we thought we had a reason to do so. It looks like Mike introduced it in b0e92634d85a3bf7612a6ce0339b96e4aad1e0bb, and moved it in 02a5835c27780e45f705fc1c044b9c471b929dbe. Adding mike to cc.

Mike's comments say that it led to chutney failures. I haven't observed any failures, but I have observed warnings in 0.3.2.9 when we apply this patch to master in a mixed network. That really is a bug in 0.3.1 and 0.3.2, because Mike's 0.3.1 fix is identifying clients incorrectly. There's nothing we can do about these warnings, except backport the fix.

We could skip the 0.3.0 backport, I think, since 0.3.0 is EOL at the end of this month. But we should do an 0.2.9 backport if we think this is important: 0.2.9 is supported till 2020.

It does appear to me that if we use the handshake patch from #21406 (moved) that we're going to be much better about identifying well-behaved clients correctly, and this should be better for the netflow code too (and not break it). I am a little worried about misbehaving clients - for those it feels smarter to check the digest in cases where that matters for DoS. But that should be a separate bug.

Thanks for this.

Edit: fix ticket number

I've merged this to 0.3.1 and forward. I'm marking it for a potential 0.2.9 backport.

Trac:
Milestone: Tor: 0.3.3.x-final to Tor: 0.2.9.x-final
Keywords: review-group-30 deleted, N/A added

I have made a bug24898-029 branch, designed for inclusion in 029 and 030. It will conflict with 031 because stuff is already fixed there, so you'll want to do the right version of 'merge -ours' after 030.

We will want this branch merged when we put the #24902 (moved) DoS patch into 029.

Trac:
Status: merge_ready to needs_review

This looks fine to me, I've been running a similar patch on 030 for a while now.

Do we still use CREATE_FAST to mark clients in 029? Or did we backport that fix as well?

Trac:
Parent: N/A to #24902 (moved)
Status: needs_review to merge_ready

Replying to teor:

Do we still use CREATE_FAST to mark clients in 029? Or did we backport that fix as well?

We still do it. That is, anybody who sends you a create-fast cell when you're running 029 or 030 now gets counted as a client, meaning the DoS defenses will be triggered against them (once #24902 (moved) gets backported), and meaning when we consider whether a channel can satisfy an extend request, this one can't.

Why, are we going to regret that? :) See also the discussions in #22805 (moved).

Replying to arma:

Replying to teor:

Do we still use CREATE_FAST to mark clients in 029? Or did we backport that fix as well?

We still do it. That is, anybody who sends you a create-fast cell when you're running 029 or 030 now gets counted as a client, meaning the DoS defenses will be triggered against them (once #24902 (moved) gets backported), and meaning when we consider whether a channel can satisfy an extend request, this one can't.

Why, are we going to regret that? :) See also the discussions in #22805 (moved).

I think we are going to regret that, and we should backport the parts of #22805 (moved) that remove the CREATE_FAST mark client.

We should also take them out on general principles, because it reduces the number of connections between relays.

Trac:
Status: merge_ready to needs_revision

Ok, I added a second commit to my bug24898-029 branch that backports that part from #22805 (moved). I think we are good to go now.

To be clear, that means commit 4bd208b wants to make it into 0.2.9 and 0.3.0, and its changes are already present in 0.3.1. Whereas commit 2076db4 wants to make it into 0.2.9 and 0.3.0 and 0.3.1, and its changes are already present in 0.3.2.

Sorry in advance for the mess Nick. I would just merge it all myself but I think I would screw up the "ours" aspects of the conflict handling. :)

Trac:
Status: needs_revision to merge_ready

Remove 030-backport from all open tickets that have it: 0.3.0 is now deprecated.

Is this still a must-backport-as-part-of-0.2.9 thing?

Replying to arma:

Ok, I added a second commit to my bug24898-029 branch that backports that part from #22805 (moved). I think we are good to go now.

To be clear, that means commit 4bd208b wants to make it into 0.2.9 and 0.3.0, and its changes are already present in 0.3.1. Whereas commit 2076db4 wants to make it into 0.2.9 and 0.3.0 and 0.3.1, and its changes are already present in 0.3.2.

Sorry in advance for the mess Nick. I would just merge it all myself but I think I would screw up the "ours" aspects of the conflict handling. :)

Yes, we must backport 4bd208b and 2076db4 to 0.2.9, and backport 2076db4 to 0.3.1, so that the DoS backport identifies clients correctly.

Everything has been backported it appears. Closing.

Trac:
Status: merge_ready to closed
Resolution: N/A to fixed

closed

changed time estimate to 4h

mentioned in issue #24902 (moved)

mentioned in issue #24904 (moved)

mentioned in issue #24905 (moved)

moved to tpo/core/tor#24898 (closed)

mentioned in issue tpo/core/tor#24904 (closed)

mentioned in issue tpo/core/tor#24905 (closed)

We have two conflicting notions of channel_is_client()

Child items ...

Activity