Circuit Build Timeout needs to count hidden service circuits

changed milestone to %Tor: 0.3.3.x-final in legacy/trac

added component::core tor/tor in Legacy / Trac guard-discovery-prop247-controller in Legacy / Trac mike-can in Legacy / Trac milestone::Tor: 0.3.3.x-final in Legacy / Trac needs-proposal in Legacy / Trac owner::mikeperry in Legacy / Trac parent::9001 in Legacy / Trac path-bias in Legacy / Trac priority::medium in Legacy / Trac prop247 in Legacy / Trac resolution::fixed in Legacy / Trac review-group-25 in Legacy / Trac review-group-26 in Legacy / Trac review-group-27 in Legacy / Trac reviewer::asn in Legacy / Trac severity::normal in Legacy / Trac status::closed in Legacy / Trac tor-guard in Legacy / Trac tor-hs in Legacy / Trac type::enhancement in Legacy / Trac version::tor 0.2.7 in Legacy / Trac labels

Ok, I have an initial patch for this in mikeperry/prop247_torrc-rebased (d730d6337d24f20980469e6bb331b093afec643b). However, along the way I also noticed legacy/trac#23114 (moved), which we should also fix so that the results from the legacy/trac#13837 (moved) experiments make sense.

This should have been in needs_review a while back!

Trac:
Status: new to needs_review

Thanks for the patch! This seems important! A few comments:

I don't understand how the mods to circuit_timeout_want_to_count_circ() helps us achieve the goal of this ticket. It kinda seems like we are restricting that function even more (doc change implies that too), when we actually want it to be more accepting. I think a nice comment is required for people like me to understand how that function works wrt HS circs.
I also don't understand why the circuit_build_times_decide_to_count_circ() logic was moved from one function to another. I think we need some help here to understand, since I'm not familiar with the CBT system at all.
I think circuit_get_cpath_len() should be moded instead of adding another func circuit_get_cpath_opened_len(). We can add an arg only_count_opened which does that. I think that'd be cleaner.
circuit_build_times_decide_to_count_circ() is undocumented and I'd actually like to know what it does.

Trac:
Status: needs_review to needs_revision

setting owner

Trac:
Owner: N/A to mikeperry
Status: needs_revision to assigned

Trac:
Status: assigned to needs_revision

Trac:
Keywords: guard-discovery deleted, guard-discovery-prop247-controller added
Milestone: Tor: 0.3.2.x-final to Tor: 0.3.3.x-final

Replying to asn:

Thanks for the patch! This seems important! A few comments:

I don't understand how the mods to circuit_timeout_want_to_count_circ() helps us achieve the goal of this ticket. It kinda seems like we are restricting that function even more (doc change implies that too), when we actually want it to be more accepting. I think a nice comment is required for people like me to understand how that function works wrt HS circs.

Ok, I will add to the comment to explain this. Basically we're not making it more restrictive. We are changing it from considering only exactly three hop circuits to considering any circuit that has currently less than or equal to 3 hops built (and plans to build at least 3 hops, but maybe more).

I also don't understand why the circuit_build_times_decide_to_count_circ() logic was moved from one function to another. I think we need some help here to understand, since I'm not familiar with the CBT system at all.

Actually, what happened here was that a bunch of code that was dangling off of circuit_build_no_more_hops() got its own function that does only one thing (decides to count the circuit). circuit_send_next_onion_skin() now calls that function before we decide if we're done adding hops (so we can calculate timeout info at the point where the circuit reaches 3 hops).

I think circuit_get_cpath_len() should be moded instead of adding another func circuit_get_cpath_opened_len(). We can add an arg only_count_opened which does that. I think that'd be cleaner.

In terms of code readability, code quality, and testability, I really disagree here. Combining these two would make for a really ugly conditional in the loop, which will take much more effort to verify for correctness than two separate functions. Also, it's not like binary size matters for us for the extra copy of the loop...

circuit_build_times_decide_to_count_circ() is undocumented and I'd actually like to know what it does.

Ok I will add a comment here, too.

Ok, patches for this and legacy/trac#23114 (moved) are in mikeperry/bugs23100+23114-cleanup-squashed (head cf6c5cc04c3f4e8024409f9a02d4285a804b6c4e). I updated the comments and wrote tests. This and the child are ready for review.

Trac:
Status: needs_revision to needs_review

Thanks Mike! You have an oniongit.eu account because this ain't small amount of code and would be much better to do it there? If not, we can push it there and start the review but you'll need eventually to get an account so you can push your fixups.

Trac:
Keywords: N/A deleted, review-group-25 added

Trac:
Reviewer: N/A to asn

Ok I made a merge request on oniongit at https://oniongit.eu/network/tor/merge_requests/5. I hope that's the right way to do this.

Initial review on oniongit. I'm mainly concerned about the lack of testability and some parts of the code which I don't understand.

While testing this I found legacy/trac#24228 (moved) which seems to be quite important right now.

Trac:
Status: needs_review to needs_revision

Replied on gitlab. As for testability, there is a separate testing commit that tests the new functionality. Based on your gitlab comments, it sounded like you didn't see it? It tests one of the functions you asked for tests for.

Replying to mikeperry:

Replied on gitlab. As for testability, there is a separate testing commit that tests the new functionality. Based on your gitlab comments, it sounded like you didn't see it? It tests one of the functions you asked for tests for.

Thanks for the quick replies. I only looked at 42e4b3e18 since I thought that this was the commit for this ticket. As I said in legacy/trac#23114 (moved), I'm not used to a single branch for multiple tickets. I will review the extra commit and also reply to the gitlab comments next week.

OK addressed all comments on gitlab. I think with a few small improvements, and a clean squashed branch, we can move to merge_ready this.

Ok. I pushed fixups for all of asn's comments. We also had some discussion about refactoring, so I did that in the commit for legacy/trac#23114 (moved).

I prefer this and the child ticket (legacy/trac#23114 (moved)) to be merged together, since the tests were written to cover both. The new merge request is https://oniongit.eu/network/tor/merge_requests/6.

The old branch with all of the fixups is still bugs23100+23114-cleanup-squashed. The new one is bugs23100+23114-cleanup-squashed2. Both heads are identical.

Trac:
Status: needs_revision to needs_review

Hey Mike,

I don't like the way this and legacy/trac#23114 (moved) are being developed. I'm trying to review your code and I feel like you are not respecting my efforts. I have asked you three times to decouple this from the legacy/trac#23114 (moved) branch and you have still not done it, and not only that but you are now merging fixes of legacy/trac#23100 (moved) into the legacy/trac#23114 (moved) commits... I think you are underestimating the time it takes me to review your code and you are making it much harder for me to ACK it. It took me more than a full day to work out the details of legacy/trac#23100 (moved) and it's gonna take me more time to do legacy/trac#23114 (moved), so by having those two branches together you are delaying the whole thing. I'd prefer to see legacy/trac#23100 (moved) get merged, before I jump into reviewing legacy/trac#23114 (moved), or maybe I'd even prefer if someone else reviewed legacy/trac#23114 (moved) since it's non-trivial amount of reviewing.

I'm not familiar with the CBT code and reviewing it sure ain't easy, so please try to make it easy for me. Also decoupling tickets and branches from each other is standard procedure and it's how Tor development is being done.

Or maybe I'm having a bad day :)

Thanks!

Trac:
Status: needs_review to needs_revision

asn - I am trying to honor your many requests for things that were either already done, redundant to other code, and/or difficult to deal with wrt git history.

Since these two tickets touch the very same block of code, I see them as a unit. Maybe I should have simply filed only one ticket... Separating them is very time consuming, especially wrt tests (which did not exist for any of this code before, so I ended up having to write tests that covered a lot of timeout functionality - which was also changing in both of these tickets). If you want me to separate the tests, and refactor separately, and do the copy-code-first thing (even though in my eyes, that is what #213100 mostly was, aside from one extra conditional and a relocation), that will take me another two days of git juggling, test re-writing, and verification, at least.

I spent all day yesterday trying to merge the fixups for legacy/trac#23100 (moved) down without conflicts for legacy/trac#23114 (moved). Other than the refactoring you asked for, in fact all of those fixups are cleanly merged into legacy/trac#23100 (moved). I even provided you two versions of the branch (pre-fixup and post-fixup) to verify this.

Wrt refactoring: because you requested that the initial commit just be a move of code with minimal changes, I merged the refactoring that you asked for into the following ticket, legacy/trac#23114 (moved). I think the refactoring was a good idea. But to avoid making you review a completely different legacy/trac#23100 (moved), I did it in legacy/trac#23114 (moved) instead... It is frustrating to me that you are now upset by this effort.

I think we are both underestimating what we're asking of the other, and communicating poorly on top of it. :/

I can make a separate commit for this, but I think it is way too much for you to also ask for me to separate the tests for this and legacy/trac#23114 (moved) just for git history. And I'm not sure if you are even asking for this. That is still not clear.

If you want to merge this without tests, that's fine. I will make a branch for the legacy/trac#23100 (moved) commit in a second, and I will also make a reference branch with only the fixups it contains. But if you want tests with this code, again, I would rather simply wait for someone to review legacy/trac#23114 (moved). As I said, the tests cover a lot of timeout functionality that is being changed by these tickets. Writing tests for each ticket means changing the tests after each ticket, which is a lot of busy work for no gain in the end.

Ok, I separated bug legacy/trac#23100 (moved) without the tests. mikeperry/bug23100-squashed is just that (and is the same commit hash for legacy/trac#23100 (moved) from bugs23100+23114-cleanup-squashed2). mikeperry/bug23100-presquashed-fixups contains just the fixups in that commit.

I think I also see where you got confused with the rebase. I accidentally labeled the refactoring commits as fixups to legacy/trac#23100 (moved), since you asked for them in that review. I simply removed those commits from mikeperry/bug23100-presquashed-fixups now. They will be part of legacy/trac#23114 (moved).

I will annotate the gitlab review with the fixup commit hashes for your comments.

As I said on IRC, for further background: The reason why these tickets are functionally together (and touch the same block of code) is that both changes are needed for the rate of circuit timeouts for hidden service circuits (and regular circuits) to be at the target 20%. If we only merge this ticket, the resulting tor still will timeout circuits at some other rate. This is why in my mind they are a package deal. Tor won't exhibit the designed behavior without both fixes together.

I filed the tickets separately as I began noticing problems while writing the prop247 simulator. In retrospect, I should have filed a single ticket: "Tor does not time out circuits at the target 20% rate".

There is now an extra fixup commit on top of both mikeperry/bug23100-squashed and mikeperry/bug23100-presquashed-fixups that refactors the function that both of these tickets are changing. This fixup commit does not change any functionality, it only moves code. legacy/trac#23114 (moved) will be rebased on top of this fixup.