Tor Authorities reachability testing is predictable and sequential


Fixed as part of #13718 (moved). Composing commits over the next week.

Trac:
Owner: N/A to teor
Status: new to assigned

These changes to tor are included in commits in:

Bugs: #13718 (moved), #13814 (moved), maybe #13787 (moved), #13839 (moved), #13924 (moved), #13823 (moved), #13929 (moved), #13963 (moved), #13928 (moved)
Branch: bug13718-bug13928-randomise-reachability
Note: There are 5 branches that start with bug13718, please choose the right one.
Repository: https://github.com/teor2345/tor.git

This is a highly experimental change, and requires proper unit testing to ensure it is functioning as designed. I recommend most people use the change in #13718 (moved) instead:

Bugs: #13718 (moved), #13814 (moved), maybe #13787 (moved), #13839 (moved), #13924 (moved), #13823 (moved), #13929 (moved), #13963 (moved)
Branch: bug13718-fast-bootstrap
Note: There are 5 branches that start with bug13718, please choose the right one.
Repository: https://github.com/teor2345/tor.git

Trac:
Status: assigned to needs_review

I am now revising #13929 (moved) & #13928 (moved) separately from #13718 (moved), as this is particularly authority-related.

See #13929 (moved) for more extensive comments.

This needs some comprehensive tests (check each unique sequence covers all relays, before repeating any relays), and therefore probably needs some refactoring to be more testable.
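The coverage property described above (every group is visited once before any group repeats) could be checked with a helper along these lines. This is only a sketch: `MAX_TEST_GROUPS` and `sequence_covers_all_groups()` are hypothetical names, not anything in the tor tree, and the 128-group limit is an assumption taken from the discussion in this ticket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical upper bound on the number of testing groups. */
#define MAX_TEST_GROUPS 128

/* Return 1 if the first n entries of seq visit every group in 0..n-1
 * exactly once, and 0 otherwise. */
int
sequence_covers_all_groups(const uint8_t *seq, size_t n)
{
  uint8_t seen[MAX_TEST_GROUPS];
  size_t i;

  assert(seq != NULL);
  if (n == 0 || n > MAX_TEST_GROUPS)
    return 0;

  memset(seen, 0, sizeof(seen));
  for (i = 0; i < n; ++i) {
    /* Fail on an out-of-range group, or on a group that repeats
     * before the cycle has finished. */
    if (seq[i] >= n || seen[seq[i]])
      return 0;
    seen[seq[i]] = 1;
  }
  /* n in-range entries with no repeats means every group was hit. */
  return 1;
}
```

A unit test would generate one full cycle of the testing order, pass it to this helper, and repeat that for several cycles.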

The WIP branch for tor is:

Bugs: #13929 (moved), #13928 (moved) (in that order)
Branch: faster-permuted-reachability
Repository: https://github.com/teor2345/tor.git

Trac:
Status: needs_review to needs_revision

We will now track this as part of #14034 (moved) Make TestingDirAuthVoteGuard, TestingDirAuthVoteExit and AssumeReachable less essential in test networks.

Trac:
Parent: #13718 (moved) to #14034 (moved)

Let's have a look at this in 0.2.6, maybe.

Trac:
Milestone: N/A to Tor: 0.2.6.x-final

Nick, I've created an experimental branch for the changes in #13929 (moved) & #13928 (moved), controlled using the macros:

DIRSERV_SCALE_REACHABILITY - #13929 (moved)
DIRSERV_PERMUTE_REACHABILITY - #13928 (moved)

Using these macros makes it easy for me to tell the difference between refactored code and new code. They will need to be permanently activated, or turned into torrc options, in the final version.

Also, in order to do unit testing, I have created functions that can be set to a fixed value for testing. These functions can be mocked by passing them any value other than DSV_STD (DSV_STD produces the standard, non-testing behaviour).

This probably needs to be converted to the tinytest way, once I get my head around it.

I'm not expecting feedback any time soon - merging and alpha-testing #13718 (moved) is a higher priority, and I also need to polish the fixes in #13192 (moved) before going another round with tor authorities.

Commits: Refactoring, Unit Testing, #13929 (moved), #13928 (moved) (in that order)
Branch: experimental-auth-reachability
Repository: https://github.com/teor2345/tor.git

Trac:
Status: needs_revision to needs_review

Note: passing DSV_STD (-1) as an int, checking the value, then assigning it to a uint8_t is error-prone. I need to mock these functions properly using MOCK_DECL from tinytest.

reviewing...

065bb7be0309535ae298eb99a5452b3620d4c018:

* In dirserv_reachability_modulo_per_test and dirserv_reachability_initial_group, I'm not thrilled with having this be a uint8_t but take an int, and having most possible values of int getting silently turned into something out of range.

I'd prefer separate getter and setter functions; that's what we do in most other places.

* Maybe "n_relay_testing_groups" or something would be a better name than "modulo_per_test"? Modulo as a noun doesn't mean much here.
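For illustration, the separate getter/setter shape suggested in this review might look roughly like the sketch below. The function names, the default of 128 and the accepted range are assumptions for the example, not code from the branch. The point is that the setter rejects out-of-range values explicitly, so no `int` is silently truncated into a `uint8_t` and no sentinel such as DSV_STD is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed default number of testing groups; hypothetical constant. */
#define DEFAULT_RELAY_TESTING_GROUPS 128

static uint8_t n_relay_testing_groups = DEFAULT_RELAY_TESTING_GROUPS;

/* Getter: return the number of relay testing groups currently in use. */
uint8_t
get_n_relay_testing_groups(void)
{
  /* Invariant maintained by the setter below. */
  assert(n_relay_testing_groups >= 1);
  return n_relay_testing_groups;
}

/* Setter: accept only values in 1..DEFAULT_RELAY_TESTING_GROUPS.
 * Return 0 on success, or -1 (leaving the value unchanged) if n is
 * out of range. */
int
set_n_relay_testing_groups(int n)
{
  if (n < 1 || n > DEFAULT_RELAY_TESTING_GROUPS)
    return -1;
  n_relay_testing_groups = (uint8_t)n;
  return 0;
}
```

Tests can then call the setter directly, and production code only ever reads through the getter.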

4e6789aef56e71a9ab5ba6c983ffd757147d8dd7:

* Should probably get tests to ensure that nothing is missed or double-hit.
* This isn't exactly a random permutation. It's predictable and observable once you see a couple of values. Would using smartlist_shuffle() to get an actual random permutation be of any use here? (If not, we should probably use another word besides "permutation")

Trac:
Status: needs_review to needs_revision

Yes, I think smartlist_shuffle() would be a much better alternative. The same tests and speedup code would still apply, so only 4e6789aef56e71a9ab5ba6c983ffd757147d8dd7 would need to be modified.

This would make me very happy, because it would cut out much of the far-too-complicated random start/step code in favour of a static array of 128 (or fewer if scaled) uint8 values.
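As a rough sketch of the "array of group numbers in shuffled order" idea: fill an array with 0..n-1 and apply a Fisher-Yates shuffle. This is plain C, not tor's smartlist API. `rand()` is only a stand-in for whatever randomness source tor would actually use, and `rand_below()` and `shuffle_group_order()` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Return a uniformly distributed value in 0..n-1 (n >= 1), using
 * rejection sampling to avoid modulo bias. rand() is a placeholder
 * here, not a suitable randomness source for real use. */
static unsigned
rand_below(unsigned n)
{
  const unsigned range = (unsigned)RAND_MAX + 1u;
  const unsigned limit = range - (range % n);
  unsigned r;
  do {
    r = (unsigned)rand();
  } while (r >= limit);
  return r % n;
}

/* Fill order[0..n-1] with the group numbers 0..n-1 in a random order,
 * using a Fisher-Yates shuffle. */
void
shuffle_group_order(uint8_t *order, size_t n)
{
  size_t i;

  assert(order != NULL);
  assert(n <= 256); /* group numbers must fit in a uint8_t */

  for (i = 0; i < n; ++i)
    order[i] = (uint8_t)i;

  for (i = n; i > 1; --i) {
    size_t j = rand_below((unsigned)i); /* 0 <= j < i */
    uint8_t tmp = order[i - 1];
    order[i - 1] = order[j];
    order[j] = tmp;
  }
}
```

The array would be reshuffled at the start of each full cycle, so every group is still tested exactly once per cycle but in an order that is not sequential.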

Nick, did you mean shuffle the entire list of relays? Otherwise, the relay testing would still be predictable inasmuch as relays with the same 7 bits in the first byte would be tested together. I'm not sure if that matters that much.
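To make the grouping concern concrete, here is a small sketch that assumes a relay's testing group is the first byte of its identity digest modulo 128, which is the same as the low 7 bits of that byte. `relay_testing_group()` and `N_RELAY_GROUPS` are hypothetical names for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed number of testing groups. */
#define N_RELAY_GROUPS 128

/* Map a relay's identity digest to its testing group, assuming the
 * group is the first digest byte modulo N_RELAY_GROUPS. */
uint8_t
relay_testing_group(const uint8_t *id_digest)
{
  assert(id_digest != NULL);
  return (uint8_t)(id_digest[0] % N_RELAY_GROUPS);
}
```

Under that assumption, digests starting with 0x05 and 0x85 both land in group 5. Shuffling the order of the groups changes when group 5 is tested, but not which relays are tested together.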

Maybe 0.2.7

Trac:
Milestone: Tor: 0.2.6.x-final to Tor: 0.2.7.x-final

This didn't pass triage, but since it's in needs_revision, I'm just marking it deferrable and leaving it in the milestone. It can't hurt, can it?

My rationale for redlining it during triage was that the best place to do unpredictable-order testing is probably in a successor to torflow.

Trac:
Keywords: tor-auth deleted, tor-auth 027-deferrable added

Marking triaged-out items from first round of 0.2.7 triage.

Trac:
Keywords: tor-auth 027-deferrable deleted, 027-deferrable, 027-triaged-1-out, tor-auth added

Move most 0.2.7-triaged-1-out needs_revision items into 0.2.???. Keep a few based on my sense of the sensible.

Trac:
Milestone: Tor: 0.2.7.x-final to Tor: 0.2.???

#13928 (moved) is a solution to a problem I don't think exists (authority reachability testing being predictable). And making it random could cause other hard-to-diagnose bugs.

Deferring it for now.

Trac:
Status: needs_revision to needs_information

No longer depends on the parent

Trac:
Parent: #14034 (moved) to N/A

Milestone renamed

Trac:
Milestone: Tor: 0.2.??? to Tor: 0.3.???

Finally admitting that 0.3.??? was a euphemism for Tor: unspecified all along.

Trac:
Milestone: Tor: 0.3.??? to Tor: unspecified
Keywords: N/A deleted, tor-03-unspecified-201612 added

Remove an old triaging keyword.

Trac:
Keywords: tor-03-unspecified-201612 deleted, N/A added

Trac:
Keywords: N/A deleted, 027-triaged-in added

Trac:
Keywords: 027-triaged-in deleted, N/A added

Trac:
Keywords: 027-triaged-1-out deleted, N/A added

Turns out that tor-auth is for directory authorities, so make it clearer with tor-dirauth.

Trac:
Keywords: tor-auth deleted, tor-dirauth added

Trac:
Keywords: 027-deferrable deleted, N/A added

Trac:
Sponsor: N/A to N/A
Reviewer: N/A to N/A
Keywords: tor-dirauth deleted, tor-dirauth reachable testing rotation added
Severity: N/A to Normal

Disowning tickets I don't intend to work on in the next 6 months.

Trac:
Owner: teor to N/A
Status: needs_information to assigned

Mark all tickets that are assigned to nobody as "new".

Trac:
Status: assigned to new

Replying to teor:

#13928 (moved) is a solution to a problem I don't think exists (authority reachability testing being predictable). And making it random could cause other hard-to-diagnose bugs.

That sounds convincing to me. Should we close?

Sounds good to me.

Trac:
Status: new to closed
Resolution: N/A to wontfix

closed

mentioned in issue #13929 (moved)

moved to tpo/core/tor#13928 (closed)
