I rebased everything onto 0.3.6 and organized the commits in a similar order to how we reviewed in Mexico. Strongly recommend reviewing it in commit order.
WARNING: The github PR UI is showing the commits in timestamp or some other random ordering. Reviewing in commit hash order is best. I have no idea how to change the github ordering. :/
I started doing the review, and I put some general comments in github. I'm far from done.
I also started testing it with chutney and it seems to work pretty nicely. I have some local changes that add some extra logs, but I'm gonna push them after I do more testing.
Mike, you have any tips for more effective testing of this feature?
Nickm -- the branch's commits are in recommended review order. Note that github reorders them by timestamp in its UI, which is not the right order to look at it. The right order is in the git hash-parent history.
The last few days I've been focusing on using Riastradh's better probability distribution formulas as part of circuitpadding. Current work can be found in https://github.com/torproject/tor/pull/533.
Next steps in this area:
Fix the remaining compiler warnings.
~~Figure out if we want to keep the stochastic tests or not. Need to understand more about their failure rates, and also they are quite slow. Maybe I can put them into the slow test category which AFAIK is not run by the CI. We still get to run the numerical tests. ~~Moved stochastic warnings to test-slow. I will open a ticket to figure out what to do about them.
~Address various XXXs (e.g. geometric distr needs to conform to the API as well). ~~Resolved a few of them, the rest we can open tickets for.
Do further documentation.
~~Ensure that the API consumer circpad_distribution_sample() works well in terms of using the prob distributions correctly. ~~Added a basic test for uniform distribution circuit padding delay sampling.
Fix coding style of prob distr code because it's different from Tor's and make check-spaces is freaking out.
The new files were ported to Tor quite hastily. Ensure that they are proper wrt header file etc. Also fix the copyrights etc.
All remaining tasks above have been resolved. I pushed the code in !adaptive_padding-prob-distr-tests and pending review from Riastradh. After review has been done, I will squash into something presentable along with the rest of the #28142 (moved) branch. I will also open some tickets for leftover optional work in this area.
FYI: a part of this work, namely the uniform ^^[0,1^^] sampler used throughout, is echoed in #23061 (moved). Just letting you know in case you want to avoid the duplication sooner rather than later.
use separate functions for floor and ceiling instead of one with a boolean parameter; and
convert assert to tor_assert.
With these patches, the only remaining complaints from make check are:
lines that are too long because they contain URLs that are too long; and
includes that are not allowed because of missing .may_include or something.
I'm not sure what the best way to address these is -- break the URLs, or add an override; update .may_include, or move files around -- so I'll leave that up to you folks to decide.
Disable the current machine so that we don't merge it as on-by-default.
After this we can give it back to Nick for another round of review.
Mike, when you come back can you start crunching through the list of unittests apart from the two I mentioned above? And also perhaps go through the GH PR#461 review for more unanswered points? And also check the pull request above to see the integration of Riastradh's work? Let's also meet in IRC at some point to coordinate further.
Ok I looked at the branches a bit.. One issue is that there are a fair amount of fixups I still need to do for Nick's remaining comments on PR#461.. So I'm trying to decide how to do them. For sanity/cleanliness I think the answer is to squash everything down in PR#547 into a fresh PR, and do the fixups for Nick from PR#461 on top of that branch. There is the question of how to handle the new commits you're adding asn, to make those easy for Nick to review. Some of them look like they could be squashed back into circuitpadding.c. The test ones can probably remain their own commit.
Do you have any other preference for how to do this? Or want to reorg the branch how you prefer? Otherwise I will do that my tomorrow.
Ok I looked at the branches a bit.. One issue is that there are a fair amount of fixups I still need to do for Nick's remaining comments on PR#461.. So I'm trying to decide how to do them. For sanity/cleanliness I think the answer is to squash everything down in PR#547 into a fresh PR, and do the fixups for Nick from PR#461 on top of that branch. There is the question of how to handle the new commits you're adding asn, to make those easy for Nick to review. Some of them look like they could be squashed back into circuitpadding.c. The test ones can probably remain their own commit.
Hey Mike, good to have you back, hope the off days were good.
I agree that the right thing to do is squash up PR#547 and start building from there.
Perhaps for the extra commits that I added we can leave them as is for the purposes of the next nickm review, and squash them into circuitpadding etc. after that review.
For now, I keep on working on testing the remaining remove token functions.
Definitely easier to look at git logs for. Hopefully not too confusing for Nick. And hopefully we don't step on eachother. I have a looot of fixups for circuitpadding.c to do. If you end up modifying that file lmk.
I'm also in the process of making more typedefs and tighter precise int_t usage for the various integer types we use, for clarity and to reduce type confusion/sign errors/overflow risk. That may end up touching some tests. When I get there, I'll let you know. Probably Monday.
The tests cover the remaining removal token functions that were not covered, and also give a bit more coverage to the global rate limiting test (based on coverage analysis).
Hey Mike, I saw your comment on the meeting pad yesterday. I see you have one or two fixups remaining. So let's do the following: Finish with your fixups, push your branch, (update the PR if needed), and we can switch: I will take your branch and start doing fixups based on PR#461, and you can start writing unittests and increasing coverage. Let me know when we can do the switch.
It looks like you've been force-pushing to that PR. The head I used (56376125a42de88bd57240551f7a76f5219fc455) is now gone. and I have conflicts from my fixup (its extensive due to type renaming -- which is why I wanted it done before we switched). It looks like 56376125a42de88bd57240551f7a76f5219fc455 is 4bf714c9355254e788db886b0fd7da8cedae0822.
Might not be able to fix the conflicts until tomorrow. In the future, try not to force push (tho that would not have helped the conflicts.. we just collided changesets).
It looks like you've been force-pushing to that PR. The head I used (56376125a42de88bd57240551f7a76f5219fc455) is now gone. and I have conflicts from my fixup (its extensive due to type renaming -- which is why I wanted it done before we switched). It looks like 56376125a42de88bd57240551f7a76f5219fc455 is 4bf714c9355254e788db886b0fd7da8cedae0822.
Might not be able to fix the conflicts until tomorrow. In the future, try not to force push (tho that would not have helped the conflicts.. we just collided changesets).
Oof sorry about this. I had to rebase to master because some of the probability distribution work was using stuff from master (#28193 (moved)), but I shouldn't have force pushed into the useful PR. Please let me know if there is anything I can do to help you with declogging the conflict resolution.
FWIW, I started doing fixups from PR461. Some of them are quite heavy (e.g. https://github.com/torproject/tor/pull/461#discussion_r231893260) so I'm gonna wait until you address the conflicts and finish up the branch before doing those, to avoid more conflict hell. Then I will start applying the fixups, and you can start doing the tests, and after that I will take care of the conflict resolution between our two branches.
BTW, there are a few non-fixup related questions in PR461 that I would like help with. I will resolve the fixup stuff in PR461 soon, and only leave the questions hanging, so that you can roll through them.
Ok I fixed the errors in PR 574. They were due to return type mismatched, which caused clang to garble the return value of circpad_histogram_usec_to_bin(). This is now running as https://github.com/torproject/tor/pull/575
If you fix your unit tests, please rebase your stuff on top of PR575, or vice-versa?
As I said on IRC, note that https://github.com/torproject/tor/pull/461 still has some fixup comments from nickm that are currently hidden due to being "outdated" (I think we just changed code near them). They still need fixups tho. I also commented on a couple things in that PR.
After fixups, I think next steps for implementation is making a working machine. See my thoughts in https://trac.torproject.org/projects/tor/ticket/28634#comment:3. You can work on that if you like, after fixups. I can go back to working on tests, since you are tired of that.
OK I took Mike's unittest branch from PR#593 and on top of it I applied my fixups. I pushed two branches: an unsquashed one with all the fixups intact, and a squashed one which is completely squashed down to the base. The two branches should have a git-diff of zero.
I also replied to all the open review comments by Nick on PR#461 by pointing to the unsquashed branch. So after the fixups get validated, we can start working on top of the squashed branch which is cleaner.
If everyone likes the fixups and CI blesses the branches, then I think the next step is to disable/kill the current machine so that the branch can get merged and we can start working off master for good!
Let me know what else there is to do. Otherwise, we should make a squashed version of this branch for the final review and merge by Nick. I can do the squash no problem.
This branch is ready to be reviewed & merged. It does not actually enable any machines atm, but let's see if we can fit something in for 040 after we merge this.
Okay, I've been over pr 624. I think it's nearly ready; I just want to double-check a few things with you here. In many cases, just opening a followup ticket and attaching a link should be good enough, though we should talk about "necessary for 0.4.0" vs "can wait".
Okay, I've been over pr 624. I think it's nearly ready; I just want to double-check a few things with you here. In many cases, just opening a followup ticket and attaching a link should be good enough, though we should talk about "necessary for 0.4.0" vs "can wait".
Thanks for the review.
I went through all the new review comments and provided fixups to most of them.
I pushed a few more commits to adaptive_padding-final that Riastradh authored to improve the prob distr codebase (it now does type checking on the distr macros) and fix a bug and some compilation issues.
I'm still unsure how to handle the monotime issue because I'm not sure what's acceptable in terms of timing accuracy and overhead. Mike has some questions on IRC for you Nick.
I've added a FAQ to compat_time.h, answering those questions as best I can with the info I have.
I think it's okay to leave the time stuff unchanged for now, iff it won't be called by default. Otherwise, I think we should define an abstract type to represent "whatever time we're using" so we can switch it easily if we have to.
Hrmm.. For monotime, I think my preference would be some kind of check that helps decide which of these time abstractions to use, for optimal accuracy and speed. I also think that its likely that the choices we make for client circuits will be different than for relays. I think clients can sacrifice more speed for accuracy, since their volumes of traffic will be lower.
As for the TODO, we should drop it and make sure it is captured in the tickets.
For monotime, I think my preference would be some kind of check that helps decide which of these time abstractions to use, for optimal accuracy and speed.
I think that might be possible, but it's not going to happen between now and the freeze on the 15th.
Does the current monotime usage happen in the critical path for default tor configurations with this patch? If no, I think we don't have to block on a new monotoime approach. If yes, maybe we can make it conditional on wtf-pad being enabled.
For monotime, I think my preference would be some kind of check that helps decide which of these time abstractions to use, for optimal accuracy and speed.
I think that might be possible, but it's not going to happen between now and the freeze on the 15th.
Does the current monotime usage happen in the critical path for default tor configurations with this patch? If no, I think we don't have to block on a new monotoime approach. If yes, maybe we can make it conditional on wtf-pad being enabled.
I think we are good here. Because there are currently no machines enabled for any tor configuration, there is no monotime usage happening because of circpadding at all. I verified that by adding log messages to monotime usage and making sure nothing happens from circuitpadding.