7802 tweaks per code review
7802 accidentally got merged with 8024 before being reviewed; joint code review by nickm and myself followed. Nothing so objectionable as to merit reverting the merge, but a few points noted (full IRC transcript will be attached):
18:45 < athena> okay, in circuitbuild.c we've got some pretty straightforward functions to query parameters that look
just fine, then a long block of code which is probably where we should be focusing most attention
18:45 < nickm> Right. The "rate" configuration things are all percentages, but I don't see anything that keeps you from
configuring a value below zero or above 100 in config.c. We should note that. NOTE.
18:45 < nickm> (I'm going to grep for NOTE strings.)
18:45 < athena> okay, yeah, those should be checked and clamped, i think
***
18:48 < nickm> I don't like the name "pathbias_check_use_rate" -- "check" is pretty vague. NOTE.
18:48 < nickm> The return; at the end of that function offends me. NOTE.
18:48 < athena> yeah
18:49 < nickm> I think the comment for the next function (pathbias_mark_use_success) may be wrong; the "stat" in the
first sentence should probably be 'state'. and the word "mark" should appear before "it".
18:49 < nickm> NOTE.
***
18:53 < nickm> ooh. In mark_use_success, we check path_state < PATH_STATE_USE_ATTEMPTED. We should make sure that a
comment on path_state_t says that its values must never be reordered or else stuff might break. NOTE
18:54 < athena> did you mean to say pathbias_should_count? i'm not seeing a pathbias_should_close
18:54 < nickm> sorry, should_count
18:55 < athena> okay
18:55 < athena> looks like it ignores circuits that have some purposes, that are one-hop or checks some config options
18:55 < athena> do circuit purposes ever change during the lifetime of a circuit
18:55 < athena> ?
18:56 < nickm> I believe circuit purposes can change under some circumstances, though the rules are nontrivial
18:56 < athena> hmm
18:56 < nickm> maybe we should cache the result of this function if we've ever called it before?
18:56 < nickm> maybe we should warn if it changes?
18:57 < nickm> this isn't a new problem, if it's a problem
18:57 < athena> yeah, we need a detection for that if we can't be sure it doesn't happen
18:57 < nickm> NOTE. Let's ask mike?
***
19:04 < athena> well, the code moves the pathbias_send_usable_probe() to USE_ATTEMPTED; it seems nontrivial to evaluate
the effects of doing that
19:05 < athena> let's note that and move on, and ask mike about it when he's around
19:05 < nickm> right.
19:05 < athena> probably means we should settle on 'revert' for now if there's something like that in there that we
don't understand
19:06 < nickm> oh wait
19:06 < athena> ?
19:06 < nickm> read the description of 7802 again.
19:06 < nickm> he's splitting PATH_STATE_BUILD_SUCCEEDED into two states, where previously the distinction was not
based on state but based on timestamp_dirty
19:07 < athena> right
19:07 < athena> and the old probe did test timestamp_dirty
19:07 < athena> okay, this one's fine, then
19:07 < nickm> so that block that moved isn't the build_succeeded block; it's the build_succeeded && timestamp_dirty
block
19:07 < nickm> great
19:07 < nickm> and it now sets the state to ALREADY_COUNTED so we don't do this again. Sounds good.
19:08 < nickm> next two changes are function renames.
19:08 < athena> yeah
19:08 < nickm> Then the function now known as pathbias_get_close_success_count.
19:08 < nickm> Hm. path_state >= USE_FAILED and path_state >= BUILD_SUCCEEDED in the same function. I wish those were
macros or inline functions or something.
19:08 < nickm> NOTE above
19:09 < athena> yeah, making assumptions about enums like that always makes me feel icky
19:10 < nickm> (If you add a new value, you need to add it in the right position or there's hell to pay)
19:10 < athena> yeah\
19:11 < nickm> so the change in pathbias_get_closed_count is probably okay though I think.
19:12 < athena> yeah, agree
19:12 < nickm> agree?
19:12 < nickm> great
***
19:18 < nickm> pathbias_act_on_failure_rate?
19:18 < nickm> It doesn't return a boolean; it just messes with stuff and acts accordingly
19:18 < athena> it does, though
19:18 < nickm> oh, it does?
19:18 < athena> it returns -1 if it rejects the guard for a high failure rate
19:18 < nickm> Do we ever look at its return value?
19:18 < athena> or falls to the return 0 at the bottom of the function otherwise
19:18 < athena> dunno
19:19 < nickm> Looks like we call it in 2 places, and never check the return value there.
19:19 < athena> doesn't look like it
19:19 < athena> yeah, your suggestion then
19:19 < nickm> ok. NOTE let;s rename it to pathbias_act_on_failure_rate.
19:20 < athena> should we also get rid of the return value?
19:20 < nickm> Also NOTE that in the case where it sets any persistent value on a guard, it does need to call
entry_guards_changed I think
19:20 < nickm> yes we should. (NOTE)
***
19:21 < nickm> Hm. In the later part of the function that does scaling...
19:22 < nickm> ah, never mind.
19:22 < nickm> that looks okay to me if it does to you, since guard->use_{successes,attempts} are doubles.
19:23 < athena> yeah
19:23 < athena> it's just using a slightly awkward way of passing a rational
19:23 < nickm> We should check whether we enforce that scale_factor is at least 1, and mult_factor is less than or
equal to scale_factor.
19:23 < athena> but i'm not sure why, since if those are doubles we're already using floating point anyway
19:23 < athena> could just be a single function returning another double?
19:23 < nickm> (NOTE)
19:23 < nickm> I think it should be.
19:24 < nickm> I think he's doing it this way because consensus parameters can only be ints
19:24 < nickm> (int32_t, I think)
19:24 < athena> also make sure that we never divide by zero
19:24 < athena> oh, right
19:24 < nickm> Right. (NOTE wrt the refactoring though)
19:24 < athena> so it'd just be a pain to extend that to allow double, then
19:25 < nickm> yeah, but we can have the function return a double, and let the user configure a double themselves if
they want to override it in torrc
19:25 < nickm> (It would be crazy to override the numerator but not the denominator)
19:25 < athena> yeah
19:26 < athena> agreed; should be one function calculating the ratio of two int32_t consensus params *or* pulling a
double out of the config file
19:26 < nickm> Right. Let's do it like that. NOTE
***
19:26 < nickm> on to pathbias_check_close_rate. That should probably get renamed too.
19:27 < athena> pathbias_act_on_close_rate?
19:27 < nickm> yeah. That one _does_ have its return value checked.
19:27 < nickm> (lots of duplicate code here too I think)
19:27 < nickm> Does this look like duplicate code to you? Does a later commit fix that?
19:28 < athena> hold on, lemme look
19:29 < athena> yeah, it's kinda duplicated
19:29 < athena> but it's calling different functions in the log messages (get_use_success_count() vs.
get_close_success_count())
19:30 < athena> getting rid of the duplication would mean passing some function pointers around, probably
19:30 < athena> and i'm not sure where it gets scale_factor from
19:31 < nickm> Or having one block at the top of the function that says if (counting_closed) {
success_count=get_close_success_count() } else {success_count=get_use_success_count();} and avoid the
function pointers
19:31 < athena> yeah, okay
19:31 < athena> then should refactor to have less duplicate code i think
19:32 < nickm> The scaling block is pretty different
19:32 < nickm> isn't scale_factor coming from pathbias_get_scale_factor ?
19:32 < athena> hmm
19:32 < nickm> I hope everything scaled in that block is also a double, or our idea of having one function that returns
a double is going to break.
19:33 < athena> yeah
19:34 < athena> well, we could split the scaling out into separate rescale_<use,close> functions, and handle them like
the success_count thing?
19:35 < nickm> Seems good. Also Given that there are two sets of values that get scaled independently, we should make
sure that their documentation says which block is which, clearly.
19:35 < nickm> NOTE on the refactor and NOTE on the documentation
***
19:41 < nickm> I don't like the check on the return value of pathbias_check_close_rate(). That function only returns
-1 the first time it decides to disable the guard, right?
19:41 < nickm> maybe it should call it, then check whether guard->path_bias_disabled is true? (NOTE)
***
19:45 < nickm> Hm. I wonder what initializes node->use_attempts and node->use_successes and the other doubles if this
part isn't called.
19:46 < nickm> If they all come from a malloc_zero() or a memset, that's a little tricky. Does memset(&dbl, 0,
sizeof(double)) get you a good zero?
19:46 < athena> not on things that aren't IEEE-754 in general
19:47 < nickm> hm. we proably have other places where we do this.
19:47 < athena> yeah
19:47 < athena> it is theoretically a portability issue, but it's gotta be pretty rare to run into a machine like that
19:48 < athena> the only examples of non-754 fp i can think of off-hand are VAX and some older crays
19:48 < nickm> We could treat it like systems where memset(&ptr, 0, sizeof(void*)) doesn't set ptr to NULL.
19:48 < athena> yeah
19:48 < nickm> NOTE. let's do that.
***
19:57 < nickm> athena: this code looks okay to me, but I wonder if it wouldn't be better to change the check to call
one of the pathbias_act_on_foo_rate() functions instead to eliminate duplicated code. What do you think?
19:58 < athena> probably
19:59 < nickm> athena: I think this one isn't super-urgent, but we should still NOTE it. agree?
***
20:06 < athena> 24b9b9f791defcb6156c587a035fde58c35a46d9 next?
20:06 < nickm> yes let's
20:07 < nickm> The two blocks look duplicated, but there are only two
20:08 < athena> yeah
20:08 < athena> it could still be made into two calls to a single duplication-free function, though
20:08 < nickm> agreed.
20:08 < nickm> NOTE
20:08 < athena> a2db17a1aab77c4183f589815800a779a5f24524 ?
20:08 < nickm> hang on. So, if even one stream is detached, we roll back to USE_ATTEMPTED?
20:09 < nickm> What if there are multiple streams and only one is detached?
20:09 < nickm> It doesn't seem too harmful, but we should ask mike IMO.
20:09 < athena> hmm
20:09 < athena> yeah
20:09 < nickm> NOTE let's ask him
***
20:11 < nickm> a2db17a1aab77c4183 ?
20:11 < athena> yeah
20:14 < nickm> doesn't seem crazy. I don't get it though.
20:14 < nickm> I wonder if the comment in circuit_launch_by_extend_info before the line that's being removed is now inaccurate. NOTE
***
20:21 < nickm> so, the next commit does the common code refactoring you mentioned before when it adds
pathbias_count_circs_in_states
20:22 < nickm> also NOTE that the pairs of arguments to pathbias_count_circs_in_states are good things to document in
the documentation for path_state_t
20:22 < athena> yeah, agree
20:22 < nickm> As for the part of this change that changes how scaling happens... how does that look to you?
20:24 < athena> it looks correct to me
20:25 < nickm> I personally would kind of prefer it if we didn't add things to {circ,use}_{attempts,successes} until we
had the corresponding changes in the other scaled variables. then this commit wouldn't be needed.
that's a potentially bigger refactoring though. NOTE it for later?
issue