Incorrect ancillory data presented on Torflow guard relay "update" vote lines
I realize little appetite exists for correcting Torflow issues at this juncture, but I came across an inexplicable behavior that is either my inability to unravel the code or a bug serious enough to merit fixing even at this stage of the new scanner development.
Investigating the scenario where the current measurement of a Guard relay is disregarded and instead the most recent voted measurement revised, I found this:
first archive appearance of new measurement vote
1530271972 node_id=$4F0DB7E687FC7C0AE55C8F243DA8B0EB27FBF1F2 bw=11500 nick=Binnacle measured_at=1530271632 updated_at=1530271632 pid_error=0.34411159803 pid_error_sum=0.34411159803 pid_bw=11467418 pid_delta=-0.0875088625329 circ_fail=0.0 scanner=/scanner.5/scan-data/bws-51.6:52.4-done-2018-06-29-06:27:12 @type server-descriptor 1.0 router Binnacle 184.108.40.206 443 0 80 published 2018-06-29 00:29:07 fingerprint 4F0D B7E6 87FC 7C0A E55C 8F24 3DA8 B0EB 27FB F1F2 bandwidth 1073741824 1073741824 8531597
first subsequent no-measurement vote update
1530441163 node_id=$4F0DB7E687FC7C0AE55C8F243DA8B0EB27FBF1F2 bw=20100 nick=Binnacle measured_at=1530271632 updated_at=1530439657 pid_error=0.34411159803 pid_error_sum=0.34411159803 pid_bw=11467418 pid_delta=-0.0875088625329 circ_fail=0.0 scanner=/scanner.4/scan-data/bws-38.6:39.4-done-2018-07-01-05:07:37 @type server-descriptor 1.0 router Binnacle 220.127.116.11 443 0 80 published 2018-06-30 12:30:08 fingerprint 4F0D B7E6 87FC 7C0A E55C 8F24 3DA8 B0EB 27FB F1F2 bandwidth 1073741824 1073741824 9458512
reading aggregate.py it appears this case his handled by lines 678-688 with kp=1 ki=0 and kd=0:
# Don't use feedback here, but we might as well use our # new measurement against the previous vote. n.copy_vote(prev_votes.vote_map[n.idhex]) if cs_junk.use_desc_bw: n.new_bw = n.get_pid_bw(prev_votes.vote_map[n.idhex], cs_junk.K_p, cs_junk.K_i, cs_junk.K_d, 0.0, False) def get_pid_bw(self, prev_vote, kp, ki, kd, kidecay, update=True): if not update: return self.use_bw \ + kp*self.use_bw*self.pid_error \ + ki*self.use_bw*self.pid_error_sum \ + kd*self.use_bw*self.pid_delta line 591 n.use_bw = n.desc_bw line 556 prev_votes = VoteSet(argv[-1]) line 205 try: self.pid_error = float(re.search("[\s]*pid_error=([\S]+)[\s]*", line).group(1)) def copy_vote(self, vote): self.new_bw = vote.bw*1000 # Not set yet self.pid_bw = vote.pid_bw # Not set yet self.pid_error_sum = vote.pid_error_sum # Not set yet self.pid_delta = vote.pid_delta # Not set yet
My reading indicates
new_bw is calculated by applying the last vote pid_error to the current descriptor bandwidth. Comment appears misleading, but the logic seems sensible enough.
Problem is that 9458512 + 9458512 * 0.34411159803 is 12713,296 rather than the published value of 20100. If my understanding of the code is correct, the actual behavior here does not reflect the intended behavior and the result is egregiously incorrect. I suspect a bug, but the code is complex and in the absence of debugging a live Torflow instance my analysis could be wrong.
Perhaps Mike Perry could chime in here?