Idea to make consensus voting more resistant
This is an idea how to improve the current situation, where sometimes a directory authority is slow to get its vote out to the other dirauths, and so the dirauths don't all have the same sets of votes. To simplify, I'm illustrating with an example of three dirauths:
at :50, all dirauths make their vote and start uploading. auth1 and auth2 get their vote to all auths, but auth3 doesn't. it cannot publish a vote to auth1 at all, and it takes more than 2.5 minutes to publish its vote to auth2. at :52:30, all auths try fetching the votes they're missing from the other auths, so auth1 asks auth2 for auth3's vote, and auth2 asks auth1 for auth3's vote. auth3 asks nobody, and nobody asks auth3. At this point, neither auth1 nor auth2 have auth3's vote. auth3 now (at, for example, :53:30) succeeds publishing to auth2, so auth1 now has a vote from auth1 and auth2, auth2 and auth3 have a vote from auth1, auth2, and auth3. At :55 the auths try to make a consensus, but auth1 will end up with a different consensus than auth2 and auth3.
My idea to make this less of a problem would be that only for two minutes will we accept a vote that gets pushed to us, and anything we get later than that is considered "too late" and will be dropped. At :52:30 minutes, we still go ahead and try and fetch all votes from all the other authorities, and if they have a vote we will accept it. We repeat that fetching of all votes that we don't have at 53:00, 53:30, 54:00 and 54:30. That way, a delayed publication of the original vote will not cause this kind of split, where the dirauths have different opinions on who has voted, so only the dirauth that took more than 2 minutes to publish its descriptor to any of the other dirauths will be affected. There's still a race condition here, which is when a dirauth (within two minutes) only publishes to one other dirauth, and then that dirauth gets so slow it cannot get the vote to any of the other votes. But since it was fast enough to get the vote the first time, hopefully that's rather rare.
Does this all sound viable? Am I overlooking something?
Update: This bug was introduced in Tor 0.2.0.5-alpha, with the v3 authority voting code.