This is a tentative proposal to modify the downloader algorithm to take into account that authorities reply with 0-byte responses to descriptor requests.
Currently, collector picks 2 authorities in random order to download descriptors from. When a download fails with a certain error, it moves on to the next authority.
In this proposal, the downloader classes are modified to handle HTTP response codes.
Previously, every HTTP response code other than 200 simply produced an empty response. Now collector handles the 404 response code, which corresponds to the "servers unavailable" error message, by throwing an exception. If the response code is 200, collector parses the response into a byte array as before. Finally, if any other response code is returned, collector assumes that it should retry the download. This takes into account the anti-DoS measures implemented by directory authorities.
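The three cases above can be sketched roughly as follows. This is only an illustration of the decision logic, not the code in this change: the class name `ResponseCodeSketch`, the enum `Action` and the method `classify` are made-up names, and only the `HttpURLConnection` constants come from the JDK.

```java
import java.net.HttpURLConnection;

/** Hypothetical sketch; these names are not taken from collector's sources. */
final class ResponseCodeSketch {

  /** What the downloader does next for a given response code. */
  enum Action { PARSE, FAIL, RETRY }

  /** Maps an HTTP response code to the handling described above. */
  static Action classify(int responseCode) {
    switch (responseCode) {
      case HttpURLConnection.HTTP_OK:        // 200: read the body into a byte array
        return Action.PARSE;
      case HttpURLConnection.HTTP_NOT_FOUND: // 404: throw an exception
        return Action.FAIL;
      default:                               // anything else: retry the download
        return Action.RETRY;
    }
  }
}
```

For example, a 503 returned by an authority that is rate-limiting requests would map to `Action.RETRY` in this sketch.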
Furthermore, the downloader algorithm has been modified to retry a download that returns a 0-byte response. The same link is retried between 1 and 5 times, and a waiting time is added to each try.
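A minimal sketch of such a retry loop is shown here for discussion. The names `RetrySketch` and `fetchWithRetries` are hypothetical, the single attempt is passed in as a `Supplier` so the example stays self-contained, and the linearly growing wait (`waitMillis` times the attempt number) is an assumption; the actual waiting scheme may differ.

```java
import java.util.function.Supplier;

/** Hypothetical sketch; these names are not taken from collector's sources. */
final class RetrySketch {

  /**
   * Calls the given attempt up to maxTries times (1 to 5) and returns the
   * first non-empty response, or an empty array if every try came back empty.
   */
  static byte[] fetchWithRetries(Supplier<byte[]> attempt, int maxTries,
      long waitMillis) {
    if (maxTries < 1 || maxTries > 5) {
      throw new IllegalArgumentException("maxTries must be between 1 and 5");
    }
    for (int tryNumber = 1; tryNumber <= maxTries; tryNumber++) {
      byte[] response = attempt.get();
      if (response != null && response.length > 0) {
        return response;
      }
      if (tryNumber < maxTries) {
        try {
          // Assumption: the wait grows with each try.
          Thread.sleep(waitMillis * tryNumber);
        } catch (InterruptedException e) {
          // Restore the interrupt flag and give up on further tries.
          Thread.currentThread().interrupt();
          break;
        }
      }
    }
    return new byte[0];
  }
}
```

A caller would wrap the actual HTTP request in the supplier and treat an empty result as a failed download from that authority.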
Finally, given the current number of rejections on the network and the fact that most missing descriptors come from votes, I think it makes sense to raise the warning threshold to 4.999% of the consensus weight.
I also propose to exclude gabelmoo and moria from the downloader, as we already fetch cached descriptors from these directories. For a different reason, I would also exclude maatuska, since it just times out requests from clients.
This would make the current list of authorities that the downloader would use:
Please note that this is not reflected in the code but in a properties file on the server.
With the proposed changes, the share of missing descriptors that collector complains about is between 0.2% and 4%.