DirMgr: Revise error handling to better tolerate reset-able failures
Previously we had only two kinds of errors that the State
implementations could return: fatal and nonfatal. The fatal ones aborted the bootstrapping process, whereas the nonfatal ones were ignored.
That's not so great, for two reasons:
- Some nonfatal errors are a fine reason to stop using a given directory cache. (If we treat these as nonfatal, we just keep using the same cache over and over.)
- Some nonfatal errors mean that we should restart the bootstrapping process with a new consensus download attempt. (If we treat these as fatal, we abort the process when it can actually continue.)
The motivating examples here are issue #438 (closed), #439 (closed), and #440: for all of them, we need a way to say, in the certificate-downloading stage, that the consensus that we're trying to validate was no good and we need to get a new one.
This branch addresses these issues with two main design changes:
- A
State
can now report a "blocking error" that means that the bootstrapping process needs to reset. - The
State::add_from_{cache,download}
functions can now report nonfatal errors.
There are also several specific places that needed changing to get the right behavior here, and tamp down a few other errors I found along the way.
Closes #439 (closed).
Part of #329 (closed).
Assigning review to @eta since she's stared into this particular abyss in the past.