Enumerate possible failure cases and include failure information in .tpf output

Our current model for distinguishing failures, timeouts, and successes is rather simple/arbitrary/confusing:

Timeout: We count any measurement with DIDTIMEOUT=1 and/or with DATACOMPLETE<1 as a timeout.
Failure: We count any measurement that doesn't have DIDTIMEOUT=1 and that has DATACOMPLETE>=1 and READBYTES<FILESIZE as a failure.
Success: We count everything else as a success.
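
To make the three rules above concrete, here is a minimal Python sketch of them. It is only an illustration, not the code we actually run: the function name `classify_measurement` and the dict-based input are made up for this example, and treating a missing DIDTIMEOUT or DATACOMPLETE as 0 is an assumption on my part.

```python
def classify_measurement(m):
    """Classify one measurement as 'timeout', 'failure', or 'success'.

    `m` is a dict of already-parsed fields, e.g.
    {"DIDTIMEOUT": 0, "DATACOMPLETE": 1234.5, "READBYTES": 51200, "FILESIZE": 51200}.
    Assumption: missing DIDTIMEOUT or DATACOMPLETE is treated as 0.
    """
    did_timeout = int(m.get("DIDTIMEOUT", 0)) == 1
    data_complete = float(m.get("DATACOMPLETE", 0.0))

    # Timeout: DIDTIMEOUT=1 and/or DATACOMPLETE<1.
    if did_timeout or data_complete < 1:
        return "timeout"

    # Failure: not a timeout, but fewer bytes read than the file size.
    if int(m["READBYTES"]) < int(m["FILESIZE"]):
        return "failure"

    # Success: everything else.
    return "success"
```

Note that a measurement with READBYTES<FILESIZE is still counted as a timeout if DATACOMPLETE<1, because the timeout rule is checked first.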

We're plotting timeouts and failures here. This is not as useful as it could be.

It would be so much better to enumerate all possible failure cases and include failure information in .tpf output files. Examples:

Turns out that long-running tor instances sometimes fail to keep up-to-date directory information (#29743 (moved)), and as a result OnionPerf cannot make measurements.
Sometimes streams are closed with reason TIMEOUT or DESTROY (pages 1 and 2 of the first attachment on #29744 (moved)), and I bet there are several subcases in each of these.
Regarding timeouts, it does happen that streams are closed by OnionPerf (pages 3 and 4 of that same attachment on #29744 (moved)).
There are likely more, less frequent failure cases that I either did not include in the #29744 (moved) graphs or did not run into at all in the logs I looked at.

Can we enumerate all or at least the most typical failure cases and define criteria for clearly distinguishing them from each other and from timeouts and from successes?

Can we also try to unambiguously identify these failure cases in existing tor/torctl/tgen logs that we process for .tpf files, so that we could include failure case IDs for them in the .tpf files?
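
As a starting point for discussion, here is a rough Python sketch of what failure case IDs could look like, covering only the examples listed above. Everything in it is hypothetical: the ID names, the `identify_failure_case` function, its parameters, and the order in which the checks are applied. It also assumes the relevant facts (stream close reason, who closed the stream, whether directory information was stale) have already been extracted from the tor/torctl/tgen logs, which is the hard part and is not shown here.

```python
from enum import Enum
from typing import Optional


class FailureCase(Enum):
    """Hypothetical failure case IDs; names are placeholders, not a spec."""
    NONE = "NONE"                                    # measurement succeeded
    STALE_DIRECTORY_INFO = "STALE_DIRECTORY_INFO"    # tor lacked up-to-date directory info
    CLOSED_BY_ONIONPERF = "CLOSED_BY_ONIONPERF"      # OnionPerf itself closed the stream
    STREAM_CLOSED_TIMEOUT = "STREAM_CLOSED_TIMEOUT"  # stream closed with reason TIMEOUT
    STREAM_CLOSED_DESTROY = "STREAM_CLOSED_DESTROY"  # stream closed with reason DESTROY
    UNKNOWN = "UNKNOWN"                              # failed, but matches no known case


def identify_failure_case(
    succeeded: bool,
    stream_close_reason: Optional[str] = None,
    closed_by_onionperf: bool = False,
    directory_info_stale: bool = False,
) -> FailureCase:
    """Map pre-extracted facts about one measurement to a failure case ID.

    The precedence of the checks below is an assumption made for this sketch.
    """
    if succeeded:
        return FailureCase.NONE
    if directory_info_stale:
        return FailureCase.STALE_DIRECTORY_INFO
    if closed_by_onionperf:
        return FailureCase.CLOSED_BY_ONIONPERF
    if stream_close_reason == "TIMEOUT":
        return FailureCase.STREAM_CLOSED_TIMEOUT
    if stream_close_reason == "DESTROY":
        return FailureCase.STREAM_CLOSED_DESTROY
    return FailureCase.UNKNOWN
```

The `UNKNOWN` case is deliberate: anything that lands there is a candidate for a new, more specific ID, and the subcases of TIMEOUT and DESTROY would presumably split into further IDs once we know what they are.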