dir auths should publish their wfu / tk / mtbf for each relay in each vote?
I've just started running this patch on moria1:
diff --git a/src/feature/nodelist/fmt_routerstatus.c b/src/feature/nodelist/fmt_routerstatus.c
index 252b2e61fe..660d9e5430 100644
--- a/src/feature/nodelist/fmt_routerstatus.c
+++ b/src/feature/nodelist/fmt_routerstatus.c
@@ -21,6 +21,8 @@
 #include "feature/nodelist/routerinfo_st.h"
 #include "feature/nodelist/vote_routerstatus_st.h"
+#include "feature/stats/rephist.h"
+
 #include "lib/crypt_ops/crypto_format.h"
 /** Helper: write the router-status information in <b>rs</b> into a newly
@@ -194,6 +196,13 @@ routerstatus_format_entry(const routerstatus_t *rs, const char *version,
         digest256_to_base64(ed_b64, (const char*)vrs->ed25519_id);
         smartlist_add_asprintf(chunks, "id ed25519 %s\n", ed_b64);
       }
+      time_t now = time(NULL);
+      long tk = rep_hist_get_weighted_time_known(rs->identity_digest, now);
+      double wfu =
+        rep_hist_get_weighted_fractional_uptime(rs->identity_digest, now);
+      double mtbf = rep_hist_get_stability(rs->identity_digest, now);
+      smartlist_add_asprintf(chunks, "stats wfu=%.3f tk=%lu mtbf=%.3f\n",
+                             wfu, tk, mtbf);
     }
   }
where the goal is to make it explicit what moria1 thinks of each relay, to make it possible for relay operators to figure out why it votes the way it does, and to make it possible for network-wide debugging to find bugs in our tk calculations.
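For the debugging use case, here is a rough sketch of how a consumer of the votes might pull the three numbers back out of a "stats" line in the format the patch emits. The struct and function names (relay_stats_t, parse_stats_line) are made up for illustration and are not part of tor; the line format is assumed to be exactly the one in the patch above.

```c
#include <stdio.h>

/* Hypothetical container for the three published values. */
typedef struct {
  double wfu;        /* weighted fractional uptime */
  unsigned long tk;  /* weighted time known */
  double mtbf;       /* stability value printed as "mtbf" */
} relay_stats_t;

/* Parse a line of the form "stats wfu=<float> tk=<int> mtbf=<float>".
 * Return 1 if all three fields were read, 0 otherwise. */
static int
parse_stats_line(const char *line, relay_stats_t *out)
{
  return sscanf(line, "stats wfu=%lf tk=%lu mtbf=%lf",
                &out->wfu, &out->tk, &out->mtbf) == 3;
}
```

A script comparing votes could run this over each relay entry and diff the values against what the relay operator expects.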
You can see it in action at
Is this something we want to upstream? Seems like maybe yes, on the theory that I wrote the patch and the transparency could be useful?
Some small improvements to consider:
* In the flag-thresholds line we print mtbf with %lu even though it is a double, so maybe we want to use %lu here too: the numbers are big, so decimal precision is kind of silly.
* Maybe we want a better paint color than "stats" for this line.
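For the %lu idea, a minimal standalone sketch of what the integer variant of the line could look like. format_stats_line is a hypothetical helper that uses plain snprintf rather than smartlist_add_asprintf so it compiles on its own; it assumes tk and mtbf are non-negative, since the casts to unsigned long are not meaningful otherwise.

```c
#include <stdio.h>

/* Hypothetical helper, not part of tor: write the per-relay line with
 * mtbf truncated to an integer, keeping three decimals only for wfu.
 * Assumes tk >= 0 and mtbf >= 0. Returns what snprintf returns. */
static int
format_stats_line(char *buf, size_t buflen,
                  double wfu, long tk, double mtbf)
{
  return snprintf(buf, buflen, "stats wfu=%.3f tk=%lu mtbf=%lu\n",
                  wfu, (unsigned long)tk, (unsigned long)mtbf);
}
```

The cast truncates rather than rounds, so a value like 1234567.891 would come out as 1234567.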
I tried to make the numbers that it publishes as close as possible to the numbers that the dir auths make their decisions on, to minimize the class of bugs where we still make flag decisions in weird ways but we can't figure out why from these published numbers.