Stop measuring relays without a descriptor?
Digging into Torflow again, I found that it doesn't measure relays that have no descriptor, but sbws does.
It's in these lines: https://gitweb.torproject.org/pytorctl.git/tree/TorCtl.py#n1079

    for ns in nslist:
        try:
            r = self.get_router(ns)
            if r:
                new.append(r)
        except ErrorReply:
            bad_key += 1
            if "Running" in ns.flags:
                plog("INFO", "Running router "+ns.nickname+"="
                     +ns.idhex+" has no descriptor")

    def get_router(self, ns):
        """Fill in a Router class corresponding to a given NS class"""
        desc = self.sendAndRecv("GETINFO desc/id/" + ns.idhex + "\r\n")
        sig_start = desc.find("\nrouter-signature\n")+len("\nrouter-signature\n")
        fp_base64 = sha1(desc[:sig_start]).digest().encode("base64")[:-2]
        r = Router.build_from_desc(desc.split("\n"), ns)
        if fp_base64 != ns.orhash:
            plog("INFO", "Router descriptor for "+ns.idhex+" does not match ns fingerprint (NS @ "+str(ns.updated)+" vs Desc @ "+str(r.published)+")")
            return None
        else:
            return r

That would explain why we see relays that longclaw reports with weight 1: they are the ones without an observed bandwidth, or maybe also the ones with 0 observed bandwidth (#40051 (comment 2728756)). sbws reports them earlier than Torflow does, which makes sbws take longer until the next time it measures them (unless we change the sbws "prioritization" and prioritize the relays without a descriptor, instead of ordering by last measured).
On the other hand, there might be an upside to injecting traffic into them earlier.
If we do as Torflow does, then, @mikeperry, we probably don't need FetchDirInfoExtraEarly or FetchUselessDescriptors? (Reminder: Torflow doesn't measure the relays that are not in the last consensus either, though it reports them in the bandwidth file.)
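A minimal sketch of what "doing as Torflow" could look like on the sbws side: split the relays into those with a descriptor (measure them) and those without (skip them for now). This is not the sbws code. `split_by_descriptor` and `get_descriptor` are hypothetical names, and the dictionary is toy data standing in for tor's descriptor cache. In sbws the lookup could perhaps be backed by stem's `Controller.get_server_descriptor` with a `default` of `None`, but that wiring is an assumption.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def split_by_descriptor(
    fingerprints: Iterable[str],
    get_descriptor: Callable[[str], Optional[object]],
) -> Tuple[List[str], List[str]]:
    """Partition relay fingerprints by whether a descriptor is available.

    `get_descriptor` is a hypothetical lookup: it returns the descriptor,
    or returns None / raises LookupError when there is none.
    """
    measurable: List[str] = []
    skipped: List[str] = []
    for fp in fingerprints:
        try:
            desc = get_descriptor(fp)
        except LookupError:
            # Treat a failed lookup the same way as a missing descriptor.
            desc = None
        if desc is None:
            # Like Torflow: a relay without a descriptor is not measured.
            skipped.append(fp)
        else:
            measurable.append(fp)
    return measurable, skipped


# Toy data standing in for tor's descriptor cache (not real fingerprints).
known = {"AAAA": "descriptor for AAAA", "CCCC": "descriptor for CCCC"}
measurable, skipped = split_by_descriptor(["AAAA", "BBBB", "CCCC"], known.get)
print("measure:", measurable)
print("skip:", skipped)
```

The skipped relays could still be written to the bandwidth file, as Torflow does for relays that are not in the last consensus; that part is left out of the sketch.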
tpo/core/tor#40337 will probably help with debugging this.