Don't set router is_running=false after intentionally closing a directory connection
In a testing tor network with a few relays, clients, and an onion service, the onion service periodically calls
run_upload_descriptor_event() to upload its service descriptor. This eventually calls
directory_initiate_request(), which creates a new dir connection for the upload. The next time
run_upload_descriptor_event() runs, it first calls
close_directory_connections() to mark for close any existing or incomplete descriptor uploads for that service. Later (on the next 1-second libevent timer) the dir connection is closed, and since the upload didn't finish,
connection_dir_client_request_failed() sets the router's
is_running field to false.
The problem is that
run_upload_descriptor_event() can run shortly after a previous run; in a Shadow simulation, it can run only 2 seconds after the previous one. If the descriptor upload has not finished within those 2 seconds, the router will be marked as not running and will not be added to the routerlist when building circuits. Since this dir client request can fail often due to tor's new circuit timeout learning, in small tor networks we quickly run out of nodes in the routerlist, and end up with:
Jan 01 00:17:50.000 [info] compute_weighted_bandwidths(): Empty routerlist passed in to consensus weight node selection for rule weight as middle node
Jan 01 00:17:50.000 [info] router_choose_random_node(): We couldn't find any live, stable routers; falling back to list of all routers.
Jan 01 00:17:50.000 [info] compute_weighted_bandwidths(): Empty routerlist passed in to consensus weight node selection for rule weight as middle node
Jan 01 00:17:50.000 [warn] No available nodes when trying to choose node. Failing.
Jan 01 00:17:50.000 [info] pick_needed_intro_points(): Unable to find a suitable node to be an introduction point for service r4aj4kaqf46mala2yykldkvwrrwjagab2qppuqtvgdxwh6spsulwu2qd.
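The sequence described above can be illustrated with a small toy model. This is only a sketch and not tor's code: every toy_* name, the closed_intentionally flag, and the honor_intent switch are hypothetical and exist only for illustration. The real connection_dir_client_request_failed() does considerably more than flip one field. The model assumes that every upload is interrupted by the next run of the upload event before it finishes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_RELAYS 16

/* Toy stand-ins for illustration only; these are NOT tor's real structures. */
typedef struct {
  bool is_running;
} toy_router_t;

typedef struct {
  toy_router_t *router;       /* directory the descriptor was being sent to */
  bool upload_finished;
  bool closed_intentionally;  /* hypothetical flag, not present in tor */
} toy_dir_conn_t;

/* Models only the effect described in this report: a failed dir client
 * request marks the router as not running. With honor_intent set, an
 * intentional close leaves the router untouched. */
void
toy_request_failed(toy_dir_conn_t *conn, bool honor_intent)
{
  if (honor_intent && conn->closed_intentionally)
    return;
  conn->router->is_running = false;
}

/* Models a re-run of the upload event closing a pending upload. */
void
toy_close_pending_upload(toy_dir_conn_t *conn, bool honor_intent)
{
  assert(conn != NULL && conn->router != NULL);
  if (conn->upload_finished)
    return; /* nothing pending, nothing to close */
  conn->closed_intentionally = true;
  toy_request_failed(conn, honor_intent);
}

/* Start one upload per relay, interrupt each one before it finishes, and
 * return how many relays are still considered running afterwards. */
size_t
toy_simulate(size_t n_relays, bool honor_intent)
{
  toy_router_t routers[TOY_MAX_RELAYS];
  size_t running = 0;

  assert(n_relays <= TOY_MAX_RELAYS);
  for (size_t i = 0; i < n_relays; i++) {
    routers[i].is_running = true;
    toy_dir_conn_t conn = { &routers[i], false, false };
    toy_close_pending_upload(&conn, honor_intent);
  }
  for (size_t i = 0; i < n_relays; i++) {
    if (routers[i].is_running)
      running++;
  }
  return running;
}
```

In this model, toy_simulate(5, false) leaves no relay marked as running, which corresponds to the empty routerlist in the log above, while toy_simulate(5, true) leaves all five running.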
I propose to not mark a router as "not running" if tor intentionally closes a directory connection (except maybe for a