We don't consider the Guard flag when picking a directory guard
Back in Tor 0.2.3, when we wanted a new guard, we ended up calling
node = choose_good_entry_server(CIRCUIT_PURPOSE_C_GENERAL, NULL);
which called
choice = router_choose_random_node(excluded, options->ExcludeNodes, flags);
with flags that included CRN_NEED_GUARD, meaning that inside router_choose_random_node(), when it called
router_add_running_nodes_to_smartlist(sl, allow_invalid,
need_uptime, need_capacity,
need_guard, need_desc);
we would check
if (node_is_unreliable(node, need_uptime, need_capacity, need_guard))
continue;
In sum, we would avoid considering relays that didn't have the Guard flag, when picking a new guard.
However, starting with Tor 0.2.4, the logic inside add_an_entry_guard() is now:
} else if (!for_directory) {
node = choose_good_entry_server(CIRCUIT_PURPOSE_C_GENERAL, NULL);
if (!node)
return NULL;
} else {
const routerstatus_t *rs;
rs = router_pick_directory_server(MICRODESC_DIRINFO|V3_DIRINFO,
PDS_FOR_GUARD);
[...]
}
You would think router_pick_directory_server() would look at PDS_FOR_GUARD and choose to avoid relays that don't have the Guard flag. Alas, the only use of for_guard in this function is:
if (for_guard && node->using_as_guard)
continue; /* Don't make the same node a guard twice. */
I think we need to add another clause into this "if it's unsuitable, continue" list inside router_pick_directory_server_impl().
Right now this bug bites us because the first thing a bootstrapping Tor client does is fetch dir info, so that's when it notices it doesn't have enough guards, so the first three guards it picks are picked via router_pick_directory_server(), and then later when we want to use a guard for a normal circuit, we already find that we have workable guards.
In particular, the bug went in with commit 0c4210f (which looks like it went into 0.2.4.8-alpha).
Now, a separate issue here is that Nick and I were both under the impression that a client will avoid using one of its guards if it loses the Guard flag. That appears to not be happening here either. I'll leave that for a separate bug. :)
Thanks to Mohsen Imani for discovering and reporting!