When excluding nodes by country, exclude {??} and {A1} too

This is ticket 7706, reported by "bugcatcher".  The rationale here
is that if somebody says 'ExcludeNodes {tv}', then they probably
don't just want to block definitely Tuvaluan nodes: they also want
to block nodes that have unknown country, since for all they know
such nodes are also in Tuvalu.

This behavior is controlled by a new GeoIPExcludeUnknown autobool
option.  With the default (auto) setting, we exclude ?? and A1 if
any country is excluded.  If the option is 1, we add ?? and A1
unconditionally; if the option is 0, we never add them.

(Right now our geoip file doesn't actually seem to include A1: I'm
including it here in case it comes back.)

This feature only takes effect if you have a GeoIP file.  Otherwise
you'd be excluding every node.
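
The three settings described above can be summarized in a small standalone sketch. It is a model of the rule only, not the code from this commit: the function name `should_add_unknown_ccs` and its parameters are hypothetical, and the -1/0/1 encoding of the autobool is assumed from the `== -1` test in the options_act() hunk.

```c
#include <assert.h>

/* Hypothetical model of the GeoIPExcludeUnknown rule; not Tor code.
 * setting:         -1 = auto, 0 = never add, 1 = always add (assumed encoding).
 * geoip_loaded:    nonzero if GeoIP data is available.
 * n_country_codes: number of country codes already in the set being checked.
 * Returns nonzero if ?? and A1 should be added to that set. */
int
should_add_unknown_ccs(int setting, int geoip_loaded, int n_country_codes)
{
  assert(setting == -1 || setting == 0 || setting == 1);
  if (!geoip_loaded)
    return 0; /* without GeoIP data every node is "unknown" */
  if (setting == 0)
    return 0; /* feature disabled */
  if (setting == 1)
    return 1; /* unconditional */
  return n_country_codes > 0; /* auto: only if some country is excluded */
}
```

In this model the check is made once per set, so under 'auto' a set with no country codes is left alone even if another set lists one.
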
  o Minor features:
    - When any country code is listed in ExcludeNodes or
      ExcludeExitNodes, and we have GeoIP information, also exclude
      all nodes with unknown countries ({??} and {A1} if
      present). This behavior is controlled by the new
      GeoIPExcludeUnknown option: you can make such nodes always
      excluded with 'GeoIPExcludeUnknown 1', and disable the feature
      with 'GeoIPExcludeUnknown 0'. Setting 'GeoIPExcludeUnknown auto'
      gets you the default behavior. Implements feature 7706.
@@ -689,6 +689,14 @@ The following options are useful only for clients (that is, if
node listed in ExcludeNodes is automatically considered to be part of this
list too. See also the caveats on the "ExitNodes" option below.
**GeoIPExcludeUnknown** **0**|**1**|**auto**::
If this option is set to 'auto', then whenever any country code is set in
ExcludeNodes or ExcludeExitNodes, all nodes with unknown country ({??} and
possibly {A1}) are treated as excluded as well. If this option is set to
'1', then all unknown countries are treated as excluded in ExcludeNodes
and ExcludeExitNodes. If this option is set to '0', unknown countries are
never added. This option has no effect when a GeoIP file isn't
configured or can't be found. (Default: auto)
**ExitNodes** __node__,__node__,__...__::
A list of identity fingerprints, nicknames, country codes and address
patterns of nodes to use as exit node---that is, a
@@ -242,6 +242,7 @@ static config_var_t option_vars_[] = {
V(FetchHidServDescriptors, BOOL, "1"),
V(FetchUselessDescriptors, BOOL, "0"),
V(FetchV2Networkstatus, BOOL, "0"),
V(GeoIPExcludeUnknown, AUTOBOOL, "auto"),
#ifdef _WIN32
V(GeoIPFile, FILENAME, "<default>"),
V(GeoIPv6File, FILENAME, "<default>"),
@@ -1567,6 +1568,18 @@ options_act(const or_options_t *old_options)
config_maybe_load_geoip_files_(options, old_options);
  if (geoip_is_loaded(AF_INET) && options->GeoIPExcludeUnknown) {
    /* ExcludeUnknown is true or "auto" */
    const int is_auto = options->GeoIPExcludeUnknown == -1;
    int changed;
    changed = routerset_add_unknown_ccs(&options->ExcludeNodes, is_auto);
    changed += routerset_add_unknown_ccs(&options->ExcludeExitNodes, is_auto);
    if (changed)
      routerset_add_unknown_ccs(&options->ExcludeExitNodesUnion_, is_auto);
  }
if (options->CellStatistics || options->DirReqStatistics ||
options->EntryStatistics || options->ExitPortStatistics ||
options->ConnDirectionStatistics ||
@@ -3840,6 +3840,11 @@ typedef struct {
char *GeoIPFile;
char *GeoIPv6File;
/** Autobool: if auto, then any attempt to Exclude{Exit,}Nodes a particular
* country code will exclude all nodes in ?? and A1. If true, all nodes in
* ?? and A1 are excluded. Has no effect if we don't know any GeoIP data. */
int GeoIPExcludeUnknown;
/** If true, SIGHUP should reload the torrc. Sometimes controllers want
* to make this false. */
int ReloadTorrcOnSIGHUP;
@@ -226,6 +226,45 @@ routerset_contains(const routerset_t *set, const tor_addr_t *addr,
  return 0;
}

/** If *<b>setp</b> includes at least one country code, or if
 * <b>only_if_some_cc_set</b> is 0, add the ?? and A1 country codes to
 * *<b>setp</b>, creating it as needed.  Return true iff *<b>setp</b> changed.
 */
int
routerset_add_unknown_ccs(routerset_t **setp, int only_if_some_cc_set)
{
  routerset_t *set;
  int add_unknown, add_a1;
  if (only_if_some_cc_set) {
    if (!*setp || smartlist_len((*setp)->country_names) == 0)
      return 0;
  }
  if (!*setp)
    *setp = routerset_new();

  set = *setp;

  /* Only add a code that is not already listed and that the loaded GeoIP
   * data actually knows about. */
  add_unknown = ! smartlist_contains_string_case(set->country_names, "??") &&
    geoip_get_country("??") >= 0;
  add_a1 = ! smartlist_contains_string_case(set->country_names, "a1") &&
    geoip_get_country("A1") >= 0;

  if (add_unknown) {
    smartlist_add(set->country_names, tor_strdup("??"));
    smartlist_add(set->list, tor_strdup("{??}"));
  }
  if (add_a1) {
    smartlist_add(set->country_names, tor_strdup("a1"));
    smartlist_add(set->list, tor_strdup("{a1}"));
  }

  if (add_unknown || add_a1) {
    return 1;
  }
  return 0;
}

/** Return true iff we can tell that <b>ei</b> is a member of <b>set</b>. */
routerset_contains_extendinfo(const routerset_t *set, const extend_info_t *ei)
@@ -31,6 +31,7 @@ int routerset_contains_node(const routerset_t *set, const node_t *node);
void routerset_get_all_nodes(smartlist_t *out, const routerset_t *routerset,
const routerset_t *excludeset,
int running_only);
int routerset_add_unknown_ccs(routerset_t **setp, int only_if_some_cc_set);
#if 0
void routersets_get_node_disjunction(smartlist_t *target,
const smartlist_t *source,
