CMU and Michigan relays having trouble reaching dir auths

Chapter one of the story

cmutornode (204.194.29.4:9001) has been offline for almost 5 days according to relay-search. But the relay is still running: you can connect to it directly.

And if you connect to it as a bridge, after a while you get a stream of these log complaints:

Oct 21 20:31:57.202 [warn] Received http status code 404 ("Consensus is too old") from server 204.194.29.4:9001 while fetching consensus directory.

which indicates that the relay itself is having trouble fetching a current consensus.

A tcptraceroute to it from an IP address near moria1 works:

# tcptraceroute 204.194.29.4 9001
Selected device eno1, address 128.31.0.24, port 37125 for outgoing packets
Tracing the path to 204.194.29.4 on TCP port 9001, 30 hops max
 1  guest.core-1.csail.mit.edu (128.31.0.2)  1.170 ms  0.898 ms  0.829 ms
 2  dmz-rtr-2-csail.mit.edu (18.0.162.141)  0.783 ms  0.694 ms  0.676 ms
 3  dmz-rtr-1-dmz-rtr-2-2.mit.edu (18.0.162.5)  0.621 ms  0.585 ms  0.586 ms
 4  external-rtr-3-dmz-rtr-1.mit.edu (18.0.161.14)  1.076 ms  0.916 ms  0.974 ms
 5  mit-re-nox1sumgw1.nox.org (18.2.4.109)  0.699 ms  1.022 ms  0.602 ms
 6  192.5.89.21  0.749 ms  0.973 ms  0.703 ms
 7  i2-re-nox300gw1.nox.org (192.5.89.222)  7.525 ms  7.077 ms  8.224 ms
 8  fourhundredge-0-0-0-20.4079.core1.newy32aoa.net.internet2.edu (163.253.1.42)  16.669 ms  16.801 ms  18.265 ms
 9  fourhundredge-0-0-0-2.4079.core1.ashb.net.internet2.edu (163.253.1.116)  16.744 ms  16.451 ms  17.531 ms
10  fourhundredge-0-0-0-0.4079.core1.pitt.net.internet2.edu (163.253.1.125)  17.871 ms  17.450 ms  16.740 ms
11  internet2-pitt-rebr-jrt-mi-et-0-0-1-1011.3rox.net (192.88.115.82)  15.596 ms  15.799 ms  15.681 ms
12  rtr-acm.cmu.3rox.net (147.73.16.120)  15.645 ms  15.719 ms  15.670 ms
13  * 192.12.32.20 16.221 ms *
14  CORE2-BORDER-FW.GW.CMU.NET (128.2.5.68)  16.212 ms  16.143 ms  16.317 ms
15  POD-A-CORE2.GW.CMU.NET (128.2.255.154)  16.122 ms  17.894 ms  16.202 ms
16  TOR-EXIT.CYLAB.CMU.EDU (204.194.29.4) [open]  16.548 ms  16.596 ms  16.644 ms

whereas one from moria1 does not work:

# tcptraceroute -s 128.31.0.39 204.194.29.4 9001
Selected device eno1, address 128.31.0.39, port 59161 for outgoing packets
Tracing the path to 204.194.29.4 on TCP port 9001, 30 hops max
 1  guest.core-1.csail.mit.edu (128.31.0.2)  1.370 ms  10.742 ms  1.677 ms
 2  dmz-rtr-2-csail.mit.edu (18.0.162.141)  0.803 ms  0.738 ms  0.599 ms
 3  dmz-rtr-1-dmz-rtr-2-2.mit.edu (18.0.162.5)  0.668 ms  0.621 ms  0.630 ms
 4  external-rtr-3-dmz-rtr-1.mit.edu (18.0.161.14)  0.780 ms  0.884 ms  0.981 ms
 5  mit-re-nox1sumgw1.nox.org (18.2.4.109)  0.939 ms  0.866 ms  0.896 ms
 6  192.5.89.21  0.718 ms  0.707 ms  0.710 ms
 7  i2-re-nox300gw1.nox.org (192.5.89.222)  6.724 ms  8.207 ms  7.183 ms
 8  fourhundredge-0-0-0-21.4079.core1.newy32aoa.net.internet2.edu (163.253.1.44)  16.535 ms  16.637 ms  17.016 ms  
 9  fourhundredge-0-0-0-2.4079.core1.ashb.net.internet2.edu (163.253.1.116)  18.319 ms  18.124 ms  17.553 ms
10  fourhundredge-0-0-0-0.4079.core1.pitt.net.internet2.edu (163.253.1.125)  17.085 ms  18.842 ms  17.961 ms
11  internet2-pitt-rebr-jrt-mi-et-0-0-1-1011.3rox.net (192.88.115.82)  15.598 ms  15.587 ms  15.833 ms
12  * * *
13  * * *
14  * * *
15  * * *
16  * * *
17  * * *

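The trace from moria1's address gets its last reply at hop 11 (192.88.115.82, 3rox.net), while the working trace goes on to rtr-acm.cmu.3rox.net and into CMU. It looks like something between 3rox and CMU is preventing the connection?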
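To compare traces like the two above without reading them by eye, here is a minimal Python sketch that reports the last hop that answered. The function name `last_responding_hop` is hypothetical, and it assumes the usual traceroute/tcptraceroute layout: a hop number, then either probe times in `ms` or `*` for each probe.

```python
import re

# A hop line starts with the hop number, e.g. "11  host (192.0.2.1)  15.5 ms" or "12  * * *".
HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")
# A probe that got a reply shows a round-trip time such as "15.598 ms".
RTT = re.compile(r"\d+\.\d+ ms")

def last_responding_hop(trace_text):
    """Return (hop_number, rest_of_line) for the last hop that answered, or None."""
    last = None
    for line in trace_text.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # header lines such as "Tracing the path to ..."
        hop, rest = int(match.group(1)), match.group(2).strip()
        if RTT.search(rest):
            last = (hop, rest)
    return last
```

Fed the two CMU traces, this should point at hop 16 (the relay itself) for the working trace and hop 11 for the one sourced from moria1's address, which narrows down where to ask.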

Chapter two of the story

VictorCryptoUMich (35.0.127.52:9001) has been blinking on and off on relay-search. relay-search says it went down around Oct 13, and then came back around Oct 18, but disappeared again on Oct 20. When I saw it come back on Oct 18, I noticed that the uptime listed in its descriptor was huge -- it was running the whole time!

Currently moria1, tor26, and bastet are voting Running for VictorCryptoUMich, and the other directory authorities do not find it to be reachable.
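For context on why three Running votes are not enough, here is a rough Python sketch of the majority rule. It assumes nine voting directory authorities and that a relay is listed as Running only when more than half of them vote Running. Both are simplifications of the real consensus rules, and `gets_running_flag` is a hypothetical helper, not something in tor.

```python
def gets_running_flag(running_votes, total_authorities=9):
    """True if strictly more than half of the authorities vote Running.

    total_authorities=9 is an assumption about the current number of
    voting directory authorities; pass a different value if it changes.
    """
    return 2 * running_votes > total_authorities

# The three authorities named above that currently vote Running.
voting_running = {"moria1", "tor26", "bastet"}
print(gets_running_flag(len(voting_running)))
```

Under these assumptions the relay would need at least two more authorities to reach it before it shows up as Running again.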

traceroute from moria1 (looking good):

$ traceroute 35.0.127.52
traceroute to 35.0.127.52 (35.0.127.52), 30 hops max, 60 byte packets
 1  guest.core-1.csail.mit.edu (128.31.0.2)  1.018 ms  1.001 ms  0.989 ms
 2  dmz-rtr-2-csail.mit.edu (18.0.162.141)  0.813 ms  0.861 ms  0.954 ms
 3  dmz-rtr-1-dmz-rtr-2-2.mit.edu (18.0.162.5)  0.620 ms dmz-rtr-1-dmz-rtr-2-1.mit.edu (18.0.161.5)  0.630 ms  0.664 ms
 4  external-rtr-3-dmz-rtr-1.mit.edu (18.0.161.14)  1.121 ms  1.109 ms  1.098 ms
 5  mit-re-nox1sumgw1.nox.org (18.2.4.109)  0.717 ms  0.755 ms  0.749 ms
 6  192.5.89.54 (192.5.89.54)  9.865 ms  9.777 ms  9.847 ms
 7  nox-mghpcc-gw1-i2-re-chic.nox.org (192.5.89.254)  13.526 ms  13.405 ms  13.403 ms
 8  fourhundredge-0-0-0-2.4079.core2.clev.net.internet2.edu (163.253.1.21)  28.516 ms  28.564 ms  28.493 ms
 9  fourhundredge-0-0-0-2.4079.core2.eqch.net.internet2.edu (163.253.2.17)  27.579 ms  27.586 ms  27.574 ms
10  fourhundredge-0-0-0-3.4079.core1.star.net.internet2.edu (163.253.2.75)  27.675 ms  29.705 ms  29.746 ms
11  et-4-3-0-2061.r-bin-seb.umnet.umich.edu (198.71.45.248)  31.197 ms  31.185 ms  31.210 ms
12  l3-binseb1-binseb.r-bin-seb.umnet.umich.edu (192.12.80.17)  31.045 ms  31.041 ms  31.030 ms
13  l3-binseb-seb-net35vrf.r-seb.umnet.umich.edu (192.12.80.131)  31.015 ms  31.004 ms  31.004 ms
14  * * *
15  * * *
16  tor-exit.eecs.umich.edu (35.0.127.52)  31.099 ms  31.088 ms *

traceroute from Verizon (also good):

[...]
11  et-4-1-5x3.sfld-cor-123net.mich.net (207.72.230.128)  26.938 ms  26.612 ms  26.558 ms
12  ae106x0.anar-um-ncdc-c1.mich.net (207.72.231.147)  30.159 ms  29.777 ms  29.873 ms
13  ae100x0.anar-um-ncdc-e1.mich.net (207.72.231.117)  30.166 ms  30.401 ms  29.615 ms
14  192.12.80.80 (192.12.80.80)  31.231 ms  30.806 ms  30.467 ms
15  l3-binncas1-binarbl.r-bin-arbl.umnet.umich.edu (192.12.80.15)  28.101 ms  30.983 ms  29.914 ms
16  l3-binarbl-cool-net35vrf.r-cool.umnet.umich.edu (192.12.80.129)  32.076 ms  32.292 ms  32.621 ms
17  * * *
18  * * *
19  tor-exit.eecs.umich.edu (35.0.127.52)  31.537 ms  31.435 ms  31.383 ms

traceroute from maatuska (not good):

traceroute to 35.0.127.52 (35.0.127.52), 30 hops max, 60 byte packets
 1  171.25.193.22 (171.25.193.22)  2.156 ms  1.951 ms  1.963 ms
 2  temporary-gw-232 (171.25.193.232)  2.596 ms  2.542 ms  2.506 ms
 3  195.225.184.149 (195.225.184.149)  2.977 ms  2.882 ms  2.717 ms
 4  be4073.rcr51.b038034-0.sto03.atlas.cogentco.com (149.11.76.121)  2.877 ms  2.870 ms  3.021 ms
 5  be3531.ccr22.sto03.atlas.cogentco.com (154.54.38.37)  4.335 ms be3530.ccr21.sto03.atlas.cogentco.com (130.117.2.93)  4.115 ms be3531.ccr22.sto03.atlas.cogentco.com (154.54.38.37)  5.081 ms
 6  be4593.ccr21.sto01.atlas.cogentco.com (154.54.75.85)  4.254 ms  5.252 ms be4649.ccr21.sto01.atlas.cogentco.com (130.117.3.129)  5.081 ms
 7  telia.sto01.atlas.cogentco.com (130.117.14.234)  5.130 ms  5.717 ms *
 8  sto-bb2-link.ip.twelve99.net (62.115.139.186)  5.855 ms  5.725 ms  5.642 ms
 9  kbn-bb6-link.ip.twelve99.net (62.115.139.173)  15.039 ms  14.866 ms  14.695 ms
10  nyk-bb2-link.ip.twelve99.net (80.91.254.91)  96.380 ms  96.333 ms  96.194 ms
11  det-b3-link.ip.twelve99.net (62.115.137.149)  112.685 ms  112.548 ms  112.499 ms
12  wiscnet-ic-350607.ip.twelve99-cust.net (62.115.181.217)  113.788 ms  113.815 ms  114.365 ms
13  ae106x0.anar-um-ncdc-c1.mich.net (207.72.231.147)  119.153 ms  118.680 ms  118.488 ms
14  ae100x0.anar-um-ncdc-e1.mich.net (207.72.231.117)  118.661 ms  118.580 ms  118.291 ms
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *

tcptraceroute from near maatuska makes it farther, and based on my traceroute from Verizon, if Linus had waited one more hop before ^C'ing it would have worked:

Tracing the path to 35.0.127.52 on TCP port 80 (http), 30 hops max
 1  rs0.dfri.net (171.25.193.1)  0.207 ms  0.142 ms  0.133 ms
 2  temporary-gw-232 (171.25.193.232)  0.955 ms  0.818 ms  0.818 ms
 3  195.225.184.149  1.045 ms  1.009 ms  0.983 ms
 4  be4073.rcr51.b038034-0.sto03.atlas.cogentco.com (149.11.76.121)  1.240 ms  1.376 ms  1.150 ms
 5  be3530.ccr21.sto03.atlas.cogentco.com (130.117.2.93)  2.642 ms  2.643 ms  3.176 ms
 6  be4593.ccr21.sto01.atlas.cogentco.com (154.54.75.85)  2.891 ms  2.680 ms  3.018 ms
 7  telia.sto01.atlas.cogentco.com (130.117.14.234)  3.312 ms  3.368 ms  3.440 ms
 8  sto-bb2-link.ip.twelve99.net (62.115.139.186)  3.471 ms  3.394 ms  4.245 ms
 9  kbn-bb6-link.ip.twelve99.net (62.115.139.173)  11.312 ms  11.707 ms  12.097 ms
10  nyk-bb2-link.ip.twelve99.net (80.91.254.91)  89.436 ms  89.660 ms  90.413 ms
11  det-b3-link.ip.twelve99.net (62.115.137.149)  106.859 ms  107.023 ms  106.952 ms
12  wiscnet-ic-350607.ip.twelve99-cust.net (62.115.181.217)  112.490 ms  112.397 ms  112.333 ms
13  ae106x0.anar-um-ncdc-c1.mich.net (207.72.231.147)  116.880 ms  117.010 ms  116.746 ms
14  ae100x0.anar-um-ncdc-e1.mich.net (207.72.231.117)  116.645 ms  116.586 ms  116.570 ms
15  192.12.80.80  118.405 ms  118.395 ms  118.390 ms
16  l3-binncas1-binarbl.r-bin-arbl.umnet.umich.edu (192.12.80.15)  118.276 ms  118.233 ms  118.161 ms
17  l3-binarbl-cool-net35vrf.r-cool.umnet.umich.edu (192.12.80.129)  118.309 ms  118.437 ms  118.632 ms
18  * * *
19  * *^C

Is there something new inside the internet stopping this University of Michigan relay from talking to/from some of the directory authorities?
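One way to check this from each directory authority, without counting traceroute hops, is a plain TCP connect to the relay's ORPort, optionally bound to a specific local address the way `tcptraceroute -s` is used earlier. The sketch below is a minimal Python version; `tcp_reachable` is a hypothetical helper, and the source address must be one that is actually configured on the machine running it.

```python
import socket

def tcp_reachable(host, port, source_ip=None, timeout=10.0):
    """Return True if a TCP connection to host:port completes within timeout.

    If source_ip is given, the outgoing connection is bound to that local
    address (port 0 lets the kernel pick the source port).
    """
    source = (source_ip, 0) if source_ip else None
    try:
        with socket.create_connection((host, port), timeout=timeout,
                                      source_address=source):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unroutable addresses.
        return False
```

For example, `tcp_reachable("35.0.127.52", 9001)` run on each authority would show which ones can complete a handshake. It only tests the forward direction, so it will not tell us whether the relay can reach the authorities.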

Chapter three

Over the weekend abuse complaints started to come in to at least four of the directory authorities, complaining that we are port scanning the internet on port 22.

We're investigating that in #85, but the current best theory is that we're not actually sending those packets: some jerk is spoofing SYN packets from the directory authority addresses and sending them places, presumably to make people mad.

I mention these three chapters together because maybe the SYN floods have caused some internet administrator somewhere to add some block rules.