CMU and Michigan relays having trouble reaching dir auths
Chapter one of the story
cmutornode (204.194.29.4:9001) has been offline for almost 5 days according to relay-search. But the relay is still running: you can still connect to it directly.
And if you connect to it as a bridge, after a while you get a stream of these log complaints:
Oct 21 20:31:57.202 [warn] Received http status code 404 ("Consensus is too old") from server 204.194.29.4:9001 while fetching consensus directory.
which indicates that it is having trouble fetching a current consensus.
A tcptraceroute to it from an IP address near moria1 works:
# tcptraceroute 204.194.29.4 9001
Selected device eno1, address 128.31.0.24, port 37125 for outgoing packets
Tracing the path to 204.194.29.4 on TCP port 9001, 30 hops max
1 guest.core-1.csail.mit.edu (128.31.0.2) 1.170 ms 0.898 ms 0.829 ms
2 dmz-rtr-2-csail.mit.edu (18.0.162.141) 0.783 ms 0.694 ms 0.676 ms
3 dmz-rtr-1-dmz-rtr-2-2.mit.edu (18.0.162.5) 0.621 ms 0.585 ms 0.586 ms
4 external-rtr-3-dmz-rtr-1.mit.edu (18.0.161.14) 1.076 ms 0.916 ms 0.974 ms
5 mit-re-nox1sumgw1.nox.org (18.2.4.109) 0.699 ms 1.022 ms 0.602 ms
6 192.5.89.21 0.749 ms 0.973 ms 0.703 ms
7 i2-re-nox300gw1.nox.org (192.5.89.222) 7.525 ms 7.077 ms 8.224 ms
8 fourhundredge-0-0-0-20.4079.core1.newy32aoa.net.internet2.edu (163.253.1.42) 16.669 ms 16.801 ms 18.265 ms
9 fourhundredge-0-0-0-2.4079.core1.ashb.net.internet2.edu (163.253.1.116) 16.744 ms 16.451 ms 17.531 ms
10 fourhundredge-0-0-0-0.4079.core1.pitt.net.internet2.edu (163.253.1.125) 17.871 ms 17.450 ms 16.740 ms
11 internet2-pitt-rebr-jrt-mi-et-0-0-1-1011.3rox.net (192.88.115.82) 15.596 ms 15.799 ms 15.681 ms
12 rtr-acm.cmu.3rox.net (147.73.16.120) 15.645 ms 15.719 ms 15.670 ms
13 * 192.12.32.20 16.221 ms *
14 CORE2-BORDER-FW.GW.CMU.NET (128.2.5.68) 16.212 ms 16.143 ms 16.317 ms
15 POD-A-CORE2.GW.CMU.NET (128.2.255.154) 16.122 ms 17.894 ms 16.202 ms
16 TOR-EXIT.CYLAB.CMU.EDU (204.194.29.4) [open] 16.548 ms 16.596 ms 16.644 ms
whereas one from moria1 does not work:
# tcptraceroute -s 128.31.0.39 204.194.29.4 9001
Selected device eno1, address 128.31.0.39, port 59161 for outgoing packets
Tracing the path to 204.194.29.4 on TCP port 9001, 30 hops max
1 guest.core-1.csail.mit.edu (128.31.0.2) 1.370 ms 10.742 ms 1.677 ms
2 dmz-rtr-2-csail.mit.edu (18.0.162.141) 0.803 ms 0.738 ms 0.599 ms
3 dmz-rtr-1-dmz-rtr-2-2.mit.edu (18.0.162.5) 0.668 ms 0.621 ms 0.630 ms
4 external-rtr-3-dmz-rtr-1.mit.edu (18.0.161.14) 0.780 ms 0.884 ms 0.981 ms
5 mit-re-nox1sumgw1.nox.org (18.2.4.109) 0.939 ms 0.866 ms 0.896 ms
6 192.5.89.21 0.718 ms 0.707 ms 0.710 ms
7 i2-re-nox300gw1.nox.org (192.5.89.222) 6.724 ms 8.207 ms 7.183 ms
8 fourhundredge-0-0-0-21.4079.core1.newy32aoa.net.internet2.edu (163.253.1.44) 16.535 ms 16.637 ms 17.016 ms
9 fourhundredge-0-0-0-2.4079.core1.ashb.net.internet2.edu (163.253.1.116) 18.319 ms 18.124 ms 17.553 ms
10 fourhundredge-0-0-0-0.4079.core1.pitt.net.internet2.edu (163.253.1.125) 17.085 ms 18.842 ms 17.961 ms
11 internet2-pitt-rebr-jrt-mi-et-0-0-1-1011.3rox.net (192.88.115.82) 15.598 ms 15.587 ms 15.833 ms
12 * * *
13 * * *
14 * * *
15 * * *
16 * * *
17 * * *
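The trace from moria1's address goes dark after hop 11 (192.88.115.82), while the trace from the neighboring address continues through rtr-acm.cmu.3rox.net and on into CMU. It looks like something in between is preventing the connection?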
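To compare traces like these without eyeballing them, here is a minimal sketch that reports the last hop that answered any probe. The helper names `parse_hops` and `last_responding_hop` are made up for illustration, and the parsing assumes the plain traceroute/tcptraceroute text layout shown above, nothing more.

```python
import re

# A hop line starts with the hop number, followed by the probe results.
HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")

def parse_hops(text):
    """Map hop number -> True if at least one probe at that hop got a reply."""
    hops = {}
    for line in text.splitlines():
        m = HOP_RE.match(line)
        if not m:
            continue  # header lines such as "Tracing the path to ..."
        fields = m.group(2).split()
        # A timed-out probe is printed as "*" (or "*^C" if interrupted).
        hops[int(m.group(1))] = any(not f.startswith("*") for f in fields)
    return hops

def last_responding_hop(text):
    """Return the highest hop number that answered, or None if none did."""
    answered = [n for n, ok in parse_hops(text).items() if ok]
    return max(answered) if answered else None
```

Fed the two tcptraceroute outputs above, this would report hop 16 for the trace from 128.31.0.24 and hop 11 for the trace from 128.31.0.39, which narrows the question to whatever sits just past 192.88.115.82.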
Chapter two of the story
VictorCryptoUMich (35.0.127.52:9001) has been blinking on and off on relay-search. relay-search says it went down around Oct 13, and then came back around Oct 18, but disappeared again on Oct 20. When I saw it come back on Oct 18, I noticed that the uptime listed in its descriptor was huge -- it was running the whole time!
Currently moria1, tor26, and bastet are voting Running for VictorCryptoUMich, and the other directory authorities do not find it to be reachable.
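As a rough illustration of why three Running votes would not be enough to keep the relay listed, here is a simplified majority check. It assumes nine authorities vote on the Running flag and that a strict majority is required; the real consensus computation in the directory spec has more detail than this, and `wins_majority` is a hypothetical helper, not anything from the Tor codebase.

```python
def wins_majority(votes_for, total_voters):
    """True if strictly more than half of the voters voted for the flag."""
    return 2 * votes_for > total_voters

# Assumption: nine authorities vote on the Running flag.
TOTAL_AUTHORITIES = 9
running_votes = 3  # moria1, tor26, and bastet, per the text above

print(wins_majority(running_votes, TOTAL_AUTHORITIES))  # prints False
```

Under those assumptions the relay would need at least five authorities to reach it, so a block that affects only some authorities could still be enough to make it blink in and out of relay-search.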
traceroute from moria1 (looking good):
$ traceroute 35.0.127.52
traceroute to 35.0.127.52 (35.0.127.52), 30 hops max, 60 byte packets
1 guest.core-1.csail.mit.edu (128.31.0.2) 1.018 ms 1.001 ms 0.989 ms
2 dmz-rtr-2-csail.mit.edu (18.0.162.141) 0.813 ms 0.861 ms 0.954 ms
3 dmz-rtr-1-dmz-rtr-2-2.mit.edu (18.0.162.5) 0.620 ms dmz-rtr-1-dmz-rtr-2-1.mit.edu (18.0.161.5) 0.630 ms 0.664 ms
4 external-rtr-3-dmz-rtr-1.mit.edu (18.0.161.14) 1.121 ms 1.109 ms 1.098 ms
5 mit-re-nox1sumgw1.nox.org (18.2.4.109) 0.717 ms 0.755 ms 0.749 ms
6 192.5.89.54 (192.5.89.54) 9.865 ms 9.777 ms 9.847 ms
7 nox-mghpcc-gw1-i2-re-chic.nox.org (192.5.89.254) 13.526 ms 13.405 ms 13.403 ms
8 fourhundredge-0-0-0-2.4079.core2.clev.net.internet2.edu (163.253.1.21) 28.516 ms 28.564 ms 28.493 ms
9 fourhundredge-0-0-0-2.4079.core2.eqch.net.internet2.edu (163.253.2.17) 27.579 ms 27.586 ms 27.574 ms
10 fourhundredge-0-0-0-3.4079.core1.star.net.internet2.edu (163.253.2.75) 27.675 ms 29.705 ms 29.746 ms
11 et-4-3-0-2061.r-bin-seb.umnet.umich.edu (198.71.45.248) 31.197 ms 31.185 ms 31.210 ms
12 l3-binseb1-binseb.r-bin-seb.umnet.umich.edu (192.12.80.17) 31.045 ms 31.041 ms 31.030 ms
13 l3-binseb-seb-net35vrf.r-seb.umnet.umich.edu (192.12.80.131) 31.015 ms 31.004 ms 31.004 ms
14 * * *
15 * * *
16 tor-exit.eecs.umich.edu (35.0.127.52) 31.099 ms 31.088 ms *
traceroute from verizon (also good):
[...]
11 et-4-1-5x3.sfld-cor-123net.mich.net (207.72.230.128) 26.938 ms 26.612 ms 26.558 ms
12 ae106x0.anar-um-ncdc-c1.mich.net (207.72.231.147) 30.159 ms 29.777 ms 29.873 ms
13 ae100x0.anar-um-ncdc-e1.mich.net (207.72.231.117) 30.166 ms 30.401 ms 29.615 ms
14 192.12.80.80 (192.12.80.80) 31.231 ms 30.806 ms 30.467 ms
15 l3-binncas1-binarbl.r-bin-arbl.umnet.umich.edu (192.12.80.15) 28.101 ms 30.983 ms 29.914 ms
16 l3-binarbl-cool-net35vrf.r-cool.umnet.umich.edu (192.12.80.129) 32.076 ms 32.292 ms 32.621 ms
17 * * *
18 * * *
19 tor-exit.eecs.umich.edu (35.0.127.52) 31.537 ms 31.435 ms 31.383 ms
traceroute from maatuska (not good):
traceroute to 35.0.127.52 (35.0.127.52), 30 hops max, 60 byte packets
1 171.25.193.22 (171.25.193.22) 2.156 ms 1.951 ms 1.963 ms
2 temporary-gw-232 (171.25.193.232) 2.596 ms 2.542 ms 2.506 ms
3 195.225.184.149 (195.225.184.149) 2.977 ms 2.882 ms 2.717 ms
4 be4073.rcr51.b038034-0.sto03.atlas.cogentco.com (149.11.76.121) 2.877 ms 2.870 ms 3.021 ms
5 be3531.ccr22.sto03.atlas.cogentco.com (154.54.38.37) 4.335 ms be3530.ccr21.sto03.atlas.cogentco.com (130.117.2.93) 4.115 ms be3531.ccr22.sto03.atlas.cogentco.com (154.54.38.37) 5.081 ms
6 be4593.ccr21.sto01.atlas.cogentco.com (154.54.75.85) 4.254 ms 5.252 ms be4649.ccr21.sto01.atlas.cogentco.com (130.117.3.129) 5.081 ms
7 telia.sto01.atlas.cogentco.com (130.117.14.234) 5.130 ms 5.717 ms *
8 sto-bb2-link.ip.twelve99.net (62.115.139.186) 5.855 ms 5.725 ms 5.642 ms
9 kbn-bb6-link.ip.twelve99.net (62.115.139.173) 15.039 ms 14.866 ms 14.695 ms
10 nyk-bb2-link.ip.twelve99.net (80.91.254.91) 96.380 ms 96.333 ms 96.194 ms
11 det-b3-link.ip.twelve99.net (62.115.137.149) 112.685 ms 112.548 ms 112.499 ms
12 wiscnet-ic-350607.ip.twelve99-cust.net (62.115.181.217) 113.788 ms 113.815 ms 114.365 ms
13 ae106x0.anar-um-ncdc-c1.mich.net (207.72.231.147) 119.153 ms 118.680 ms 118.488 ms
14 ae100x0.anar-um-ncdc-e1.mich.net (207.72.231.117) 118.661 ms 118.580 ms 118.291 ms
15 * * *
16 * * *
17 * * *
18 * * *
19 * * *
20 * * *
21 * * *
22 * * *
23 * * *
24 * * *
25 * * *
26 * * *
27 * * *
28 * * *
29 * * *
30 * * *
A tcptraceroute from near maatuska makes it farther, and based on my traceroute from Verizon, it would have worked if Linus had waited one more hop before hitting ^C:
Tracing the path to 35.0.127.52 on TCP port 80 (http), 30 hops max
1 rs0.dfri.net (171.25.193.1) 0.207 ms 0.142 ms 0.133 ms
2 temporary-gw-232 (171.25.193.232) 0.955 ms 0.818 ms 0.818 ms
3 195.225.184.149 1.045 ms 1.009 ms 0.983 ms
4 be4073.rcr51.b038034-0.sto03.atlas.cogentco.com (149.11.76.121) 1.240 ms 1.376 ms 1.150 ms
5 be3530.ccr21.sto03.atlas.cogentco.com (130.117.2.93) 2.642 ms 2.643 ms 3.176 ms
6 be4593.ccr21.sto01.atlas.cogentco.com (154.54.75.85) 2.891 ms 2.680 ms 3.018 ms
7 telia.sto01.atlas.cogentco.com (130.117.14.234) 3.312 ms 3.368 ms 3.440 ms
8 sto-bb2-link.ip.twelve99.net (62.115.139.186) 3.471 ms 3.394 ms 4.245 ms
9 kbn-bb6-link.ip.twelve99.net (62.115.139.173) 11.312 ms 11.707 ms 12.097 ms
10 nyk-bb2-link.ip.twelve99.net (80.91.254.91) 89.436 ms 89.660 ms 90.413 ms
11 det-b3-link.ip.twelve99.net (62.115.137.149) 106.859 ms 107.023 ms 106.952 ms
12 wiscnet-ic-350607.ip.twelve99-cust.net (62.115.181.217) 112.490 ms 112.397 ms 112.333 ms
13 ae106x0.anar-um-ncdc-c1.mich.net (207.72.231.147) 116.880 ms 117.010 ms 116.746 ms
14 ae100x0.anar-um-ncdc-e1.mich.net (207.72.231.117) 116.645 ms 116.586 ms 116.570 ms
15 192.12.80.80 118.405 ms 118.395 ms 118.390 ms
16 l3-binncas1-binarbl.r-bin-arbl.umnet.umich.edu (192.12.80.15) 118.276 ms 118.233 ms 118.161 ms
17 l3-binarbl-cool-net35vrf.r-cool.umnet.umich.edu (192.12.80.129) 118.309 ms 118.437 ms 118.632 ms
18 * * *
19 * *^C
Is there something new inside the internet stopping this University of Michigan relay from exchanging traffic with some of the directory authorities?
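Since plain traceroute probes can be filtered differently from TCP traffic to the ORPort, a direct TCP connect test from each vantage point is a useful cross-check. The following is a minimal sketch using only the Python standard library; `tcp_reachable` is a hypothetical helper, and a successful TCP connect is only a rough stand-in for whatever reachability test the directory authorities actually run.

```python
import socket

def tcp_reachable(host, port, timeout=10.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        # create_connection raises OSError on refusal, timeout, or routing failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running something like `tcp_reachable("35.0.127.52", 9001)` on each authority's network, and again from a neighboring address, would show whether the failure follows the authority addresses specifically.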
Chapter three
Over the weekend, abuse complaints started coming in to at least four of the directory authorities, saying that we are port scanning the internet on port 22.
We're investigating that question in #85 (closed), but the current best theory is that we're not actually sending those packets: some jerk is spoofing SYN packets from the directory authority addresses and sending them places, presumably to make people mad.
I mention these three chapters together because maybe the spoofed SYN floods have caused some internet administrator somewhere to add block rules for the directory authority addresses.