We have a growing set of bridges that firewall their ORPort because clients connect on their obfs4 port. The goal is to slow enumeration attacks by not exposing an ORPort. These started off as the default bridges shipped in Tor Browser, but now that we're encouraging more bridge operators to do it, the issue is becoming broader, and I'm seeing a growing number of bridge operators get confused.
But because the bridge authority only knows how to test the ORPort, it doesn't give the bridges the Running flag. The bridgestrap tool measures obfs4 reachability, and outputs a file that bridgedb/rdsys use to decide whether the bridge is actually worth giving out.
But in the meantime, on the relay-search page, the bridge gets a red circle with a tooltip saying "This bridge is offline". See e.g. GeorgetownPontem, https://metrics.torproject.org/rs.html#details/7C95AED7256E1D10D134942532CC72AD73AC1BD8
We should pull in the data set from bridgestrap, like bridgedb/rdsys does, and if the bridge has an obfs4 port, and bridgestrap gives it a thumbs up, then we should give it a green circle on relay-search too.
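As a rough sketch of what relay-search would do with that data set: assuming the bridgestrap output can be loaded as a mapping from bridge fingerprint to a result record with a boolean reachability field (the `functional` field name and the overall layout here are assumptions, not bridgestrap's real schema), the circle color decision reduces to:

```python
from typing import Dict, Optional


def bridge_circle(fingerprint: str,
                  bridgestrap_results: Dict[str, dict]) -> str:
    """Return 'green' if bridgestrap reports the bridge's obfs4 port
    reachable, 'red' otherwise.  The result-record layout is a guess."""
    entry: Optional[dict] = bridgestrap_results.get(fingerprint)
    if entry and entry.get("functional"):
        return "green"
    return "red"


# Hypothetical bridgestrap result set keyed by bridge fingerprint:
results = {
    "7C95AED7256E1D10D134942532CC72AD73AC1BD8": {"functional": True},
}
print(bridge_circle("7C95AED7256E1D10D134942532CC72AD73AC1BD8", results))  # green
```

A bridge missing from the result set falls back to red, which matches today's behavior for untested bridges.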
I think we are done here with the bridgestrap improvements we have; see tpo/network-health/team#318 for details. That's probably as good as we can get right now due to bridgestrap limitations. For further improvements we might need tpo/core/arti#717 fixed.
> If we have a way to know that the relay is older than 5 years, we should check it when deciding whether to display this banner. :)
Yeah. I think we have a way to check:
Another angle to consider here is the "This relay appears to be less than 2 weeks old." banner -- this banner is showing up as a false positive on the relays that are impacted by this bug, and maybe that too is confusing people.
If we have a way to know that the relay is older than 5 years, we should check it when deciding whether to display this banner. :)
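A minimal sketch of such a check, using the `first_seen` timestamp format Onionoo already exposes (whether relay-search has that value available at banner-rendering time is an assumption):

```python
from datetime import datetime, timezone


def older_than(first_seen: str, years: float, now: datetime = None) -> bool:
    """True if a relay's Onionoo first_seen ('YYYY-MM-DD HH:MM:SS', UTC)
    is more than `years` years in the past."""
    seen = datetime.strptime(first_seen, "%Y-%m-%d %H:%M:%S")
    seen = seen.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - seen).days > years * 365.25


# The 2015 relay from the report in this thread is well past 5 years:
print(older_than("2015-04-15 23:00:00", 5))  # True
```

A relay older than 5 years could then suppress the "less than 2 weeks old" banner regardless of what the buggy first_seen pipeline says.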
baurine from IRC was kind enough to notice that bgp.tools has a "uni" category, and to write the attached script to compare Onionoo's set of Tor relays to bgp.tools's set of uni netblocks.
Right, I think we definitely don't want a general notification bar -- too many people would see it when they aren't even planning to look at first-seen.
I was thinking of some note right next to first-seen on the page, like "First Seen (currently buggy)", with "currently buggy" being a link to the GitLab ticket.
But yeah, if we know that there are some relays that are probably impacted and some relays that definitely aren't, we can be more precise on when we change the page this way.
We had issues with our `first_seen` date for relays (see onionoo#40042). It would be smart to show some notification on relay-search, as we still have corner cases (e.g. relays older than 5 years) where the bug is unfixed.
We could think about a general notification bar but it would be neat if we only showed the notification for relays that are actually affected, if possible.
Just discovered that my relay, at least 8 years old (first seen 2015-04-15 23:00:00), is labeled "This relay appears to be less than 2 weeks old" on https://metrics.torproject.org/rs.html
Clicked through a few of these relays; same story with all of them. Some show a last-restart date that is older than their first-seen date...
I think we are done here. This got resolved with the help of onionoo#40042 (closed) as well as we could. A full recovery will only be possible after we deploy our new metrics pipeline.
So, it seemed that box was down or otherwise not accessible around 2024-01-29 19:00:00? FWIW, comparing the output of
grep "2024-01-29 19:00:00" -r . | wc -l
run on both backend-01 and backend-02 gives 7106(!) on the former and 4 on the latter (2 of those 4 were last_seen at that time, 1 (BF173C1B547791C5BD640B4A688AD568B4C06F03) had its last changed address or port then, and only 34FEE8FEF1980AC34671A80FD97BDE7076EE912C has it as first_seen at that point in time, which seems to be a mistake).
Doing a
grep "\"first_seen\":\"2024-01-10 16:00:00\"" -r . | wc -l
on backend-01 and backend-02 gives 1 on the former and 8075(!) on the latter. As I said, they are differently busted...
On 2024-01-29 we had some "Run onionoo update service. Next run will start in 60 minutes." notices sent to our metrics-alerts alias (for both backends) between 1851 UTC and 2049 UTC, indicating both got restarted several times during that timeframe. Only for backend-01 did we additionally get, at 2045 UTC:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 174, in _new_conn
conn = connection.create_connection(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 96, in create_connection
raise err
File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 86, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 704, in urlopen
httplib_response = self._make_request(
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 387, in _make_request
self._validate_conn(conn)
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 1045, in _validate_conn
conn.connect()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 358, in connect
self.sock = conn = self._new_conn()
^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 186, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f98cb9a8b90>: Failed to establish a new connection: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 489, in send
resp = conn.urlopen(
^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 788, in urlopen
retries = retries.increment(
^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='onionoo.torproject.org', port=443): Max retries exceeded with url: /details (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f98cb9a8b90>: Failed to establish a new connection: [Errno 111] Connection refused'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/srv/onionoo.torproject.org/onionoo/monitoring/network/onionoo", line 28, in <module>
details = json.loads(requests.get("https://onionoo.torproject.org/details").text)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/requests/api.py", line 73, in get
return request("get", url, params=params, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 565, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='onionoo.torproject.org', port=443): Max retries exceeded with url: /details (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f98cb9a8b90>: Failed to establish a new connection: [Errno 111] Connection refused'))
We had Onionoo issues reported as early as
Service: network service - onionoo backend
Host: onionoo-backend-02
Address: 116.202.120.178
State: CRITICAL
Date/Time: Tue Jan 9 19:57:43 UTC 2024
Additional Info:
CRITICAL: Either timestamp is too old (21660 sec): relays_published=2024-01-09 18:00:00, bridges_published=2024-01-09 13:56:38
which got recovered shortly before 2024-01-10 16:00:
Notification Type: RECOVERY
Service: network service - onionoo backend
Host: onionoo-backend-02
Address: 116.202.120.178
State: OK
Date/Time: Wed Jan 10 15:42:42 UTC 2024
Additional Info:
OK: Valid response with recent timestamps (6359 sec): relays_published=2024-01-10 14:00:00, bridges_published=2024-01-10 13:56:39
This goes for both of our Onionoo backends/frontends/caches and was caused by collector#40038 (closed).
Interestingly enough, though, we see more than 1000 bridges with 2024-01-10 16:26:39 as first_seen. Now, look at the time the bridge consensuses were made available:
-rw-r--r-- 1 thomas thomas 412987 10. Jan 14:09 20240110-002639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412930 10. Jan 14:09 20240110-005639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 413259 10. Jan 14:09 20240110-012639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 413412 10. Jan 14:09 20240110-015639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 413244 10. Jan 14:09 20240110-022639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 413279 10. Jan 14:09 20240110-025639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 413269 10. Jan 14:09 20240110-032639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 413258 10. Jan 14:09 20240110-035639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412927 10. Jan 14:09 20240110-042639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412777 10. Jan 14:09 20240110-045639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412770 10. Jan 14:09 20240110-052639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412618 10. Jan 14:09 20240110-055639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412467 10. Jan 14:09 20240110-062639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412763 10. Jan 14:09 20240110-065639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412613 10. Jan 14:09 20240110-072639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412750 10. Jan 14:09 20240110-075639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412905 10. Jan 14:09 20240110-082639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412937 10. Jan 14:09 20240110-085639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412615 10. Jan 14:09 20240110-092639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412608 10. Jan 14:09 20240110-095639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412456 10. Jan 14:09 20240110-102639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412650 10. Jan 14:09 20240110-105639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412657 10. Jan 14:09 20240110-112639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412804 10. Jan 14:10 20240110-115639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412953 10. Jan 14:09 20240110-122639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 413096 10. Jan 14:10 20240110-125639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412943 10. Jan 14:09 20240110-132639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 413241 10. Jan 14:09 20240110-135639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412967 10. Jan 15:09 20240110-142639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 413000 10. Jan 15:09 20240110-145639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412825 10. Jan 16:09 20240110-152639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412960 10. Jan 16:09 20240110-155639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412958 10. Jan 17:09 20240110-162639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412966 10. Jan 17:09 20240110-165639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412457 10. Jan 18:09 20240110-172639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412452 10. Jan 18:09 20240110-175639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412524 10. Jan 19:09 20240110-182640-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412725 10. Jan 19:09 20240110-185639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 413046 10. Jan 22:09 20240110-192639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412728 10. Jan 22:09 20240110-195640-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412376 10. Jan 22:09 20240110-202640-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412163 10. Jan 22:09 20240110-205640-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 412008 10. Jan 22:09 20240110-212639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 411701 10. Jan 22:09 20240110-215640-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 411505 11. Jan 02:09 20240110-222640-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 411870 11. Jan 02:09 20240110-225639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 411539 11. Jan 02:09 20240110-232639-BA44A889E64B93FAA2B114E02C2A279A8555C533
-rw-r--r-- 1 thomas thomas 411423 11. Jan 02:09 20240110-235640-BA44A889E64B93FAA2B114E02C2A279A8555C533
The mtime is 2024-01-10 14:09:00 for all consensuses on that day until 20240110-142639-BA44A889E64B93FAA2B114E02C2A279A8555C533.
Both backends are busted, differently, though. :(
Some thoughts I had while looking around a bit:
12:36 <GeKo> i think we are hitting partially at least
https://gitlab.torproject.org/tpo/network-health/metrics/onionoo/-/issues/40018
12:38 <GeKo> e.g. there is no 2024-01-10-15-00-00-consensus
12:38 <GeKo> and we have a lot of relays with first seen 2024-01-10 16:00:00
12:39 <GeKo> so, i think those onionoo issues might be related to the recent issues
we had for creating a consensus for relays
If you don't have a surefire way to determine whether the data in a backend is correct or not, would it help if I gave you my relay hash to use as one of the indicators? Just not in a public ticket...
Meanwhile, I found a few anomalous responses:
2024-02-02T15:02:25+0200 49.12.57.137 2024-01-29 19:00:00 onionoo-frontend-02(be_onionoo_backend_01) 116.202.120.171 2024-01-29 19:00:00 onionoo-frontend-01(be_onionoo_backend_01)
2024-02-02T15:03:26+0200 49.12.57.137 2024-01-29 19:00:00 onionoo-frontend-02(be_onionoo_backend_02) 116.202.120.171 2024-01-10 16:00:00 onionoo-frontend-01(be_onionoo_backend_02)
2024-02-02T15:04:26+0200 49.12.57.137 2024-01-29 19:00:00 onionoo-frontend-02(be_onionoo_backend_02) 116.202.120.171 2024-01-10 16:00:00 onionoo-frontend-01(be_onionoo_backend_02)
2024-02-02T15:05:27+0200 49.12.57.137 2024-01-29 19:00:00 onionoo-frontend-02(be_onionoo_backend_02) 116.202.120.171 2024-01-10 16:00:00 onionoo-frontend-01(be_onionoo_backend_02)
2024-02-02T15:06:28+0200 49.12.57.137 2024-01-29 19:00:00 onionoo-frontend-02(be_onionoo_backend_02) 116.202.120.171 2024-01-10 16:00:00 onionoo-frontend-01(be_onionoo_backend_02)
2024-02-02T15:07:29+0200 49.12.57.137 2024-01-29 19:00:00 onionoo-frontend-02(be_onionoo_backend_02) 116.202.120.171 2024-01-10 16:00:00 onionoo-frontend-01(be_onionoo_backend_02)
2024-02-02T15:08:29+0200 49.12.57.137 2024-01-29 19:00:00 onionoo-frontend-02(be_onionoo_backend_01) 116.202.120.171 2024-01-29 19:00:00 onionoo-frontend-01(be_onionoo_backend_01)
It seems like be_onionoo_backend_02 differs at times between onionoo-frontend-02 and onionoo-frontend-01. Perhaps it's a cached response, I don't know, but that's what it looks like from the outside. You could verify it by querying the same thing directly from the backends, to avoid the frontend caches.
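For anyone reading the poll lines above: the column layout is inferred from the pasted lines themselves (poll timestamp, then per frontend its IP, the relay's first_seen value, and frontend(backend)), not from any documented format. A small sketch to pull them apart and spot disagreements:

```python
import re

# One poll line: timestamp, then two (ip, first_seen, frontend(backend)) triples.
LINE = re.compile(
    r"(\S+) "                                          # poll timestamp
    r"(\S+) (\d{4}-\d\d-\d\d \d\d:\d\d:\d\d) (\S+) "   # first frontend triple
    r"(\S+) (\d{4}-\d\d-\d\d \d\d:\d\d:\d\d) (\S+)"    # second frontend triple
)


def first_seen_values(line: str) -> dict:
    """Map each frontend(backend) label to the first_seen value it returned."""
    m = LINE.match(line)
    assert m, "unexpected log format"
    return {m.group(4): m.group(3), m.group(7): m.group(6)}


sample = ("2024-01-30T15:44:52+0200 49.12.57.137 2024-01-29 19:00:00 "
          "onionoo-frontend-02(be_onionoo_backend_01) 116.202.120.171 "
          "2024-01-10 16:00:00 onionoo-frontend-01(be_onionoo_backend_02)")
vals = first_seen_values(sample)
print(len(set(vals.values())) > 1)  # True: the two answers disagree
```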
If these (backend) headers mean anything, then I can only say that for me, asking about my relay, they both look wrong.
2024-02-02T09:06:31+0200 49.12.57.137 2024-01-29 19:00:00 onionoo-frontend-02(be_onionoo_backend_02) 116.202.120.171 2024-01-29 19:00:00 onionoo-frontend-01(be_onionoo_backend_01)
2024-02-02T09:07:31+0200 49.12.57.137 2024-01-29 19:00:00 onionoo-frontend-02(be_onionoo_backend_02) 116.202.120.171 2024-01-29 19:00:00 onionoo-frontend-01(be_onionoo_backend_01)
2024-02-02T09:08:32+0200 49.12.57.137 2024-01-29 19:00:00 onionoo-frontend-02(be_onionoo_backend_01) 116.202.120.171 2024-01-29 19:00:00 onionoo-frontend-01(be_onionoo_backend_01)
2024-02-02T09:09:33+0200 49.12.57.137 2024-01-29 19:00:00 onionoo-frontend-02(be_onionoo_backend_01) 116.202.120.171 2024-01-29 19:00:00 onionoo-frontend-01(be_onionoo_backend_01)
2024-02-02T09:10:33+0200 49.12.57.137 2024-01-29 19:00:00 onionoo-frontend-02(be_onionoo_backend_01) 116.202.120.171 2024-01-29 19:00:00 onionoo-frontend-01(be_onionoo_backend_01)
It seems the data is busted on onionoo-backend-02 and not on 01. We have a proposal to reduce this architecture to just one backend to begin with (tpo/tpa/team#41512) and then see whether the issue repeats.
... and now they are consistently incorrect - 2024-01-29 19:00:00
2024-01-30T14:59:29+0200 49.12.57.137 2024-01-29 19:00:00 onionoo-frontend-02(be_onionoo_backend_01) 116.202.120.171 2024-01-29 19:00:00 onionoo-frontend-01(be_onionoo_backend_01)
2024-01-30T15:00:29+0200 49.12.57.137 2024-01-29 19:00:00 onionoo-frontend-02(be_onionoo_backend_02) 116.202.120.171 2024-01-29 19:00:00 onionoo-frontend-01(be_onionoo_backend_02)
Back to "inconsistently incorrect" again...
2024-01-30T15:43:52+0200 49.12.57.137 2024-01-29 19:00:00 onionoo-frontend-02(be_onionoo_backend_01) 116.202.120.171 2024-01-29 19:00:00 onionoo-frontend-01(be_onionoo_backend_01)
2024-01-30T15:44:52+0200 49.12.57.137 2024-01-29 19:00:00 onionoo-frontend-02(be_onionoo_backend_01) 116.202.120.171 2024-01-10 16:00:00 onionoo-frontend-01(be_onionoo_backend_02)
A weird change happened: previously, the correctly behaving 'be_onionoo_backend_01' gave me a first_seen value for my relay of 2015-04-15 23:00:00, and the incorrect 'be_onionoo_backend_02' gave 2024-01-10 16:00:00.
Now, at some point, 'be_onionoo_backend_01' went to 2024-01-29 19:00:00...
2024-01-29T22:30:59+0200 49.12.57.137 2024-01-10 16:00:00 onionoo-frontend-02(be_onionoo_backend_02) 116.202.120.171 2024-01-29 19:00:00 onionoo-frontend-01(be_onionoo_backend_01)
2024-01-29T22:31:59+0200 49.12.57.137 2024-01-29 19:00:00 onionoo-frontend-02(be_onionoo_backend_01) 116.202.120.171 2024-01-29 19:00:00 onionoo-frontend-01(be_onionoo_backend_01)
So, now they both misbehave. And in different ways...