Some relay operators are starting to notice that we publish the bandwidth ratio from onbasca on their status page, and are asking why some bridges have it and some don't. Triggered by that, I added a visualization to grafana of how many bridges are functional according to bridgestrap but untested by onbasca. Out of 1914 obfs4 bridges, 289 are in that state, so 15% of obfs4 bridges.
It looks like onbasca is failing to test those 15% of bridges.
Let me know if you need logs or anything else to help you debug this problem.
@gk and I checked the logs and it turned out the bridge scanner was not running, because stem timed out launching tor.
It was solved by removing the ~/.onbrisca/tor dir and launching the scanner again. This has also happened to me with sbws, when I had an old tor directory. We don't know whether this is a C tor bug or a stem bug. For now we'll live with it, since we'll move to arti at some point.
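For reference, the manual workaround above could be automated roughly like this. It is only a sketch, not something onbasca does today: `launch_with_clean_retry` is a hypothetical helper, and `launch` stands for whatever callable starts tor. With stem that could be a wrapper around `stem.process.launch_tor_with_config`, which raises `OSError` when tor fails to start or times out.

```python
import shutil
from pathlib import Path


def launch_with_clean_retry(launch, data_dir):
    """Call launch(); if it fails with OSError, wipe data_dir and retry once.

    `launch` is any zero-argument callable that starts tor and raises
    OSError on failure or timeout (hypothetical; supplied by the caller).
    `data_dir` is tor's data directory, e.g. "~/.onbrisca/tor".
    """
    try:
        return launch()
    except OSError:
        # A stale data directory was the suspected cause, so remove it.
        # This also discards tor's cached state.
        shutil.rmtree(Path(data_dir).expanduser(), ignore_errors=True)
        # A second failure is not caught and propagates to the caller.
        return launch()
```

Retrying only once keeps a real tor or stem bug visible instead of looping forever.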
I'm leaving this issue open until checking in ~5 days whether the number of functional bridges without ratio has decreased or there's another bug.
bridgescan stopped again. Look what we got:
Jul 11 19:58:24 bridge_scanner[3117673]: <ERROR> (MainThread) bridge_scanner.py:204 - run - Unacceptable option value: Bridge line did not parse. See logs for details.
Traceback (most recent call last):
  File "/home/onbasca/onbasca/onbrisca/models/bridge_scanner.py", line 198, in run
    self.scan()
  File "/home/onbasca/onbasca/onbrisca/models/bridge_scanner.py", line 190, in scan
    self.scan_bridges()
  File "/home/onbasca/onbasca/onbrisca/models/bridge_scanner.py", line 80, in scan_bridges
    self.loop.run_until_complete(self._scan_bridges())
  File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/home/onbasca/onbasca/onbrisca/models/bridge_scanner.py", line 89, in _scan_bridges
    await sync_to_async(self.tor_control.set_bridgelines)(bridges)
  File "/home/onbasca/env/lib/python3.9/site-packages/asgiref-3.6.0-py3.9.egg/asgiref/sync.py", line 448, in __call__
    ret = await asyncio.wait_for(future, timeout=None)
  File "/usr/lib/python3.9/asyncio/tasks.py", line 442, in wait_for
    return await fut
  File "/usr/lib/python3.9/concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/onbasca/env/lib/python3.9/site-packages/asgiref-3.6.0-py3.9.egg/asgiref/sync.py", line 490, in thread_handler
    return func(*args, **kwargs)
  File "/home/onbasca/onbasca/onbrisca/bridge_torcontrol.py", line 45, in set_bridgelines
    self.controller.set_conf("Bridge", new_bridgelines)
  File "/usr/lib/python3/dist-packages/stem/control.py", line 2473, in set_conf
    self.set_options({param: value}, False)
  File "/usr/lib/python3/dist-packages/stem/control.py", line 2567, in set_options
    raise stem.InvalidRequest(response.code, response.message)
stem.InvalidRequest: Unacceptable option value: Bridge line did not parse. See logs for details.
The reason seems to be that tor can't parse a bridgeline like "[]:0". If it's in the database, it's because it passes the regular expressions used when obtaining the bridgelines via http (also in bridgestrap?).
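A stricter check than a regular expression could reject lines like "[]:0" before they reach the database. The following is a minimal sketch, not code from onbasca or bridgestrap: `has_valid_address` is a hypothetical helper, and it assumes a bridge line carries either an IPv4 `address:port` token or a bracketed IPv6 `[address]:port` token with a non-zero port.

```python
import ipaddress


def has_valid_address(bridgeline: str) -> bool:
    """Return True if some token in the line is a usable address:port pair."""
    for token in bridgeline.split():
        host, sep, port = token.rpartition(":")
        if not sep or not (port.isascii() and port.isdigit()):
            continue
        if not 1 <= int(port) <= 65535:
            continue
        try:
            if host.startswith("[") and host.endswith("]"):
                # Bracketed form: must contain a valid IPv6 address.
                ipaddress.IPv6Address(host[1:-1])
            else:
                ipaddress.IPv4Address(host)
        except ValueError:
            continue
        return True
    return False


print(has_valid_address("[]:0"))
print(has_valid_address("obfs4 192.0.2.1:443"))
```

Using the `ipaddress` module means an empty or malformed address fails to parse instead of slipping through a permissive pattern.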
I'm solving this by catching stem's exceptions trying to set bridgelines one by one.
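Roughly, the idea looks like the sketch below. It is not the actual patch: `set_bridgelines_tolerantly` and its `rejected` parameter are hypothetical names, and `controller` is anything with the same `set_conf(param, value)` method as stem's `Controller`. With stem, `rejected` would be `stem.InvalidRequest`, the exception in the traceback above.

```python
import logging

logger = logging.getLogger(__name__)


def set_bridgelines_tolerantly(controller, bridgelines, rejected=(Exception,)):
    """Set tor's Bridge option, skipping lines that tor refuses.

    Returns a tuple (accepted, refused) of bridge line lists.
    """
    bridgelines = list(bridgelines)
    try:
        # Fast path: tor accepts the whole batch.
        controller.set_conf("Bridge", bridgelines)
        return bridgelines, []
    except rejected:
        logger.warning("Batch refused, probing bridge lines one by one.")

    accepted, refused = [], []
    for line in bridgelines:
        try:
            # Probe a single line. Note that this temporarily replaces
            # the Bridge option with just this one line.
            controller.set_conf("Bridge", [line])
            accepted.append(line)
        except rejected as error:
            logger.warning("Skipping bridge line %r: %s", line, error)
            refused.append(line)

    if accepted:
        # Apply every line that tor accepted individually.
        controller.set_conf("Bridge", accepted)
    return accepted, refused
```

Trying the full batch first keeps the common case to a single control-port request; the per-line probing only runs when something in the batch is refused.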
I still see this issue. Currently there are 562 obfs4 bridges that bridgestrap says are functional but that onbasca keeps as untested, and this number seems fairly constant in the graphs. There are ~2000 functional obfs4 bridges, which means that more than 1 out of 4 bridges (about 28%) is failing to be tested by onbasca.
I don't think there is any urgency to fix it, as we distribute untested bridges anyway. But we might want to keep this open until we get around to working on onbasca.
I don't know, because there is only 1 year of metrics history in prometheus. But you can see here the trend of functional obfs4 bridges without ratio for the last year:
Not sure why there is a slow increase. The short dips might be restarts of rdsys, because at those moments there are no functional bridges.