Error decoding state.dat because it is malformed
Traceback:
Oct 03 20:32:31 INFO MainThread scanner.py:606 - wait_for_results - Pending measurements: 0 out of 373:
Oct 03 20:35:31 DEBUG MainThread filelock.py:17 - __enter__ - Going to lock /var/lib/sbws/state.dat.lockfile
Oct 03 20:35:31 DEBUG MainThread filelock.py:23 - __enter__ - Received lock /var/lib/sbws/state.dat.lockfile
Oct 03 20:35:31 DEBUG MainThread filelock.py:27 - __exit__ - Releasing lock /var/lib/sbws/state.dat.lockfile
Oct 03 20:35:31 CRITICAL MainThread scanner.py:719 - run_speedtest - Something went wrong.
Please create an issue at
https://gitlab.torproject.org/tpo/network-health/sbws/-/issues with this
traceback.
Oct 03 20:35:31 ERROR MainThread scanner.py:720 - run_speedtest - Unterminated string starting at: line 26312 column 9 (char 811003)
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/sbws-1.1.0+155.g4b29a49-py3.7.egg/sbws/core/scanner.py", line 712, in run_speedtest
main_loop(args, conf, controller, rl, cb, rd, rp, destinations, pool)
File "/usr/local/lib/python3.7/dist-packages/sbws-1.1.0+155.g4b29a49-py3.7.egg/sbws/core/scanner.py", line 549, in main_loop
hbeat.print_heartbeat_message()
File "/usr/local/lib/python3.7/dist-packages/sbws-1.1.0+155.g4b29a49-py3.7.egg/sbws/lib/heartbeat.py", line 48, in print_heartbeat_message
loops_count = self.state_dict.count('recent_priority_list')
File "/usr/local/lib/python3.7/dist-packages/sbws-1.1.0+155.g4b29a49-py3.7.egg/sbws/util/state.py", line 103, in count
if self.get(k):
File "/usr/local/lib/python3.7/dist-packages/sbws-1.1.0+155.g4b29a49-py3.7.egg/sbws/util/state.py", line 69, in get
self._state = self._read()
File "/usr/local/lib/python3.7/dist-packages/sbws-1.1.0+155.g4b29a49-py3.7.egg/sbws/util/state.py", line 53, in _read
return json.load(fd, cls=CustomDecoder)
File "/usr/lib/python3.7/json/__init__.py", line 296, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "/usr/lib/python3.7/json/__init__.py", line 361, in loads
return cls(**kw).decode(s)
File "/usr/local/lib/python3.7/dist-packages/sbws-1.1.0+155.g4b29a49-py3.7.egg/sbws/util/json.py", line 24, in decode
decoded = super().decode(s, **kwargs)
File "/usr/lib/python3.7/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib/python3.7/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Unterminated string starting at: line 26312 column 9 (char 811003)
Oct 03 20:35:31 DEBUG MainThread scanner.py:53 - stop_threads - Stopping sbws.
The last key in state.dat that is not properly closed is:
"2020-09-28T20:16:51",
And the last lines in that file are:
"2020-10-02T20:35:49",
"2020
So the last thread that wrote the JSON didn't finish for some reason.
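Assuming we keep JSON for now, a minimal sketch of a crash-safe write pattern that would prevent this (a suggestion, not what sbws currently does): write to a temporary file in the same directory, fsync it, then atomically replace state.dat, so a crashed or killed thread can never leave a truncated file behind.

    import json
    import os
    import tempfile

    def atomic_json_dump(obj, path):
        # Write next to the target so os.replace() stays on one filesystem.
        dir_ = os.path.dirname(path) or "."
        fd, tmp = tempfile.mkstemp(dir=dir_, suffix=".tmp")
        try:
            with os.fdopen(fd, "w") as f:
                json.dump(obj, f)
                f.flush()
                os.fsync(f.fileno())  # make sure the bytes hit the disk
            # Atomic on POSIX: readers see either the old or the new file,
            # never a half-written one.
            os.replace(tmp, path)
        except BaseException:
            os.unlink(tmp)
            raise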
The headers of the last generated bandwidth file:
1601753789
version=1.4.0
destinations_countries=ZZ
earliest_bandwidth=2020-09-28T19:37:15
file_created=2020-10-03T19:37:21
generator_started=2020-09-07T11:57:42
latest_bandwidth=2020-10-03T19:36:29
minimum_number_eligible_relays=4220
minimum_percent_eligible_relays=60
number_consensus_relays=7034
number_eligible_relays=7207
percent_eligible_relays=102
recent_consensus_count=120
recent_measurement_attempt_count=31909
recent_measurement_failure_count=68
recent_measurements_excluded_error_count=521
recent_measurements_excluded_few_count=0
recent_measurements_excluded_near_count=0
recent_measurements_excluded_old_count=0
recent_priority_list_count=88
recent_priority_relay_count=31909
scanner_country=US
software=sbws
software_version=1.1.0+155.g4b29a49
time_to_report_half_network=210793
tor_version=0.4.2.
So recent_measurement_attempt_count was already at 31909, which means there were 31909 JSON entries for the recent measurement attempts.
Maybe the server ran out of memory (4 GB)? If so, should that have caused another exception on writing, before the failure on reading? It seems that what I was afraid of with using JSON has finally happened: a failure to handle it.
To check whether that's the problem, we could log the memory in use with psutil (psutil.virtual_memory().percent).
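A sketch of that check, assuming psutil is installed (log_memory_usage is a hypothetical helper name; the call site would be somewhere in the scanner's main loop):

    import logging
    import psutil

    log = logging.getLogger(__name__)

    def log_memory_usage():
        # psutil.virtual_memory() returns system-wide memory stats;
        # .percent is the share of total memory currently in use.
        mem = psutil.virtual_memory()
        log.debug("Memory used: %s%% of %d bytes total", mem.percent, mem.total)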
Until we use a database, an easy option is to not store the attempts and to report the count as 0, even though that would not follow the definition in the spec.