The Tor Project / Network Health / sbws · Issues · #40020 (Closed)

Created Oct 06, 2020 by juga (Maintainer)

Error decoding state.dat because it is malformed

Traceback:

Oct 03 20:32:31 INFO MainThread scanner.py:606 - wait_for_results - Pending measurements: 0 out of 373: 
Oct 03 20:35:31 DEBUG MainThread filelock.py:17 - __enter__ - Going to lock /var/lib/sbws/state.dat.lockfile
Oct 03 20:35:31 DEBUG MainThread filelock.py:23 - __enter__ - Received lock /var/lib/sbws/state.dat.lockfile
Oct 03 20:35:31 DEBUG MainThread filelock.py:27 - __exit__ - Releasing lock /var/lib/sbws/state.dat.lockfile
Oct 03 20:35:31 CRITICAL MainThread scanner.py:719 - run_speedtest - Something went wrong.
Please create an issue at
https://gitlab.torproject.org/tpo/network-health/sbws/-/issues with this
traceback.
Oct 03 20:35:31 ERROR MainThread scanner.py:720 - run_speedtest - Unterminated string starting at: line 26312 column 9 (char 811003)
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/sbws-1.1.0+155.g4b29a49-py3.7.egg/sbws/core/scanner.py", line 712, in run_speedtest
    main_loop(args, conf, controller, rl, cb, rd, rp, destinations, pool)
  File "/usr/local/lib/python3.7/dist-packages/sbws-1.1.0+155.g4b29a49-py3.7.egg/sbws/core/scanner.py", line 549, in main_loop
    hbeat.print_heartbeat_message()
  File "/usr/local/lib/python3.7/dist-packages/sbws-1.1.0+155.g4b29a49-py3.7.egg/sbws/lib/heartbeat.py", line 48, in print_heartbeat_message
    loops_count = self.state_dict.count('recent_priority_list')
  File "/usr/local/lib/python3.7/dist-packages/sbws-1.1.0+155.g4b29a49-py3.7.egg/sbws/util/state.py", line 103, in count
    if self.get(k):
  File "/usr/local/lib/python3.7/dist-packages/sbws-1.1.0+155.g4b29a49-py3.7.egg/sbws/util/state.py", line 69, in get
    self._state = self._read()
  File "/usr/local/lib/python3.7/dist-packages/sbws-1.1.0+155.g4b29a49-py3.7.egg/sbws/util/state.py", line 53, in _read
    return json.load(fd, cls=CustomDecoder)
  File "/usr/lib/python3.7/json/__init__.py", line 296, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/lib/python3.7/json/__init__.py", line 361, in loads
    return cls(**kw).decode(s)
  File "/usr/local/lib/python3.7/dist-packages/sbws-1.1.0+155.g4b29a49-py3.7.egg/sbws/util/json.py", line 24, in decode
    decoded = super().decode(s, **kwargs)
  File "/usr/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.7/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Unterminated string starting at: line 26312 column 9 (char 811003)
Oct 03 20:35:31 DEBUG MainThread scanner.py:53 - stop_threads - Stopping sbws.

The last key in state.dat that was not properly terminated is:

        "2020-09-28T20:16:51",

And the last lines in that file are:

        "2020-10-02T20:35:49",
        "2020

So the last thread that wrote the JSON did not finish for some reason.
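One way to avoid ever ending up with a truncated state.dat would be an atomic write: dump the JSON to a temporary file in the same directory and rename it over the old one, so a crash or OOM mid-write never leaves a half-written file behind. A minimal sketch, assuming POSIX rename semantics (the helper name is illustrative, not current sbws code):

    import json
    import os
    import tempfile

    def atomic_json_write(path, data):
        # Create the temp file in the same directory so the final
        # os.replace() stays on one filesystem and is atomic.
        fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(path) or '.',
                                        suffix='.tmp')
        try:
            with os.fdopen(fd, 'w') as f:
                json.dump(data, f)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp_path, path)
        except BaseException:
            os.remove(tmp_path)
            raise

With this, a reader could still see stale state after a crash, but never a malformed file.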

The headers of the last bandwidth file generated:

1601753789
version=1.4.0
destinations_countries=ZZ
earliest_bandwidth=2020-09-28T19:37:15
file_created=2020-10-03T19:37:21
generator_started=2020-09-07T11:57:42
latest_bandwidth=2020-10-03T19:36:29
minimum_number_eligible_relays=4220
minimum_percent_eligible_relays=60
number_consensus_relays=7034
number_eligible_relays=7207
percent_eligible_relays=102
recent_consensus_count=120
recent_measurement_attempt_count=31909
recent_measurement_failure_count=68
recent_measurements_excluded_error_count=521
recent_measurements_excluded_few_count=0
recent_measurements_excluded_near_count=0
recent_measurements_excluded_old_count=0
recent_priority_list_count=88
recent_priority_relay_count=31909
scanner_country=US
software=sbws
software_version=1.1.0+155.g4b29a49
time_to_report_half_network=210793
tor_version=0.4.2.

So recent_measurement_attempt_count was already at 31909, which means there were 31909 JSON lines recording the recent measurement attempts.

Maybe the server ran out of memory (4 GB)? Should that have caused another exception on writing, before the read failed? It seems that what I was afraid of with using JSON has finally happened: a failure to handle it.

To check whether that's the problem, we could log the memory in use with psutil (psutil.virtual_memory().percent).
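For instance, a debug line like this could be added around the state writes (a sketch; sbws does not log this today):

    import logging
    import psutil

    log = logging.getLogger(__name__)

    def log_memory():
        # Percent of physical memory in use; values near 100 around the
        # time state.dat is written would support the OOM hypothesis.
        log.debug('Memory used: %s%%', psutil.virtual_memory().percent)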

Until we use a database, an easy option is to not store the attempts and to report the count as 0, even though that would not follow the definition in the spec.
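Sketched out, that workaround would amount to something like this (hypothetical names; the real header-building code lives elsewhere in sbws):

    # Hypothetical: do not read the stored attempts at all and report 0,
    # accepting the deviation from the bandwidth-file spec until the
    # state is kept in a proper database.
    header['recent_measurement_attempt_count'] = 0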

Edited Oct 06, 2020 by juga