Stop running destination usability tests twice

Trac:
Parent Ticket: #28663 (moved)

added component::core tor/sbws milestone::sbws: 1.0.x-final owner::juga parent::28663 priority::medium resolution::fixed reviewer::teor severity::normal status::closed type::defect version::sbws 1.0.2 labels

I think this is what makes sbws stall. A backtrace [0] taken at the moment it stalled shows two threads trying to get the next destination. The code takes a lock at https://github.com/torproject/sbws/blob/ee64d76df54ceb3a3c9e1e2a797fd70d68bb0035/sbws/lib/destination.py#L248 and enters a while True loop. If it then enters _perform_usability_test, it locks again and calls _is_usable, which calls connect_destination_over_circuit, which locks yet again. If the test fails, it sleeps. connect_to_destination_over_circuit often fails several times in a row, because connecting through Tor and random relays is not reliable.

If _usable_dests were not overwritten every time a destination is chosen, a lock would only be needed in connect_destination_over_circuit. It would be better to refactor all this code so that it is easier to trace and debug. As a temporary solution, the "usability" could be tracked outside the class and without locks. For now I'd just disable the usability check and prioritize the refactoring ticket as soon as 1.0 is done.

[0] https://paste.debian.net/hidden/82fc9ec0/ Edit: add link
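To illustrate the stall described above, here is a minimal sketch, not the sbws code. The class, its method bodies, the use of a re-entrant lock, and the retry and sleep values are all assumptions made for the example; only the method names echo the ones mentioned in this ticket. It shows how a thread that keeps the lock while retrying and sleeping makes a second thread wait for the whole retry loop.

```python
import threading
import time


class DestinationList:
    """Hypothetical stand-in for the destination list, not the real sbws class."""

    def __init__(self, retry_sleep=0.05, attempts=3):
        # Assumption: a re-entrant lock, so the same thread can acquire it
        # several times in nested calls without deadlocking itself.
        self._lock = threading.RLock()
        self._retry_sleep = retry_sleep
        self._attempts = attempts

    def _is_usable(self):
        with self._lock:  # third nested acquisition by the same thread
            return False  # simulate an unreliable connection attempt

    def _perform_usability_test(self):
        with self._lock:  # second nested acquisition
            for _ in range(self._attempts):
                if self._is_usable():
                    return True
                # The thread sleeps while still holding the lock.
                time.sleep(self._retry_sleep)
            return False

    def next(self):
        with self._lock:  # first acquisition
            self._perform_usability_test()
            return "dest"


def time_two_callers(dests):
    """Run two threads calling next() and return how long each one took."""
    waits = []

    def worker():
        start = time.monotonic()
        dests.next()
        waits.append(time.monotonic() - start)

    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(waits)
```

Calling time_two_callers(DestinationList()) should show the slower thread taking roughly twice as long as the faster one, since it cannot start its own test until the first thread has finished all of its retries.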

Trac:
Status: assigned to needs_review

On second thought, a big refactor is not needed to check whether a destination is "usable". Every time a destination is used (in connect_destination_over_circuit) and it fails, the failure can be recorded there, since there is already a lock. Moving this to needs_revision to implement that before review.
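A rough sketch of that idea, under stated assumptions: the Destination class, its failed attribute, the simplified function signature, the connect callable and usable_destinations are all hypothetical and only illustrate recording the result at the single place that already holds a lock. The actual implementation in the PR may differ.

```python
import threading


class Destination:
    """Hypothetical destination that remembers whether its last use failed."""

    def __init__(self, url):
        self.url = url
        self.failed = False


# Assumption: one lock shared by every connection attempt.
_lock = threading.RLock()


def connect_destination_over_circuit(dest, connect):
    """Try to use `dest` and record the outcome on the destination itself.

    `connect` is a hypothetical callable that takes a URL and returns True
    on success; the real function has a different signature.
    """
    with _lock:
        try:
            ok = bool(connect(dest.url))
        except OSError:
            ok = False
        # Record the result here, so no separate usability test is needed.
        dest.failed = not ok
        return ok


def usable_destinations(dests):
    """Return the destinations whose most recent use did not fail."""
    return [d for d in dests if not d.failed]
```

With this shape, choosing the next destination only reads the recorded flag and never has to open a connection or sleep while holding a lock.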

Trac:
Status: needs_review to needs_revision

Implemented https://github.com/torproject/sbws/pull/320 Edit: add the GH PR

Trac:
Status: needs_revision to needs_review

Trac:
Reviewer: N/A to teor

Reminder: when merging into master, add additional commit or fixup to change circuit_id = cb.build_circuit(circuit_path) into circuit_id, _ = cb.build_circuit(circuit_path) [0], as #28736 (moved) changed build_circuit to return a tuple.

[0] https://github.com/torproject/sbws/pull/320/commits/b0aaf5d806276bfa946422d016da8506cc48a209#diff-9714c4c15b47a818a1e2537f9e1cd6f2R36 and https://github.com/torproject/sbws/pull/320/commits/b0aaf5d806276bfa946422d016da8506cc48a209#diff-9714c4c15b47a818a1e2537f9e1cd6f2R57
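For reference, a small sketch of why the call sites need the change. The build_circuit below is a hypothetical stand-in, and the meaning of the second tuple element (an error message) is an assumption; the ticket only says that the function now returns a tuple.

```python
def build_circuit(path):
    """Hypothetical stand-in returning (circuit_id, error_message)."""
    if not path:
        return None, "empty path"
    return "7", None


circuit_path = ["relay1", "relay2"]

# Old call site: the name is now bound to the whole tuple, not the ID.
old_style = build_circuit(circuit_path)

# Updated call site: unpack the tuple and ignore the second element.
circuit_id, _ = build_circuit(circuit_path)
```

Without the unpacking, any code that later passes circuit_id along as an ID would receive a tuple instead.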

Merged this into a branch with current master and the current needs_review tickets, and tested a whole loop against the public Tor network.

Replying to juga:

I reviewed the pull request. I suggested some design changes to help sbws recover better from temporary failures.

Can you please do a rebase or merge before the next review, so I can review the code that will be merged to master?

Thanks!

Trac:
Status: needs_review to needs_revision

I applied the suggested changes that do not affect the design, created #29589 (moved) for the design changes, and opened https://github.com/torproject/sbws/pull/339, which squashes the fixups, rebases onto master, and adapts to the new changes in master.

Trac:
Status: needs_revision to needs_review

Thanks, the fixups and squashed PR seem fine to me.

Trac:
Status: needs_review to merge_ready

Thanks, merged.

Trac:
Status: merge_ready to closed
Resolution: N/A to fixed

mentioned in issue #28933 (moved)

mentioned in issue #29589 (moved)

moved to tpo/network-health/sbws#28897 (closed)

mentioned in issue tpo/network-health/sbws#29589 (closed)
