sbws AsyncResults have no timeout
After sbws queues an AsyncResult, it will wait forever for the result to be ready:
https://github.com/torproject/sbws/blob/ee64d76df54ceb3a3c9e1e2a797fd70d68bb0035/sbws/core/scanner.py#L359-L364
If at least one result hangs, then sbws will hang, because AsyncResult.ready()
does not have a timeout.
Instead, sbws should call AsyncResult.wait([timeout]) on each result, after calling pool.apply_async() on a large number of results.
See https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.AsyncResult.wait
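The fix described above can be sketched as follows. This is a minimal illustration, not sbws's actual scanner code: it uses a ThreadPool (rather than a process Pool) so it runs anywhere, and task() is a stand-in for a relay measurement that may hang.

```python
import time
from multiprocessing.pool import ThreadPool

def task(seconds):
    """Stand-in for a relay measurement that may take too long."""
    time.sleep(seconds)
    return seconds

pool = ThreadPool(2)
# Queue results with apply_async()...
results = [pool.apply_async(task, (t,)) for t in (0.1, 2)]

# ...then wait on each with a timeout, instead of polling ready() forever.
collected = []
for r in results:
    r.wait(timeout=0.5)         # returns after 0.5s even if the task hung
    if r.ready():
        collected.append(r.get())
    else:
        collected.append(None)  # give up on this measurement
pool.terminate()
print(collected)                # the hung task is reported as None
```

With this pattern, one hung measurement costs at most the timeout instead of blocking the whole scanner.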
Activity
- teor changed milestone to %sbws: 1.1.x-final in legacy/trac
Trac:
Parent Ticket: legacy/trac#28663 (moved)
- teor added component::core tor/sbws, milestone::sbws: 1.1.x-final, owner::juga, parent::28663, points::1, priority::medium, resolution::fixed, reviewer::nickm, severity::normal, status::closed, type::defect, and version::sbws 1.0.2 labels in Legacy / Trac
This ticket probably needs to be done at the same time as legacy/trac#28865 (moved).
Replying to teor:

> After sbws queues an AsyncResult, it will wait forever for the result to be ready: https://github.com/torproject/sbws/blob/ee64d76df54ceb3a3c9e1e2a797fd70d68bb0035/sbws/core/scanner.py#L359-L364
> If at least one result hangs, then sbws will hang, because AsyncResult.ready() does not have a timeout.

In theory this won't happen, since both circuits and requests have a timeout. Also, the sleep happens in the main thread, not in the thread that is getting the results.

> Instead, sbws should call AsyncResult.wait([timeout]) on each result, after calling pool.apply_async() on a large number of results. See https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.AsyncResult.wait

I'll try this anyway.
I looked more at this. wait is a method of an event (https://github.com/python/cpython/blob/master/Lib/multiprocessing/pool.py#L662), and it blocks (https://docs.python.org/3/library/threading.html?highlight=threading#threading.Event.wait). sleep is not blocking, because it happens in the main thread. Probably the cause (or at least the main one) is legacy/trac#28897 (moved).

Replying to juga:

> I looked more at this. wait is a method of an event (https://github.com/python/cpython/blob/master/Lib/multiprocessing/pool.py#L662), and it blocks (https://docs.python.org/3/library/threading.html?highlight=threading#threading.Event.wait)

...a maximum of timeout. get with a timeout uses wait, and that's what we want, since pending results that triggered neither the callback nor the error callback have not been gotten, and get gives us either the exception or the value.

After merging legacy/trac#28932 (moved), this is an improvement and can wait for the 1.1 milestone.
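To illustrate the point about get: with a timeout it blocks at most that long (it uses wait internally) and then raises multiprocessing.TimeoutError if the result is still pending, so the caller can tell a hung measurement apart from one that produced a value or an exception. A sketch, not sbws code; measure() is a hypothetical stand-in:

```python
import multiprocessing
import time
from multiprocessing.pool import ThreadPool

def measure(seconds):
    """Hypothetical stand-in for a measurement task."""
    time.sleep(seconds)
    return seconds

pool = ThreadPool(1)
result = pool.apply_async(measure, (2,))   # simulates a slow measurement
try:
    value = result.get(timeout=0.2)        # waits at most 0.2s, then raises
except multiprocessing.TimeoutError:
    value = None                           # still pending: no value, no exception yet
pool.terminate()
print(value)
```

If the task had finished in time, get would instead return its value, or re-raise the exception the task raised.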
https://github.com/torproject/sbws/pull/341
Trac:
Status: new to needs_review
Milestone: sbws: 1.0.x-final to sbws: 1.1.x-final
Cc: juga, teor to N/A

Because of legacy/trac#28865 (moved), this needs to be re-thought, so as not to make it even more complicated.
Trac:
Status: needs_review to needs_revision

Solving also legacy/trac#28865 (moved): https://github.com/torproject/sbws/pull/342. Moving to the 1.1 milestone, since legacy/trac#28932 (moved) checks progress and logs errors.
Trac:
Status: needs_revision to needs_review

Trac:
Reviewer: N/A to nickm

This code looks reasonable to me. If you have tested it and it works for you, then I'd say "merge it!" :)
For my own curiosity: why is the time cutoff set to "the number of relays to measure multiplied by TIMEOUT_MEASUREMENTS (around 90mins)"?
Trac:
Status: needs_review to merge_ready

Replying to nickm:

> For my own curiosity: why is the time cutoff set to "the number of relays to measure multiplied by TIMEOUT_MEASUREMENTS (around 90mins)"?
Well spotted. Actually, when measuring 300 relays it takes around 90 min. If a relay doesn't get measured in 3 min, there's no need to wait for each of the 300 relays to take 3 min; the loop can just finish when no relay has been measured in that time. I committed a fixup to do that, tested the whole loop in the public network, and it works.
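The "finish when no relay has been measured recently" idea could look roughly like this. A hypothetical sketch: wait_for_results is an illustrative helper, not sbws's actual code, and the timeout value is shrunk for demonstration (the real TIMEOUT_MEASUREMENTS is about 3 minutes).

```python
import time
from multiprocessing.pool import ThreadPool

# Tiny value for demonstration only.
TIMEOUT_MEASUREMENTS = 0.5

def wait_for_results(pending, timeout):
    """Collect finished AsyncResults, giving up once no result has
    completed within `timeout` seconds, instead of waiting up to
    len(relays) * timeout overall."""
    last_progress = time.monotonic()
    done = []
    while pending and time.monotonic() - last_progress < timeout:
        still_pending = []
        for r in pending:
            if r.ready():
                done.append(r)
                last_progress = time.monotonic()  # progress resets the clock
            else:
                still_pending.append(r)
        pending = still_pending
        time.sleep(0.05)
    return done, pending

pool = ThreadPool(2)
fast = pool.apply_async(time.sleep, (0.1,))  # finishes quickly
hung = pool.apply_async(time.sleep, (10,))   # simulates a hung measurement
done, pending = wait_for_results([fast, hung], TIMEOUT_MEASUREMENTS)
pool.terminate()
print(len(done), len(pending))               # one done, one abandoned
```

As long as measurements keep completing, the loop keeps running; it only gives up after a full quiet period, which matches the fixup described above.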
- Trac closed
- Trac changed time estimate to 8h
- teor mentioned in issue legacy/trac#28865 (moved)
- Trac moved from legacy/trac#28864 (moved)
- Trac removed 1 deleted label