Recursion limit problem in TaskManager

This issue was automatically migrated from github issue https://github.com/TheTorProject/ooni-probe/issues/296.

The task manager is currently designed to recursively call _fillSlots, which calls _run, which in turn calls _fillSlots again on success or failure. This means that when many tasks fail very quickly, the default Python recursion limit (1000) is very likely to be exceeded.
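The call pattern can be illustrated with a toy manager. This is only a minimal sketch, not the ooni-probe code: the class name RecursiveManager, the concurrency parameter, and the convention that a task is a callable returning True on success are all assumptions made for the example. Only the method names _fillSlots and _run come from the description above.

```python
class RecursiveManager(object):
    """Toy stand-in for the recursive pattern (hypothetical, not ooni-probe code)."""

    def __init__(self, tasks, concurrency=1):
        self.pending = list(tasks)
        self.concurrency = concurrency
        self.active = 0
        self.failures = 0

    def _fillSlots(self):
        # Start tasks while there is a free slot.
        while self.pending and self.active < self.concurrency:
            self.active += 1
            self._run(self.pending.pop())

    def _run(self, task):
        # The task finishes synchronously here, which mimics a task
        # that fails "very quickly".
        if not task():
            self.failures += 1
        self.active -= 1
        # Success or failure, we re-enter _fillSlots from inside _run,
        # so every finished task leaves two more frames on the stack.
        self._fillSlots()


def always_fails():
    return False


if __name__ == "__main__":
    manager = RecursiveManager([always_fails] * 5000)
    try:
        manager._fillSlots()
    except RuntimeError as exc:  # RecursionError subclasses RuntimeError on Python 3
        print("scheduler died: %s" % exc)
```

With the default limit, a few hundred immediately failing tasks are already enough to overflow in this sketch, because each one adds a _fillSlots frame and a _run frame.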

To reproduce this bug, try running a test with a long invalid input, for example http_requests:

ooniprobe blocking/http_requests -f data/complete.deck

Note that it is correct for this test to fail; however, it fails in a surprising manner:

Unhandled error in Deferred:
Unhandled Error
Traceback (most recent call last):
  File "/ooni-probe/ooni/managers.py", line 153, in _failed
    super(LinkedTaskManager, self)._failed(result, task)
  File "/ooni-probe/ooni/managers.py", line 44, in _failed
    task.done.errback(failure)
  File "/.virtualenvs/ooni-probe/lib/python2.7/site-packages/twisted/internet/defer.py", line 423, in errback
    self._startRunCallbacks(fail)
  File "/.virtualenvs/ooni-probe/lib/python2.7/site-packages/twisted/internet/defer.py", line 490, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "/.virtualenvs/ooni-probe/lib/python2.7/site-packages/twisted/internet/defer.py", line 577, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/ooni-probe/ooni/director.py", line 188, in measurementFailed
    log.msg("Failed doing measurement: %s" % measurement)
  File "/ooni-probe/ooni/utils/log.py", line 62, in msg
    print "%s" % msg
  File "/.virtualenvs/ooni-probe/lib/python2.7/site-packages/twisted/python/log.py", line 505, in write
    msg(message, printed=1, isError=self.isError)
  File "/.virtualenvs/ooni-probe/lib/python2.7/site-packages/twisted/python/threadable.py", line 53, in sync
    return function(self, *args, **kwargs)
  File "/.virtualenvs/ooni-probe/lib/python2.7/site-packages/twisted/python/log.py", line 185, in msg
    actualEventDict = (context.get(ILogContext) or {}).copy()
  File "/.virtualenvs/ooni-probe/lib/python2.7/site-packages/twisted/python/context.py", line 121, in getContext
    return self.currentContext().getContext(key, default)
exceptions.RuntimeError: maximum recursion depth exceeded

I think this bug is a good opportunity to discuss a possible refactoring of the task scheduler code. It may be a good idea to draw some inspiration from txrdq: https://github.com/terrycojones/txrdq
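As one possible starting point for that discussion, the recursion can be avoided with a simple re-entrancy guard, so that a nested call to _fillSlots returns immediately and the outer loop picks up the next task. This is a hedged sketch under the same toy assumptions as the bug description (tasks are callables returning True on success; IterativeManager and the _filling flag are hypothetical names). It is not how txrdq works and not necessarily the right fix for ooni-probe.

```python
class IterativeManager(object):
    """Toy scheduler with a re-entrancy guard (hypothetical sketch)."""

    def __init__(self, tasks, concurrency=1):
        self.pending = list(tasks)
        self.concurrency = concurrency
        self.active = 0
        self.failures = 0
        self._filling = False  # True while a _fillSlots loop is already running

    def _fillSlots(self):
        if self._filling:
            # Called from inside _run while the outer loop is active:
            # return and let that loop start the next task instead.
            return
        self._filling = True
        try:
            while self.pending and self.active < self.concurrency:
                self.active += 1
                self._run(self.pending.pop())
        finally:
            self._filling = False

    def _run(self, task):
        if not task():
            self.failures += 1
        self.active -= 1
        # Still safe to call unconditionally; the guard keeps the stack flat.
        self._fillSlots()


def always_fails():
    return False


if __name__ == "__main__":
    manager = IterativeManager([always_fails] * 5000)
    manager._fillSlots()
    print("failures: %d, pending: %d" % (manager.failures, len(manager.pending)))
```

In Twisted code, another common way to break this kind of recursion is to schedule the next _fillSlots call through the reactor, for example with reactor.callLater(0, ...), so that each call starts from a fresh stack. Which approach fits the existing managers best is an open question.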
