Upgrading to 0.7.x is probably not a bad idea, but I am confused -- the scanner seems to be making progress:
{{
DEBUG[Sat Aug 27 10:45:39 2011]:Starting slice number 3
DEBUG[Sat Aug 27 11:11:49 2011]:Starting slice number 4
DEBUG[Sat Aug 27 12:15:31 2011]:Starting slice number 5
DEBUG[Sat Aug 27 13:19:29 2011]:Starting slice number 6
DEBUG[Sat Aug 27 14:26:42 2011]:Starting slice number 7
DEBUG[Sat Aug 27 15:04:15 2011]:Starting slice number 8
DEBUG[Sat Aug 27 15:36:56 2011]:Starting slice number 9
DEBUG[Sat Aug 27 16:23:38 2011]:Starting slice number 10
DEBUG[Sat Aug 27 16:49:48 2011]:Starting slice number 11
DEBUG[Sat Aug 27 17:47:58 2011]:Starting slice number 12
DEBUG[Sat Aug 27 18:29:50 2011]:Starting slice number 0
DEBUG[Sat Aug 27 19:43:41 2011]:Starting slice number 1
}}
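To put numbers on "making progress", here is a rough sketch that parses lines in the format above and reports how long each slice ran before the next one started. The regex and the function name `slice_durations` are my own for illustration, not part of the scanner, and it assumes an English locale for the weekday and month names.

```python
import re
from datetime import datetime

# Matches lines such as:
# DEBUG[Sat Aug 27 10:45:39 2011]:Starting slice number 3
LINE_RE = re.compile(r"^DEBUG\[(.+?)\]:Starting slice number (\d+)$")


def slice_durations(lines):
    """Return (slice_number, minutes_until_next_slice_started) pairs."""
    starts = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%a %b %d %H:%M:%S %Y")
            starts.append((int(match.group(2)), stamp))
    # Pair each start with the following one; the last slice has no end time yet.
    return [
        (number, (next_stamp - stamp).total_seconds() / 60.0)
        for (number, stamp), (_, next_stamp) in zip(starts, starts[1:])
    ]
```

Feeding it the excerpt above gives one duration per slice except the last, which makes it easier to spot a slice that takes unusually long.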
I also checked through the logs and did not see anything unusual; the scanner did not appear stalled. Do we have any other theory as to what is happening?
This could be caused by the new ratio grouping (legacy/trac#3444 (moved)). moria also occasionally fails to make progress, despite continuing to take measurements and not crashing.
Basically, grouping by ratio may cause higher churn in the slices, since it depends on both the measured value and the observed value. If nodes churn in and out of a slice faster than the scanner works through it, no progress will be made.
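As a toy illustration of that argument (not the scanner's actual logic), assume each step measures a fixed number of nodes in a slice and then a fixed number of unmeasured nodes get regrouped into it. The function name and all parameters here are hypothetical.

```python
def steps_to_finish(slice_size, measured_per_step, churned_per_step, max_steps=10000):
    """Return the steps needed to measure a whole slice, or None if it never finishes.

    Toy model: every step measures `measured_per_step` nodes, then
    `churned_per_step` unmeasured nodes are regrouped into the slice.
    """
    unmeasured = slice_size
    for step in range(1, max_steps + 1):
        unmeasured = max(0, unmeasured - measured_per_step)
        if unmeasured == 0:
            return step
        # Churn: newly regrouped nodes arrive unmeasured, capped at the slice size.
        unmeasured = min(slice_size, unmeasured + churned_per_step)
    return None
```

Under these assumptions a slice only completes when measurement outpaces churn, or when the slice is small enough to finish within a single step, which is one argument for smaller slices.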
I also flipped FetchDirInfoEarly and FetchDirInfoExtraEarly in c86c1e13. Since they will cause consensus fetches every hour, this may make the churn even worse.
We could make the slices smaller to ensure completion, but I think nothing currently ensures that we will get at least one exit node in a slice. Perhaps we should still make them smaller, and also fix the selection to ensure exits anyway...
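One possible shape for the "ensure exits anyway" fix, as a minimal sketch only: the `Node` tuple, the `make_slices` name, and the borrow-the-nearest-exit rule are all assumptions of mine, not what the scanner currently does.

```python
from collections import namedtuple

# Hypothetical node record; the real scanner's node objects look different.
Node = namedtuple("Node", ["name", "ratio", "is_exit"])


def make_slices(nodes, slice_size):
    """Group nodes by descending ratio and make sure every slice has an exit."""
    ordered = sorted(nodes, key=lambda n: n.ratio, reverse=True)
    slices = [ordered[i:i + slice_size] for i in range(0, len(ordered), slice_size)]
    exits = [n for n in ordered if n.is_exit]
    if not exits:
        raise ValueError("no exit nodes available")
    for members in slices:
        if not any(n.is_exit for n in members):
            mean_ratio = sum(n.ratio for n in members) / len(members)
            # Borrow the exit whose ratio is closest to this slice's mean ratio.
            members.append(min(exits, key=lambda e: abs(e.ratio - mean_ratio)))
    return slices
```

Note that a borrowed exit stays in its own slice too, so it ends up in two slices; whether that skews its measurements would need checking.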
My cron has stopped mailing me complaints like this. I'm not sure if that means they're still happening and I've stopped being told, or if they magically disappeared.