We are collecting overload data and should do an analysis of what we currently have, answering questions like: Are there relays that are permanently overloaded? How is the overload distributed between the different overload categories we have? Are relays with particular flags (like Exit) especially affected by overload? Etc.
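As a starting point, here is a minimal sketch of how the flag question could be checked against Onionoo's details documents, which carry an `overload_general_timestamp` field for relays running tor >= 0.4.6. The script itself is illustrative and not part of our tooling:

```python
# Illustrative sketch: count relays currently reporting general overload
# and check how many of them carry the Exit flag. Assumes Onionoo's
# details documents expose overload_general_timestamp (only set for
# relays on tor >= 0.4.6 that reported overload).
import requests

ONIONOO = "https://onionoo.torproject.org/details"

def overloaded_relays():
    """Fetch relays that currently report general overload."""
    relays = requests.get(
        ONIONOO,
        params={"type": "relay",
                "fields": "fingerprint,flags,overload_general_timestamp"},
    ).json()["relays"]
    return [r for r in relays if r.get("overload_general_timestamp") is not None]

overloaded = overloaded_relays()
print(f"{len(overloaded)} relays currently report overload-general")
exit_count = sum(1 for r in overloaded if "Exit" in r.get("flags", []))
print(f"{exit_count} of them carry the Exit flag")
```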
I think I am almost done with all the tooling for now, so I have some graphs to share. They start from last week, even though I have data from earlier. We've been hunting Onionoo bugs for the past two weeks, which might affect the numbers, so I am not sure whether I want to take the earlier data into account. Maybe it's okay to just go with what we have right now.
Attached graphs:

- overload-general for relays
- overload-fd-exhausted for relays
- overload-ratelimits for relays
- bridge overload
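For reference, the plotting side of graphs like these is straightforward once the daily counts are collected. A minimal sketch, assuming the collection tooling writes one CSV with a count per overload category per day (the file name and column names are made up for this example):

```python
# Illustrative plotting sketch; assumes a CSV like:
#   date,overload_general,overload_fd_exhausted,overload_ratelimits
# produced by whatever collects the daily Onionoo snapshots.
import csv
import matplotlib.pyplot as plt

categories = ["overload_general", "overload_fd_exhausted", "overload_ratelimits"]
dates = []
series = {c: [] for c in categories}

with open("relay-overload-counts.csv") as f:
    for row in csv.DictReader(f):
        dates.append(row["date"])
        for c in categories:
            series[c].append(int(row[c]))

for c in categories:
    plt.plot(dates, series[c], label=c)
plt.legend()
plt.ylabel("relays reporting overload")
plt.savefig("relay-overload.png")
```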
There seems to be a pretty tight coupling between exit node general overload and relay general overload, which is really visible in the other two relay-related graphs. It's not clear why that's happening. Maybe it's due to tpo/core/tor#40491 (closed).
One thing I want to investigate a bit further is the significant overload-general drop on 10/15. Maybe that's caused by another issue with our Onionoo-based data.
Another thing I might add to our graphs to put things into perspective is the number of relays that are actually on Tor >= 0.4.6, given that overload is not reported by earlier versions. For now this boils down to:
which means roughly 1160 relays. Given the numbers in the graphs, that means between 20% and 25% of those relays are reporting overload right now.
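For reference, one way such a count could be derived from Onionoo, filtering client-side on the `version` field of the details documents (the version parsing below is illustrative):

```python
# Illustrative: count relays whose tor version is >= 0.4.6, i.e. relays
# new enough to report overload at all.
import requests

details = requests.get(
    "https://onionoo.torproject.org/details",
    params={"type": "relay", "fields": "version"},
).json()

def at_least_0_4_6(version):
    # Versions look like "0.4.6.8" or "0.4.7.1-alpha"; compare the first
    # three numeric components and skip anything unparseable.
    try:
        parts = tuple(int(p) for p in version.split(".")[:3])
    except (AttributeError, ValueError):
        return False
    return parts >= (0, 4, 6)

count = sum(1 for r in details["relays"] if at_least_0_4_6(r.get("version")))
print(f"{count} relays run tor >= 0.4.6")
```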
For bridges that's a bit harder to figure out currently due to a bug in our helper script (helper-scripts#13 (closed)).
> One thing I want to investigate a bit further is the significant overload-general drop on 10/15. Maybe that's caused by another issue with our Onionoo-based data.
This does not look like an Onionoo issue. It's mainly caused by the Emerald Onion folks restarting their exit fleet from time to time: they get overloaded fast (maybe due to tpo/core/tor#40491 (closed)) and then restart their relays when adding new servers, updating family settings, etc.