Specify overload relay descriptors for load balancing and network health

We need to specify descriptor fields to signify when relays:

  1. Are at or near CPU overload
  2. Are at or near OOM killer invocation
  3. Are at or near connection count limits
  4. Are at or near their token bucket limit
  5. Are accumulating too many cells in queues (circuitmux, TLS outbuf, AES)
  6. Are failing too many onionskins, TLS handshakes, other things?
  7. Are on the same machine as other relays (a flag or check to signify this)

The specified fields should expose only enough information to determine whether relays are at or near various forms of overload. They should not report detailed statistics, as these may aid DoS attacks and traffic analysis.
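As a rough illustration of reporting overload without detailed statistics, the sketch below reduces internal counters to a short list of coarse flags. Everything in it is hypothetical: the `RelayStats` fields, the flag names, and the 0.9 `NEAR_LIMIT` threshold are assumptions for illustration, not part of any existing Tor descriptor format or implementation.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical threshold: a resource counts as "at or near" its limit
# once it reaches 90% of capacity. The real value would be set by the spec.
NEAR_LIMIT = 0.9


@dataclass
class RelayStats:
    """Hypothetical internal counters; these stay private to the relay."""
    cpu_utilization: float       # fraction of CPU in use, 0.0 to 1.0
    memory_used_fraction: float  # fraction of the memory limit in use
    open_connections: int
    connection_limit: int


def overload_flags(stats: RelayStats) -> List[str]:
    """Reduce detailed counters to coarse flags safe to publish.

    Only the names of overloaded resources are returned; the exact
    values never leave the relay.
    """
    flags = []
    if stats.cpu_utilization >= NEAR_LIMIT:
        flags.append("cpu")
    if stats.memory_used_fraction >= NEAR_LIMIT:
        flags.append("memory")
    if (stats.connection_limit > 0
            and stats.open_connections / stats.connection_limit >= NEAR_LIMIT):
        flags.append("connections")
    return flags
```

A relay running at 95% CPU and 950 of 1000 connections would publish only the `cpu` and `connections` flags, which tells an observer that it is overloaded but not by how much.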

With this information, we will use sbws to avoid allocating extra load to these relays. We will also use these fields to report unhealthy relays on the metrics portal and to investigate other misbehavior.
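One possible reading of "avoid allocating extra load" is that a bandwidth scanner never raises the weight of a relay that reports overload. The sketch below shows that policy under stated assumptions: `cap_overloaded_weights` and its arguments are hypothetical names, it is not sbws code, and the actual policy might reduce weights instead of merely holding them.

```python
from typing import Dict, Set


def cap_overloaded_weights(
    previous: Dict[str, int],
    proposed: Dict[str, int],
    overloaded: Set[str],
) -> Dict[str, int]:
    """Return new weights where overloaded relays never gain weight.

    previous   -- weights from the last measurement round, keyed by relay
    proposed   -- weights the scanner would otherwise assign this round
    overloaded -- relays whose descriptors report an overload condition
    """
    adjusted = {}
    for relay, new_weight in proposed.items():
        if relay in overloaded:
            # Hold at the previous weight if the proposal is higher;
            # a relay with no previous weight keeps the proposed value.
            adjusted[relay] = min(new_weight, previous.get(relay, new_weight))
        else:
            adjusted[relay] = new_weight
    return adjusted
```

Weight decreases still pass through for overloaded relays, so load can drain away from them while no new load is added.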

I can work on this spec but I will need much input from @dgoulet.

Edited by Mike Perry