non-public relay health metrics for operators
Compared to other server daemons (web servers, DNS servers, ...), tor provides little data that operators can use to detect operational issues and anomalies.
I'd suggest providing the following stats via a Prometheus-compatible HTTP endpoint with authentication support (most of this data is already written to log files by default):
- total amount of memory used by the tor process
- number of currently open circuits
- circuit handshake stats (TAP / NTor)
- DoS mitigation stats
  - number of circuits killed with too many cells
  - number of circuits rejected
  - number of connections closed
  - number of single hop clients refused
- number of closed/failed circuits broken down by their reason value https://gitweb.torproject.org/torspec.git/tree/tor-spec.txt#n1402 https://gitweb.torproject.org/torspec.git/tree/control-spec.txt#n1994
- number of closed/failed OR connections broken down by their reason value https://gitweb.torproject.org/torspec.git/tree/control-spec.txt#n2205
- extra-info cell stats as defined in: https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n1072
If this causes a significant performance impact, this feature should be disabled by default.
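
To make the request more concrete, here is a rough sketch of how a couple of the stats above could look in the Prometheus text exposition format. This is only an illustration in Python, not a proposal for how tor should implement it: the metric names (`tor_relay_circuits_open`, `tor_relay_circuits_closed_total`) and the numbers are made up, and the `reason` label values are just two of the reason strings listed in control-spec.

```python
from typing import Dict, List, Tuple, Union

# One sample: (label set, value). An empty dict means "no labels".
Sample = Tuple[Dict[str, str], Union[int, float]]


def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name: str, help_text: str, metric_type: str,
                  samples: List[Sample]) -> str:
    """Render one metric family as Prometheus text exposition lines."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            pairs = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical metric names and made-up values, for illustration only.
    body = render_metric(
        "tor_relay_circuits_open", "Currently open circuits.", "gauge",
        [({}, 1523)],
    )
    body += render_metric(
        "tor_relay_circuits_closed_total",
        "Closed or failed circuits by reason.", "counter",
        [({"reason": "TIMEOUT"}, 12), ({"reason": "DESTROYED"}, 340)],
    )
    print(body, end="")
```

Using one counter with a `reason` label (instead of one metric per reason) would keep the "broken down by reason" stats easy to aggregate and graph on the Prometheus side.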