
provide relay health prometheus metrics via MetricsPort

tor recently got support for MetricsPort in v0.4.5.1-alpha (#40063 (closed)).

For more context on this feature request, see: https://lists.torproject.org/pipermail/tor-dev/2019-February/013655.html

I'm proposing to add the following Prometheus metrics (incl. labels). All metrics show absolute counters since tor started. Feel free to add constraints for safety reasons, such as reducing the granularity of counters or only updating counters once every x minutes.

on exit relays (DNS related metrics)

  • tor_relay_exit_dns_errors{reason="timeout"}
  • tor_relay_exit_dns_errors{reason="SERVFAIL"}
  • tor_relay_exit_dns_errors{reason="REFUSED"}

DNS RCODEs: https://www.iana.org/assignments/dns-parameters/dns-parameters.xhtml#dns-parameters-6

I'm not sure if this is even visible to tor (unless ServerDNSResolvConfFile is used), but if possible this kind of data would ideally be available per resolver IP, so the relay operator can detect and disable a faulty resolver:

  • tor_relay_exit_dns_errors{reason="timeout", resolver="1.1.1.1"}

  • tor_relay_exit_dns_errors{reason="timeout", resolver="8.8.8.8"}

  • ...

  • tor_relay_exit_maxdnsqueriespercircuit maximum number of DNS queries caused by a single circuit since tor started

  • exit stats (if enabled in torrc) as defined in: https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n1197
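To illustrate how the per-resolver labels could help an operator, here is a minimal Python sketch. It is not tor code: the counts are made up, and `format_sample`, `render` and `worst_resolver` are hypothetical helper names. It renders the proposed counter in the Prometheus text exposition format and picks the resolver with the most errors. Label values are assumed to need no escaping.

```python
from collections import Counter

def format_sample(name, labels, value):
    """Render one sample line in the Prometheus text exposition format.

    Simplification: label values are assumed to contain no quotes,
    backslashes or newlines, so no escaping is done.
    """
    if labels:
        body = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        return f"{name}{{{body}}} {value}"
    return f"{name} {value}"

# Made-up example counts, keyed by (reason, resolver).
dns_errors = Counter({
    ("timeout", "1.1.1.1"): 3,
    ("timeout", "8.8.8.8"): 41,
    ("SERVFAIL", "8.8.8.8"): 2,
})

def render(errors):
    """Render all DNS error counters as exposition text."""
    lines = ["# TYPE tor_relay_exit_dns_errors counter"]
    for (reason, resolver), count in sorted(errors.items()):
        labels = {"reason": reason, "resolver": resolver}
        lines.append(format_sample("tor_relay_exit_dns_errors", labels, count))
    return "\n".join(lines)

def worst_resolver(errors):
    """Return the resolver IP with the highest total error count."""
    totals = Counter()
    for (_reason, resolver), count in errors.items():
        totals[resolver] += count
    return totals.most_common(1)[0][0]

if __name__ == "__main__":
    print(render(dns_errors))
    print("most errors:", worst_resolver(dns_errors))
```

In practice the comparison would more likely be done with a query on the Prometheus side; the point here is only that a `resolver` label makes the faulty resolver identifiable.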

other relay metrics

  • tor_memory_bytes total amount of memory used by the tor process in bytes
  • tor_relay_dos_circuitskilledwithtoomanycells
  • tor_relay_dos_circuitsrejected
  • tor_relay_dos_markedaddress
  • tor_relay_dos_connectionsclosed
  • tor_relay_dos_singlehopclientsrefused
  • tor_relay_dos_introduce2rejected
  • tor_relay_opencircuits number of currently open tor circuits
  • tor_relay_connections{v="v1",direction="initiated"}
  • tor_relay_connections{v="v1",direction="received"}
  • tor_relay_connections{v="v2",direction="initiated"}
  • tor_relay_connections{v="v2",direction="received"}
  • tor_relay_connections{v="v3"...
  • tor_relay_traffic{direction="sent"} total traffic sent in bytes
  • tor_relay_traffic{direction="received"} total traffic received in bytes
  • tor_relay_circuit_handshakes{proto="TAP"}
  • tor_relay_circuit_handshakes{proto="NTor"}
  • tor_relay_uptime tor process uptime in seconds
  • tor_relay_version tor version in use
  • tor_relay_version_recommended boolean indicating whether the version in use is recommended
  • ...
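Because the counters above are absolute since tor started, a consumer has to derive rates from two scrapes and cope with the counter dropping back to zero when tor restarts. A minimal Python sketch of that calculation follows. `counter_rate` is a hypothetical helper, and treating any decrease as a restart is an assumption that mirrors how counter resets are commonly handled.

```python
def counter_rate(prev_value, curr_value, seconds):
    """Per-second rate between two scrapes of an absolute counter.

    Assumption: if the counter went down, tor restarted between the
    scrapes and the counter began again from zero, so the current
    value is itself the increase since the restart.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    if curr_value >= prev_value:
        delta = curr_value - prev_value
    else:
        delta = curr_value  # counter reset
    return delta / seconds

if __name__ == "__main__":
    # Two hypothetical scrapes of tor_relay_traffic{direction="sent"}, 60 s apart.
    print(counter_rate(1_000_000, 4_000_000, 60), "bytes/s")
```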

Flags

  • tor_relay_flag_stable
  • tor_relay_flag_guard
  • tor_relay_flag_exit
  • tor_relay_flag_...
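One possible encoding for the flag metrics, sketched in Python: each flag becomes a gauge with value 1 if the relay currently has the flag and 0 otherwise. The 0/1 encoding, the `TRACKED_FLAGS` tuple and the `flag_gauges` helper are assumptions for illustration; the proposal does not specify the values.

```python
# Flags named in the list above; further flags are deliberately left out.
TRACKED_FLAGS = ("Stable", "Guard", "Exit")

def flag_gauges(current_flags):
    """Map the relay's current flags to 0/1 gauge values."""
    current = set(current_flags)
    return {
        f"tor_relay_flag_{flag.lower()}": int(flag in current)
        for flag in TRACKED_FLAGS
    }

if __name__ == "__main__":
    for name, value in flag_gauges(["Stable", "Exit"]).items():
        print(name, value)
```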

Some more:

  • number of closed/failed circuits broken down by their reason value https://gitweb.torproject.org/torspec.git/tree/tor-spec.txt#n1402

  • number of closed/failed OR connections broken down by their reason value https://gitweb.torproject.org/torspec.git/tree/control-spec.txt#n2202

  • cell stats (if enabled in torrc) as defined in: https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n1137

Edited Jan 16, 2022 by nusenu