  • #40194
Created Nov 16, 2020 by nusenu@nusenu

provide relay health prometheus metrics via MetricsPort

tor recently got support for MetricsPort in v0.4.5.1-alpha (#40063 (closed)).

For more context on this feature request, see: https://lists.torproject.org/pipermail/tor-dev/2019-February/013655.html

I'm proposing to add the following Prometheus metrics (incl. labels). All metrics are absolute counters since tor started. (Feel free to add constraints, like reducing the granularity of counters or only updating counters once every x minutes, for safety reasons.)

on exit relays (DNS related metrics)

  • tor_relay_exit_dns_errors{reason="timeout"}
  • tor_relay_exit_dns_errors{reason="SERVFAIL"}
  • tor_relay_exit_dns_errors{reason="REFUSED"}

DNS RCODEs: https://www.iana.org/assignments/dns-parameters/dns-parameters.xhtml#dns-parameters-6

I'm not sure if this is even visible to tor (unless ServerDNSResolvConfFile is used), but if possible, this kind of data would ideally be available per resolver IP so the relay operator can detect and disable a faulty resolver:

  • tor_relay_exit_dns_errors{reason="timeout", resolver="1.1.1.1"}

  • tor_relay_exit_dns_errors{reason="timeout", resolver="8.8.8.8"}

  • ...

  • tor_relay_exit_maxdnsqueriespercircuit: maximum number of DNS queries caused by a single circuit since tor started

  • exit stats (if enabled in torrc) as defined in: https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n1197
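
To illustrate how an operator might use the per-resolver labels proposed above, here is a minimal sketch that parses Prometheus text-format lines and picks the resolver with the most timeouts. The metric does not exist in tor yet; the sample values, the `SAMPLE` string, and the helper names `parse_samples` and `errors_per_resolver` are all made up for illustration, and the parser only handles simple label values without escaped quotes.

```python
import re
from collections import defaultdict

# Hypothetical scrape output: the metric is only proposed in this issue,
# and the numbers are invented for the example.
SAMPLE = '''\
tor_relay_exit_dns_errors{reason="timeout", resolver="1.1.1.1"} 12
tor_relay_exit_dns_errors{reason="timeout", resolver="8.8.8.8"} 340
tor_relay_exit_dns_errors{reason="SERVFAIL", resolver="8.8.8.8"} 5
'''

# name, optional {labels}, then the sample value
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Yield (name, labels, value) per sample line; skip comments and blanks."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group('labels') or ''))
        yield match.group('name'), labels, float(match.group('value'))


def errors_per_resolver(text, reason):
    """Sum the proposed DNS error counter per resolver for one reason."""
    totals = defaultdict(float)
    for name, labels, value in parse_samples(text):
        if (name == 'tor_relay_exit_dns_errors'
                and labels.get('reason') == reason
                and 'resolver' in labels):
            totals[labels['resolver']] += value
    return dict(totals)


timeouts = errors_per_resolver(SAMPLE, 'timeout')
worst = max(timeouts, key=timeouts.get)
print(worst, timeouts[worst])
```

In practice a Prometheus alerting rule would probably do this job; the point is only that the `resolver` label makes the faulty resolver identifiable.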

other relay metrics

  • tor_memory_bytes: total amount of memory used by the tor process in bytes
  • tor_relay_dos_circuitskilledwithtoomanycells
  • tor_relay_dos_circuitsrejected
  • tor_relay_dos_markedaddress
  • tor_relay_dos_connectionsclosed
  • tor_relay_dos_singlehopclientsrefused
  • tor_relay_dos_introduce2rejected
  • tor_relay_opencircuits: currently open tor circuits
  • tor_relay_connections{v="v1",direction="initiated"}
  • tor_relay_connections{v="v1",direction="received"}
  • tor_relay_connections{v="v2",direction="initiated"}
  • tor_relay_connections{v="v2",direction="received"}
  • tor_relay_connections{v="v3"...
  • tor_relay_traffic{direction="sent"}: total traffic sent in bytes
  • tor_relay_traffic{direction="received"}: total traffic received in bytes
  • tor_relay_circuit_handshakes{proto="TAP"}
  • tor_relay_circuit_handshakes{proto="NTor"}
  • tor_relay_uptime: tor process uptime in seconds
  • tor_relay_version: the tor version in use
  • tor_relay_version_recommended: boolean indicating whether the version in use is recommended
  • ...
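
Since counters such as tor_relay_traffic would be absolute values since tor started, a consumer has to diff two scrapes to get a rate and cope with the counter dropping back to zero when tor restarts. A minimal sketch of that calculation follows; `counter_rate` is a hypothetical helper, the numbers are made up, and treating any decrease as a restart is an assumption (similar in spirit to how Prometheus' `rate()` handles counter resets).

```python
def counter_rate(prev_value, prev_time, cur_value, cur_time):
    """Per-second rate between two scrapes of an absolute counter.

    Times are in seconds. A decrease is assumed to mean tor restarted,
    so only the amount accumulated since the restart is counted.
    """
    elapsed = cur_time - prev_time
    if elapsed <= 0:
        raise ValueError("scrapes must be in increasing time order")
    if cur_value < prev_value:
        delta = cur_value  # counter reset: start again from zero
    else:
        delta = cur_value - prev_value
    return delta / elapsed


# Made-up scrapes of tor_relay_traffic{direction="sent"}, 60 seconds apart.
normal = counter_rate(1_000_000, 0, 4_000_000, 60)
after_restart = counter_rate(4_000_000, 60, 600_000, 120)
print(normal, after_restart)
```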

Flags

  • tor_relay_flag_stable
  • tor_relay_flag_guard
  • tor_relay_flag_exit
  • tor_relay_flag_...

Some more:

  • number of closed/failed circuits, broken down by their reason value: https://gitweb.torproject.org/torspec.git/tree/tor-spec.txt#n1402

  • number of closed/failed OR connections, broken down by their reason value: https://gitweb.torproject.org/torspec.git/tree/control-spec.txt#n2202

  • cell stats (if enabled in torrc) as defined in: https://gitweb.torproject.org/torspec.git/tree/dir-spec.txt#n1137
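
For the per-reason breakdowns above, the output could look like one counter with a `reason` label. The sketch below renders such a counter in the Prometheus text exposition format. The metric name `tor_relay_circuits_closed`, the reason strings, the counts, and the helper names are placeholders chosen for this example, not existing tor metrics or a proposal for exact naming.

```python
def escape_label_value(value):
    """Escape backslash, double quote and newline, as the text format requires."""
    return value.replace('\\', '\\\\').replace('"', '\\"').replace('\n', '\\n')


def render_counter(name, help_text, label, counts):
    """Render {label_value: count} as one labelled counter in text format."""
    lines = [f'# HELP {name} {help_text}', f'# TYPE {name} counter']
    for key in sorted(counts):
        lines.append(f'{name}{{{label}="{escape_label_value(key)}"}} {counts[key]}')
    return '\n'.join(lines) + '\n'


# Placeholder data: the metric name, reasons and counts are illustrative only.
output = render_counter(
    'tor_relay_circuits_closed',
    'Closed circuits by reason (hypothetical metric)',
    'reason',
    {'TIMEOUT': 7, 'DESTROYED': 42},
)
print(output, end='')
```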

Edited Jan 16, 2022 by nusenu