provide DNS health metrics for tor exit relay operators

https://lists.torproject.org/pipermail/tor-dev/2019-February/013655.html

Every now and then I'm in contact with relay operators about the "health" of their relays. Following these 1:1 discussions and the discussion on tor-relays@, I'd like to raise this issue with you (the developers), with the goal of improving relay operations and the end-user experience in the long term:

Current situation: Arthur Edelstein provides public DNS measurements for tor exit relay operators on his page, which is updated once daily: https://arthuredelstein.net/exits/

The process for using that data looks like this:

  • operators check Arthur's measurement results
  • if their failure rate is non-zero, they try to tweak/improve/change their setup
  • they wait another 24 hours for the next measurement to see whether the change helped

This feedback loop is slow, and external measurements taken once a day are probably also less accurate and less valuable than the data the tor process itself could provide.

Suggestion for improvement:

Expose the following DNS status information via tor's control port to help detect and debug DNS issues on exit relays:

(all values are totals since startup)

  • number of DNS queries sent to the resolver
  • number of DNS queries sent to the resolver due to a RESOLVE request
  • number of DNS queries sent to the resolver due to a reverse RESOLVE request
  • number of queries that did not result in any answer from the resolver
  • breakdown of the number of responses by response code (RCODE), see https://www.iana.org/assignments/dns-parameters/dns-parameters.xhtml#dns-parameters-6
  • maximum number of DNS queries sent per circuit
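
To make the list above more concrete, here is a rough sketch of how such counters could be kept and turned into a failure rate an operator can act on. This is not tor code and not an existing control port interface: the class `DnsCounters` and all of its field and method names are hypothetical and chosen only for illustration. Only the RCODE names are taken from the IANA registry linked above.

```python
from collections import Counter
from dataclasses import dataclass, field

# RCODE names for values 0-5 as listed in the IANA DNS parameters registry.
RCODE_NAMES = {
    0: "NoError",
    1: "FormErr",
    2: "ServFail",
    3: "NXDomain",
    4: "NotImp",
    5: "Refused",
}


@dataclass
class DnsCounters:
    """Hypothetical per-process DNS counters (totals since startup)."""
    queries_sent: int = 0             # all queries sent to the resolver
    resolve_queries: int = 0          # queries caused by RESOLVE requests
    reverse_resolve_queries: int = 0  # queries caused by reverse RESOLVE requests
    unanswered: int = 0               # queries that never got any answer
    max_queries_per_circuit: int = 0  # highest query count seen on one circuit
    rcodes: Counter = field(default_factory=Counter)  # responses per RCODE

    def record_response(self, rcode: int) -> None:
        """Count one resolver response with the given numeric RCODE."""
        self.rcodes[rcode] += 1

    def unanswered_rate(self) -> float:
        """Fraction of sent queries that received no answer at all."""
        if self.queries_sent == 0:
            return 0.0
        return self.unanswered / self.queries_sent

    def rcode_breakdown(self) -> dict:
        """Responses per RCODE, keyed by name (unknown codes as 'RCODE<n>')."""
        return {
            RCODE_NAMES.get(code, f"RCODE{code}"): count
            for code, count in sorted(self.rcodes.items())
        }


# Example with made-up numbers: 1000 queries, 25 of them unanswered.
stats = DnsCounters(queries_sent=1000, unanswered=25)
for rcode in (0, 0, 0, 2, 3):
    stats.record_response(rcode)

print(f"unanswered: {stats.unanswered_rate():.1%}")
print(stats.rcode_breakdown())
```

With counters like these available locally, an operator could check the unanswered rate and the ServFail count right after changing the resolver setup instead of waiting for the next daily measurement.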

If collecting these counters causes a significant performance impact, the feature should be disabled by default.