As of 2016-09-05 09:00 there are still about 37% (3233 out of 8592) of onionoo records without proper reverse DNS entries.
I assumed that the problem resides on the used upstream DNS server (more than the method to query it, since I thought that is a pretty standard procedure).
That is why I asked which DNS server you (onionoo.tpo) use:
When discussing this bug, it was pointed out on IRC that these examples that were given in the original report are misconfigured. While the reverse dns for these IPs is set, forward entries don't exist. Perhaps this is an additional check that onionoo does (and I think it probably should do). The host uses hetzner's nameservers, but they can resolve the reverse dns just fine.
The TTL of the entries of those hosts is set to something very unreasonably low. If you are in any way responsible for the relays mentioned above, please fix their DNS config. This is not to say that there isn't a bug still somewhere in onionoo, but afaict, not for the hosts mentioned above.
The TTL of the entries of those hosts is set to something very unreasonably low. If you are in any way responsible for the relays mentioned above, please fix their DNS config.
While the reverse dns for these IPs is set, forward entries don't exist. Perhaps this is an additional check that onionoo does (and I think it probably should do).
If onionoo requires PTR and A DNS records to match please add that information to the description, currently it does not mention that in any way:
host_name string optionalHost name as found in a reverse DNS lookup of the relay IP address. This field is updated at most once in 12 hours, unless the relay IP address changes. Omitted if the relay IP address was not looked up or if no lookup request was successful yet.
I care about rDNS records (even if PTR and A records don't match) because I'd like to use them for group detection for ornetradar and it is apparently very common that providers do not have A records for PTR records.
https://lists.riseup.net/www/info/ornetradar
new field: dns_ptr Description: DNS PTR record of the relay's primary IP address. This field is updated at most once in 12 hours, unless the relay's primary IP address changes. Omitted if no lookup request was successful yet.
change the host_name implementation by doing what the current description says (#25551 (moved))
I really don't like the dns_ptr name for the field. I think Onionoo works at a higher level than that. I think I definitely prefer the unverified_host_name and host_name approach. To get the behaviour you're looking for in JavaScript would be:
dns_ptr = relay.get('host_name', None) or relay.get('unverified_host_name', None)
The approach where you have dns_ptr and host_name means that when correctly configured, the host name would end up included twice. I just don't feel that that's the most elegant solution.
I'll take a look at a patch for the spec later this week, but to outline:
For host_name we implement the spec as it is (actually omitting the field instead of returning IP addresses).
For unverified_host_name, this will include the DNS PTR record's name. It would be updated at the same time as the host_name record. Omitted if the host name was verified by looking up an A record, or if no PTR record was found.
The DNS specification does not actually prohibit the presence of multiple PTR records. ):
This means it can be perfectly valid for an IP address to have multiple reverse names which could all be validated using forward names successfully.
I have found at least 3 relays that have more than one PTR record that validate with a forward record.
Alternative proposal:
For host_name we implement the spec as it is (actually omitting the field instead of returning IP addresses), but also deprecate the field.
Add a new verified_host_names field with an array of strings, this will include the DNS PTR record's names. It would be updated at the same time as the host_name record. Omitted if the host name was verified by looking up an A record, or if no PTR record was found.
Add a new unverified_host_names field with an array of strings, this will include the DNS PTR record's names. It would be updated at the same time as the host_name record. Omitted if the host name was verified by looking up an A record, or if no PTR record was found.
It would be possible to have either, both, or neither verified and unverified host names for each relay.
Minor suggestion: just leave the "host_name" field unchanged, deprecate it, and get rid of it two months later. Reason: changing that field would require a minor protocol version bump, and whoever cares this much about host names will start looking at the new fields immediately anyway.
In this case, is it worth correcting the spec to indicate that the field is always returned and that it will just contain the IP address if no lookup was successful?
since it means more work for me, I'd like to understand your motivation for returning the IP instead of null if there is no PTR record.
compatibility with a bug? since it will be a new field there is no need for compatibility?
I do not yet have a repository to push to on git.tpo, but to avoid blocking on this (#26643 (moved)) I have pushed my Onionoo branch to GitHub and opened a [request] against master which should allow for review.
In the protocol version summary: "Added a new optional X and Y fields". Unless the current text is correct English. To me it reads as if the Y part came in later and the rest of the sentence stayed the same.
For the deprecated field and the two newly added field, please include a date when they will be removed or have been added, respectively. Git history contains some examples.
The specification of "unverified_host_names" does not make it entirely clear whether a host name can show up in both newly added fields. It's the "regardless" part that threw me off. Maybe there's a way to make this clearer?
I do not yet have a repository to push to on git.tpo, but to avoid blocking on this (#26643 (moved)) I have pushed my Onionoo branch to GitHub and opened a [request] against master which should allow for review.
I took the opportunity and used GitHub's review capabilities. Please take a look there.
Great stuff! Really looking forward to merging this soon. Setting to needs_revision for now. Thanks!
All changes look good, with the only exception being the DomainNameSystem class that should ideally not be a singleton. See also my comment on GitHub. Setting back to needs_revision just for that one. Thanks!
There's another issue that I did not fix yet: domain names always end with a dot, as in "onionoo.torproject.org.". Can you find out why this is the case and fix it?
In general, please avoid rebasing branches under review. Just add fixup or squash commits until we're done, and I'll make sure the end result is squashed and rebased.
There's another issue that I did not fix yet: domain names always end with a dot, as in "onionoo.torproject.org.". Can you find out why this is the case and fix it?
This is because domain names always end with a dot, to indicate that they are absolute domain names. I'm sure I tested this and the current behaviour was to include the dot, but I must have been testing against the wrong thing as I can see now that we don't currently include the dot. I'll add a fixup commit to strip them.
In general, please avoid rebasing branches under review. Just add fixup or squash commits until we're done, and I'll make sure the end result is squashed and rebased.
Alright! Squashed those fixup commits and pushed to master. Not yet released, nor deployed, which is something we should make plans for at the next team meeting. Not yet closing, because we need to merge the metrics-web patch as part of the release. Thanks!