Our SOCKS hostname validation is overly lax.
This applies to both SOCKS4a and SOCKS5.
What we check (for rejection):
!tor_strisprint(req->address) || strchr(req->address,'\"')
What the RFC says (RFC 1035 - 2.3.1. Preferred name syntax):
<domain> ::= <subdomain> | " " <subdomain> ::= <label> | <subdomain> "." <label> <label> ::= <letter> [ [ <ldh-str> ] <let-dig> ] <ldh-str> ::= <let-dig-hyp> | <let-dig-hyp> <ldh-str> <let-dig-hyp> ::= <let-dig> | "-" <let-dig> ::= <letter> | <digit> <letter> ::= any one of the 52 alphabetic characters A through Z in upper case and a through z in lower case <digit> ::= any one of the ten digits 0 through 9
Changing the check to enforce isalnum(), '.' and '-' would cut out a lot more things that are flat out invalid.
The one concern (raised by nickm) is "are there clients that send IPv6 addresses as FQDNs"? This is plausible for SOCKS4a (because it's the only way to get IPv6 support to work), and a sign of broken client that deserves to be laughed at for SOCKS5. Calling
tor_inet_pton() on the FQDN before filtering will cover this case (and is probably something we should do anyway for SafeSocks).
This may be another legacy/trac#11138 (moved) will fix it sort of issue, though the changes are likewise trivial to do.