HS Descriptors should only contain UTF-8
In #18938, we discovered some relays publishing platform lines or extrainfo documents with non-ASCII characters.
This seems to occur due to memory corruption.
So let's make sure that a UTF-8 validity check is done by each of the following, on the descriptor layers listed:
- hidden services: unencrypted or encrypted descriptor
- HSDirs: encrypted descriptor
- clients: unencrypted or encrypted descriptor
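
To illustrate the kind of check meant here, a minimal sketch in Python (tor itself is written in C, so this is not the actual implementation). The function name `is_valid_utf8` is hypothetical, and the sketch assumes the descriptor body is available as raw bytes before any parsing:

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` is well-formed UTF-8.

    Hypothetical helper for illustration only. Python's strict decoder
    rejects stray continuation bytes, overlong encodings, and encoded
    surrogates, which is the sort of garbage memory corruption tends
    to produce.
    """
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# Pure ASCII and well-formed non-ASCII text both pass.
print(is_valid_utf8(b"Tor 0.3.0 on Linux"))
print(is_valid_utf8("pl\u00e4tform".encode("utf-8")))
# Bytes that are not well-formed UTF-8 are rejected.
print(is_valid_utf8(b"\xff\xfe"))
```

Note that this only checks well-formedness: valid non-ASCII UTF-8 still passes, so fields that must be ASCII need a separate check.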
See https://gitweb.torproject.org/torspec.git/tree/proposals/285-utf-8.txt#n45
This might require a spec change, though it's likely already implicit in the spec.
For completeness, the set of printable ASCII characters is defined in dir-spec.txt ... I assume "printing ASCII" means "space to tilde", but we should also clarify that in torspec.
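
Under the "space to tilde" reading, a printing-ASCII check is a simple byte-range test. A rough Python sketch, assuming that reading is the intended one (the spec does not yet say so explicitly); `is_printing_ascii` is a hypothetical name:

```python
def is_printing_ascii(data: bytes) -> bool:
    """Return True if every byte is in 0x20 (space) .. 0x7E (tilde).

    Assumption: "printing ASCII" means exactly this range, so tab,
    newline, DEL (0x7F), and all bytes >= 0x80 are rejected.
    An empty input is accepted.
    """
    return all(0x20 <= b <= 0x7E for b in data)


print(is_printing_ascii(b"Tor 0.2.9.9 on Linux ~"))
print(is_printing_ascii(b"tab\there"))
print(is_printing_ascii("\u00e9".encode("utf-8")))
```

Whether line terminators should be stripped before the check, or whether empty values are acceptable, depends on how the field is parsed and is left open here.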