Implement proposal 285: Directory documents should be standardized as UTF-8
Proposal 285 describes a migration process to go from the present state to a state in which all directory documents are standardized as UTF-8; @asn is interested in implementing it.
To get this done, you'll probably want to take something like the following steps:
- Verify that the description of UTF-8 in the proposal matches the implementation of UTF-8 by other systems that we interoperate with.
- Verify that our implementations of string_is_utf8() and string_is_utf8_no_bom() do what we want.
- In string_is_utf8_no_bom(), use fast_memeq instead of strcmpstart. The strcmpstart() function is for nul-terminated strings.
- Authorities should reject all descriptors and extrainfos that aren't valid UTF-8. (And I think they do.)
- Begin rejecting descriptors, extrainfo docs, microdescriptors, authority certificates, votes, and consensus documents if they are not UTF-8.
  - (There was a suggestion in the proposal's §2.2 that we should treat bridge descriptors differently, since bridge descriptors are not relayed through the authorities. But we can skip that suggestion and just require UTF-8 in descriptors now, since there is not much point in a bridge trying to fingerprint a client's version.)
- HSDirs should reject onion service descriptors that are not UTF-8.
- There should be a parameter in the consensus that controls whether the encrypted part of onion service descriptors should be UTF-8, as described in §2.1.
- When generating any of the above documents, report a bug and refuse to upload them if we have accidentally generated something that isn't UTF-8.
- Update dir-spec.txt to describe the new requirements.
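
For the string_is_utf8_no_bom() step, here is a rough sketch of what a length-bounded BOM check followed by a UTF-8 well-formedness check could look like. This is not the code in the tree: every sketch_* name is hypothetical, sketch_memeq() is only a plain memcmp() stand-in for fast_memeq(), and the validator only follows the well-formedness rules of RFC 3629 (no overlong forms, no surrogates, nothing above U+10FFFF). Any extra restrictions in the proposal are not modelled here. The point is that the buffer is described by (buf, len), so nothing ever depends on a terminating nul.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for fast_memeq(): compares exactly n bytes. */
static int
sketch_memeq(const void *a, const void *b, size_t n)
{
  return memcmp(a, b, n) == 0;
}

/* Return 1 if buf[0..len) starts with the UTF-8 byte order mark.
 * The length is checked first, so we never read past the buffer. */
static int
sketch_has_utf8_bom(const char *buf, size_t len)
{
  static const char bom[] = "\xEF\xBB\xBF";
  return len >= 3 && sketch_memeq(buf, bom, 3);
}

/* Return 1 if buf[0..len) is well-formed UTF-8 per RFC 3629, else 0. */
static int
sketch_is_utf8(const char *buf, size_t len)
{
  const unsigned char *p = (const unsigned char *)buf;
  size_t i = 0, k, n;
  unsigned long cp, min;

  while (i < len) {
    unsigned char c = p[i];
    if (c < 0x80) {                 /* single-byte (ASCII) */
      i++;
      continue;
    } else if ((c & 0xE0) == 0xC0) { /* lead byte of a 2-byte sequence */
      n = 1; cp = c & 0x1F; min = 0x80;
    } else if ((c & 0xF0) == 0xE0) { /* lead byte of a 3-byte sequence */
      n = 2; cp = c & 0x0F; min = 0x800;
    } else if ((c & 0xF8) == 0xF0) { /* lead byte of a 4-byte sequence */
      n = 3; cp = c & 0x07; min = 0x10000;
    } else {
      return 0;                     /* stray continuation or invalid byte */
    }
    if (len - i <= n)
      return 0;                     /* sequence truncated by end of buffer */
    for (k = 1; k <= n; k++) {
      if ((p[i + k] & 0xC0) != 0x80)
        return 0;                   /* not a continuation byte */
      cp = (cp << 6) | (p[i + k] & 0x3F);
    }
    /* Reject overlong encodings, surrogates, and out-of-range values. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
      return 0;
    i += n + 1;
  }
  return 1;
}

/* Return 1 if buf[0..len) is well-formed UTF-8 with no leading BOM. */
static int
sketch_is_utf8_no_bom(const char *buf, size_t len)
{
  if (sketch_has_utf8_bom(buf, len))
    return 0;
  return sketch_is_utf8(buf, len);
}
```

With this shape, a two-byte buffer holding only the first two BOM bytes is handled without touching a third byte, which is the case where a nul-terminated prefix comparison would be unsafe on unterminated input.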