prop224: HS descriptors do wasteful double-base64 encoding
In #21334 we implemented the new prop224 HS descriptor format, which uses double-layer encryption to implement the client auth functionality.
As part of that design, we first base64 the ciphertext of the inner layer of the descriptor. Then when we create the outer descriptor layer, we base64 the ciphertext of the middle layer which also includes the inner layer.
This results in a construction as follows:
middle_layer = base64(encrypt(client_auth_data + base64(encrypt(inner_layer))))
outer_layer = header + middle_layer.
Notice that in the above construction we actually base64 the inner layer twice, which is wasteful. During design and development we glossed over this fact thinking that it's not that big of a waste, and since we are already padding the whole middle_layer to 10k bytes it's fine (a typical default size for middle_layer is about 4k bytes).
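To get a feel for the overhead, here is a rough Python sketch that compares the size of the current double-base64 construction with a hypothetical one where the middle layer carries the inner ciphertext as raw bytes. It makes several simplifying assumptions that are not from the spec: encryption is modelled as length-preserving (no salt or MAC), base64 output has no line wrapping or PEM-style framing, and the function names and the 600/3000 byte inputs are made up for illustration.

```python
import base64


def double_encoded_size(auth_len: int, inner_len: int) -> int:
    """Size of base64(auth_data + base64(inner_ciphertext)).

    Encryption is treated as length-preserving, so zero-filled byte
    strings stand in for the ciphertexts (assumption, not the real format).
    """
    inner_b64 = base64.b64encode(bytes(inner_len))
    middle_plaintext = bytes(auth_len) + inner_b64
    return len(base64.b64encode(middle_plaintext))


def single_encoded_size(auth_len: int, inner_len: int) -> int:
    """Size of base64(auth_data + inner_ciphertext), i.e. a hypothetical
    binary middle layer that embeds the inner ciphertext unencoded."""
    middle_plaintext = bytes(auth_len + inner_len)
    return len(base64.b64encode(middle_plaintext))


if __name__ == "__main__":
    # Illustrative sizes only; not measured from real descriptors.
    auth_len, inner_len = 600, 3000
    current = double_encoded_size(auth_len, inner_len)
    proposed = single_encoded_size(auth_len, inner_len)
    print(f"double base64: {current} bytes")
    print(f"single base64: {proposed} bytes")
    print(f"saved:         {current - proposed} bytes")
```

Each base64 pass grows its input by a factor of 4/3, so the inner layer ends up at roughly 16/9 of its ciphertext size in the current design versus 4/3 with a single pass.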
However, Nick brought this topic up again during review and we decided to open a ticket to discuss it, since in theory we could define some sort of binary format for the middle layer and avoid the wasteful double base64.
The upside of this would be that we could fit more data in the middle layer. I estimated that we could fit an extra 1k bytes of data by addressing this. Based on some initial calculations, this means that we could fit an extra intro point in the default 10k-byte descriptor, or maybe another block of 16 authed clients (if we are lucky, since that's about 1.2k bytes).
The downside of this would be that we would have to go back into the design stage to spec the binary format, and then we would need to write the code to implement it, whereas now we are using the same decoding function for both layers. That's basically a simple matter of programming and time, and I'm definitely willing to do it if we decide it's the right thing to do.
We discussed this with David on IRC and decided that, given the current state of development, we should perhaps roll with the current design. That's because only a small number of hidden services would benefit from this change: default descriptors (3 intro points and no client auth) are just 4k bytes in size and get padded to 10k bytes anyway. Only descriptors with many intro points and client auth data that come to about 11k bytes would benefit, since they would no longer need to be padded to 20k bytes and could actually fit in 10k.
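The padding argument can be checked with a few lines of Python. This is only a sketch: it assumes padding rounds the size up to the next multiple of 10000 bytes (the "10k" and "20k" figures above), and `PAD_MULTIPLE` and `padded_size` are names made up for this example rather than anything from the tor codebase.

```python
PAD_MULTIPLE = 10_000  # assumed value of "10k bytes"; hypothetical constant


def padded_size(size: int) -> int:
    """Round size up to the next multiple of PAD_MULTIPLE (at least one block)."""
    blocks = max(1, -(-size // PAD_MULTIPLE))  # ceiling division
    return blocks * PAD_MULTIPLE


if __name__ == "__main__":
    # A default descriptor of about 4k bytes is padded to 10k either way,
    # so saving about 1k bytes changes nothing on the wire.
    print(padded_size(4_000), padded_size(4_000 - 1_000))
    # A descriptor of about 11k bytes drops from two blocks to one if the
    # estimated 1k bytes are saved.
    print(padded_size(11_000), padded_size(11_000 - 1_000))
```

Under these assumptions only descriptors sitting just above a padding boundary see any difference in their padded size.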
Opening this ticket seems like a good idea since 0.3.1 is the time to do this change if we ever want to; so that we don't feel silly in the future.