The first step here is confirming that indeed the rolling circuit cell digest covers all of the bytes in the cell, not just the "in use" (rh->length) ones. I think we're good there: see the two calls to crypto_digest_add_bytes() in relay_crypto.c that have CELL_PAYLOAD_SIZE as their size argument.
My task26846-demo branch has two commits that are an example of what we could do here.
If we like this direction, they'll still need unit tests and a changes file (at least).
We might also want to make a consensus parameter for the "how often to leave a byte empty" choice, but also maybe that's just overengineering things.
And maybe flipping coins every time we consider packaging is expensive -- we could probably get away with some cheaper randomness if we wanted. But I think we should only do that if we are pushed into it, since cheaper randomness is so easy to regret.
I also thought about keeping a "how many cells have we packaged on this circuit" counter and then just doing some sort of mod operation to decide which ones to decrement. But I decided that doing it probabilistically is better, because right now we pad short payloads with 0 bytes (#26228 (moved)), and somebody should check my logic but I think a rare but unpredictably short payload with a zero is enough to accomplish our goal here -- that is, we don't actually need a random byte, we just need to have it be unpredictable when the 0 will appear.
and somebody should check my logic but I think a rare but unpredictably short payload with a zero is enough to accomplish our goal here -- that is, we don't actually need a random byte, we just need to have it be unpredictable when the 0 will appear.
Ok, I have two counterexamples that make me resume thinking we need the unused portion of the payload to be random. The first counterexample is sort of ugly and it's unclear how practical it is as an attack, but the second counterexample is elegant and simple.
The first one is: what if the source of bytes at the destination side sends bytes out 400 at a time? Then we would always package them up 400 at a time, and we would never trigger the "decrement the length" function, because we would always only have partially full payloads anyway. Now, this isn't so bad for two reasons. One is that if you wanted to be sure to separate your 400 byte chunks so they go in different cells, you're really slowed down on the rate you can generate new cells. And the other related reason is that maybe the exit will package them in a different way than you expect, like 200 at a time every so often, or it will clump together two of your 400's sometimes, or etc. But still, I think this attack could work 'sometimes', and that is worrisome. It would be fixed by making the unused portion of the relay data cell payload random.
And the second attack is: what if the destination is sending a stream of 0 bytes? Then our payloads consist of NUL bytes, and every so often we make a short payload and put a NUL at the end. And our security relies on the attacker not being able to guess which bytes will be padding ("0") and which ones will be in-use-payload (also "0"). Oops. This attack would also be fixed by making the unused portion of the relay data cell payload random.
I think this logic argues for another subticket on #26288 (moved): "randomize the unused part of relay payloads".
Let's review these tickets at the next meeting using our 041-proposed process.
They're on the roadmap, so the review should focus on ticket size and team capacity (and sponsor expectations).
We reopened this ticket because it was never done.
The current code in master will fill unused bytes with randomness, but it will never introduce any extra unused bytes in cells. So if cells are always full, there will be no randomness, which will make the entire flow predictable to somebody who already knows what bytes they'll be receiving from the other side.
We need some feature, like the one in my branch from comment:3, that sometimes leaves a bit of space in cells.
If you base it on my branch above, which I think is still plausible since it's so simple, be sure to note that dgoulet stuck a "don't mess with the first 4 unused bytes" rule in there, so we'd need to open up at least 5.
Another option, which is less hacky (less probabilistic) but would require more code and more state, would be to have a counter (on each circuit) of how many full cells we've sent, and if we ever send a non-full cell then reset it, and if it ever reaches 1000, make some space in that cell.
And as a final note, it's not actually required (from a technical coordination perspective) that we get this feature in to this release, since we could in theory add this "make some space" logic in a future Tor, and that would be the Tor that finishes doing prop289 properly. But "why not now, it's not that big" is a solid reason for doing it now too.
I've done a draft here in a branch called ticket26846. What do you think of this approach, Roger? If you like the general idea, I'll clean it up a little more and add some tests.
Looks reasonable! Sad to use so much more code and per-circuit state but, everything seems to involve adding more and more code and state these days. :)
Cleaning up with attention to detail seems smart, e.g. in the commit message it says 28646, which is a different ticket.
One little nit: connection_edge_get_inbuf_bytes_to_package() in this branch returns 0 if !package_partial and n_available < RELAY_PAYLOAD_SIZE, even if we were planning to send a shorter payload and n_available is enough to send it. Not the end of the world I guess, but kind of weird, and avoided in my earlier branch. Could be fixed with a bit of gymnastics (like "reduce length at the top of the function but don't actually commit to changing state until later in the function").
I also spent a while trying to convince myself that there aren't situations where connection_edge_get_inbuf_bytes_to_package() can return 0 in this branch yet we changed the state. Would probably be smart to clearly delineate, in that function, the point at which we've committed to send a cell. (I think it is right after the final opportunity to "return 0;".)