FWIW, though I don't generally object to belt-and-suspenders engineering, I'm not 100% sure of this approach; if we think there's any chance that future outputs of our PRNG can be predicted from our past PRNG outputs, we should probably throw away the PRNG and use a good one, right? (Or for that matter, we should revise openssl to stop exposing PRNG outputs in that case too.)
Yes, we should always use a PRNG that's unpredictable (and switch as soon as we suspect it's not). On that topic, OpenSSL still uses SHA-1 internally for its PRNG, and we're trying to phase out SHA-1. But I'm not sure if known SHA-1 vulnerabilities affect its use in the OpenSSL PRNG.
Hashing PRNG output helps protect previous random outputs, if we discover later on that our PRNG was more predictable than we thought. (It works kinda like forward secrecy for random numbers.)
On the other hand, hashing PRNG output could introduce vulnerabilities if any bits of the hash function's output are correlated with each other.
It contains the modifications to crypto_rand, and new benchmarks comparing crypto_rand (hashed PRNG output) with crypto_rand_raw (PRNG output).
While I don't think PRNG performance is on the critical path: on typical random data requests, performance improves by up to 48% at smaller sizes (4 bytes), thanks to the hash output buffer, and degrades by up to 27% at larger sizes (32 bytes).
If we do choose to go ahead with this, we should run the unit tests on both crypto_rand_raw and crypto_rand. I can update the existing unit tests to do that, and I'd like to include the unit tests in legacy/trac#17697 (moved) as well. But I'll wait until we decide we want this change.
nickm, ioerror: I don't know whether we want to go ahead with hashing PRNG output.
(Rather than, say, using a PRNG that's not vulnerable like Dual EC was.)
Are you happy for me to close this ticket, or is it a feature we need?
Trac: Cc: N/A to nickm, ioerror; Status: needs_review to needs_information
I asked a cryptographer sitting next to me and they agreed that my suggested approach is reasonable. Hash the PRNG output. Or more directly: "Never reveal pure randomness outputs."
For example, what happens when someone uses RDRAND internally? In theory, nothing in our stack does that; in practice, I guess that is incorrect. Something in the kernel, for example? Some weird OpenSSL builds or something?
Not to say that RDRAND is backdoored but that if it was, I think a hash would prevent what we currently understand to be a way to exploit a Dual-EC like backdoor.
I asked a cryptographer sitting next to me and they agreed that my suggested approach is reasonable. Hash the PRNG output. Or more directly: "Never reveal pure randomness outputs."
Could you ask the cryptographer to jabber or IRC or email me quickly or something? I don't think the reasoning makes sense in this case, for a few reasons:
We know what the PRNG is.
OpenSSL already exposes the PRNG outputs (in ClientHello and ServerHello, at least), and we can't stop it from doing that without replacing the PRNG.
The PRNG is under our control. The answer to a questionable PRNG would seem to be replacing it with a non-broken PRNG.
For example, what happens when someone uses RDRAND internally?
We detect attempts to replace the OpenSSL PRNG with the RDRAND engine, or any other engine; see legacy/trac#10402 (moved).
In theory, nothing in our stack does that; in practice, I guess that is incorrect. Something in the kernel, for example? Some weird OpenSSL builds or something?
Not to say that RDRAND is backdoored but that if it was, I think a hash would prevent what we currently understand to be a way to exploit a Dual-EC like backdoor.
It would only prevent it if openssl itself stopped exposing direct outputs, right?
I asked a cryptographer sitting next to me and they agreed that my suggested approach is reasonable. Hash the PRNG output. Or more directly: "Never reveal pure randomness outputs."
Could you ask the cryptographer to jabber or IRC or email me quickly or something? I don't think the reasoning makes sense in this case, for a few reasons:
We know what the PRNG is.
I guess "never" was the operative word.
OpenSSL already exposes the PRNG outputs (in ClientHello and ServerHello, at least), and we can't stop it from doing that without replacing the PRNG.
Yes, though the question is - do we want our protocol to make the same mistake? I think no.
The PRNG is under our control. The answer to a questionable PRNG would seem to be replacing it with a non-broken PRNG.
We hope.
For example, what happens when someone uses RDRAND internally?
We detect attempts to replace the OpenSSL PRNG with the RDRAND engine, or any other engine; see legacy/trac#10402 (moved).
If that fails, do we fail closed or open?
In theory, nothing in our stack does that; in practice, I guess that is incorrect. Something in the kernel, for example? Some weird OpenSSL builds or something?
Not to say that RDRAND is backdoored but that if it was, I think a hash would prevent what we currently understand to be a way to exploit a Dual-EC like backdoor.
It would only prevent it if openssl itself stopped exposing direct outputs, right?
If openssl did things correctly, I'd still suggest our protocol should not reveal things. An extra hash shouldn't hurt and it may mitigate issues, especially if our analysis were incorrect. Even if we properly prevent RDRAND's use with OpenSSL, if we ever get random data from the kernel, I'm really unsure that we can prevent that from using RDRAND. I guess I'd rather be safe than sorry and I'm not convinced I understand every aspect enough.
Additionally, alternative implementations of Tor will be more fragile than Tor.
Okay, I still think I'm right, but after an IRC discussion with ioerror, I'm a bit concerned about the number of smart cautious people who disagree with me.
I think our best move forward here is:
Never do anything raw with the system entropy, except feeding it into a PRNG or a hash.
Whenever the PRNG's output might be exposed, hash it first. Split crypto_rand() into crypto_rand() and crypto_rand_exposed().
Amend specifications with a note on PRNG usage, describing the above.
feature17694_strongest_027 starts this, based on the code for legacy/trac#17686 (moved). I suggest it for inclusion in 0.2.8 only. I'm going to extend it to know about other PRNGs.
feature17694_strongest_027 starts this, based on the code for legacy/trac#17686 (moved). I suggest it for inclusion in 0.2.8 only. I'm going to extend it to know about other PRNGs.
I would really like to see comments added to crypto_strongest_rand() since this is not that obvious to get at first glance. For instance:
Why is inp twice the size, with the second half randomized differently? (I assume it's to mix different entropy sources, but it would be nice to explain why we do that.)
Why do we tor_assert(0) instead of clean exit?
The rest LGTM! I'm among those in favor of this as well, since it will simplify things for prop#250, which right now explicitly hashes the return value of crypto_rand() as a precaution against leaking any raw bytes to the wire.
feature17694_strongest_027 starts this, based on the code for legacy/trac#17686 (moved). I suggest it for inclusion in 0.2.8 only. I'm going to extend it to know about other prngs.
This patch only hashes the entropy used in keys.
I don't think this achieves the overall goal: "make sure we never leak raw PRNG output to the network".
We can easily leak raw PRNG output via salts, nonces and other randomly chosen values that are sent on the wire.
Even our "random" choices of relays could leak some bits.
I don't think this achieves the overall goal: "make sure we never leak raw PRNG output to the network".
We can easily leak raw PRNG output via salts, nonces and other randomly chosen values that are sent on the wire.
Even our "random" choices of relays could leak some bits.
At some point this becomes rather silly, not to mention expensive, to the point where "we should ditch OpenSSL's CSPRNG if we don't trust it when its state gets exposed, instead of always passing output through extra hash functions" becomes compelling.
IMO that point now has been reached. Others are free to disagree with me.
I don't think this achieves the overall goal: "make sure we never leak raw PRNG output to the network".
We can easily leak raw PRNG output via salts, nonces and other randomly chosen values that are sent on the wire.
Even our "random" choices of relays could leak some bits.
At some point this becomes rather silly, not to mention expensive, to the point where "we should ditch OpenSSL's CSPRNG if we don't trust it when its state gets exposed, instead of always passing output through extra hash functions" becomes compelling.
Depending on how it's implemented, the hashing has a performance cost of about -40% (if we hash an entire buffer, then use it for multiple small random calls) to +30% on Tor's typical input sizes.
IMO that point now has been reached. Others are free to disagree with me.
To be clear, I'm agnostic on hashing PRNG output. But if we want to prevent leaking PRNG bits, we should hash all the bits that could be sent out from or be observed from outside the process.
Depending on how it's implemented, the hashing has a performance cost of about -40% (if we hash an entire buffer, then use it for multiple small random calls) to +30% on Tor's typical input sizes.
For the former case, that reads suspiciously like "re-implement libottery as a Hash_DRBG", sitting on top of another hash based CSPRNG (OpenSSL's).
IMO that point now has been reached. Others are free to disagree with me.
To be clear, I'm agnostic on hashing PRNG output. But if we want to prevent leaking PRNG bits, we should hash all the bits that could be sent out from or be observed from outside the process.
I'm mostly agnostic here. I think anything more than what nickm's branch does warrants re-evaluating why we are doing this, and whether there's a way to get what we want efficiently (since RAND_bytes() already hashes entropy from the system entropy source, I'm not at all convinced that it isn't good enough).
(After a quick look over the code, I do agree with teor that the vast majority of our PRNG usage should be regarded as "potentially visible to an observer.")
../src/common/crypto.c: In function 'crypto_strongest_rand':
../src/common/crypto.c:2530:1: error: stack protector not protecting local variables: variable length buffer [-Werror=stack-protector]
cc1: all warnings being treated as errors
It looks like using SHA512_DIGEST_LENGTH indirectly isn't working. Why wasn't it used directly? And why is SHA512_DIGEST_LENGTH used while the rest of the Tor code base uses DIGEST512_LEN?
Better now?
Jenkins seems to be happy, so yes. Thank you.
I did a review of the crypto_strongest_rand function, and I believe the out addition and out_len subtraction in the else block can be removed. (At first I thought this would cause a buffer overflow, until I saw the break statement, but it still causes an integer overflow in out_len.)