It turns out that we only get 32 bits of random mantissa on 64-bit platforms, so fixing this bug might be harder than I thought:
{{{
/* Test for regression to bug 23061, where we produced excessively granular
 * random double values. */
int low_bits_zero_count = 0;
do {
  /* First check the value is within the range */
  d = crypto_rand_double();
  tt_assert(d >= 0);
  tt_assert(d < 1.0);
  /* Now check the granularity, by finding the last bit of the result.
   * doubles have a 53 bit mantissa. */
  d *= pow(2, 32);
  d -= trunc(d);
  low_bits_zero_count++;
  /* If the granularity is correct, this will fail with probability
   * 1 in 2**100, which is approximately the RAM bit error rate. */
  tt_assert(low_bits_zero_count <= 100);
} while (d == 0.0);
}}}
See my branch rand-double-029, which is based on maint-0.2.9. It comes with unit tests and a rewritten crypto_rand_double() that uses rejection sampling. The new unit tests pass for me on macOS 10.12 x86_64 and i386.
We might want to backport this to 0.2.6, where we added laplace noise using this function. But when I tried to cherry-pick the commits, it was non-trivial. (Someone else is welcome to do this!)
The only users of crypto_rand_double() before 0.2.6 are tests, so we don't need to backport it any further.
Trac: Actualpoints: N/A to 0.5
Keywords: N/A deleted; 026-backport-maybe, 031-backport, 029-backport, 027-backport-maybe, 030-backport, 028-backport-maybe added
Status: new to needs_review
The title is wrong, this actually affects LP64 platforms as well.
And the bias affects all platforms.
Trac: Summary: "crypto_rand_double() should produce all possible outputs on 32-bit platforms" to "crypto_rand_double() should produce all possible outputs on platforms with 32-bit int"
23:34 <+teor> Yes, that code is definitely better. But we'll need the following tweaks:
23:35 <+teor> 1. Check if the internal double representation is the one we expect, and if not, fall back to the code in my existing branch
23:35 <+teor> (There's no requirement in the C standard that double has a particular binary representation)
23:36 <+teor> 2. Add a unit test that shows that *if* we're using IEEE754 doubles, then we can get a few values in (0, 1/2**64)
23:37 <+teor> 3. Fix the crypto_rand() mocking to stop assuming that we only ask for 64 bits
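A hedged sketch of tweak 1, using only the standard `<float.h>` macros (the macro name `IEEE754_BINARY64_LIKELY` is invented here for illustration, not taken from the branch):

```c
#include <float.h>

/* Compile-time heuristic: do doubles look like IEEE 754 binary64?
 * The C standard guarantees none of this, so any code relying on the
 * binary64 layout must keep a fallback path for other platforms. */
#if FLT_RADIX == 2 && DBL_MANT_DIG == 53 && \
    DBL_MAX_EXP == 1024 && DBL_MIN_EXP == -1021
#define IEEE754_BINARY64_LIKELY 1
#else
#define IEEE754_BINARY64_LIKELY 0
#endif
```

Tweak 2 then reduces to asserting that, when this heuristic holds, values such as `ldexp(1.0, -65)` are nonzero and strictly inside (0, 1/2**64).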
I have a rough attempt at implementing an approximation of Downey's algorithm in a "rand-double-029" branch in my public repo; how does the approach look? How scary are my shortcuts? Is there a better way to do this?
In ba199f789922484b8a2b2efd909ad3dab124dd62, why define EXPECTED_RAND_MANTISSA_BITS as a hardcoded number instead of using DBL_MANT_DIG? (assuming you've already tested for FLT_RADIX==2)
Not directly related to any of the changes, but crypto_rand_double() could use a comment clarifying its contract. That would also help us make clear what we're trying to achieve. Are the double values assumed to be uniformly distributed in the range 0.0 <= d < 1.0? How much entropy is it supposed to have? Is it supposed to produce all representable double values? (assuming only the values that are more frequent than 2**-b where b is the claimed entropy; for IEC 60559 doubles, this means we can ignore subnormals unless we're using more than a thousand bits of entropy)
Regarding the ldexp() approach, we could save operations by doing them in 32-bit chunks (because I think we can assume at least 32 bits of mantissa for a double).
> In ba199f789922484b8a2b2efd909ad3dab124dd62, why define EXPECTED_RAND_MANTISSA_BITS as a hardcoded number instead of using DBL_MANT_DIG? (assuming you've already tested for FLT_RADIX==2)
Because we might decide that the contract is N bits of mantissa, rather than DBL_MANT_DIG.
(For example, on my machine, we get 56. Before the patch, we got 31 on i386 and 32 on x86_64.)
> Not directly related to any of the changes, but crypto_rand_double() could use a comment clarifying its contract. That would also help us make clear what we're trying to achieve. Are the double values assumed to be uniformly distributed in the range 0.0 <= d < 1.0? How much entropy is it supposed to have? Is it supposed to produce all representable double values? (assuming only the values that are more frequent than 2**-b where b is the claimed entropy; for IEC 60559 doubles, this means we can ignore subnormals unless we're using more than a thousand bits of entropy)
Yes, I agree we need to define the contract better.
We need uniform distribution.
While it would be nice to deliver all values in the range, values below 10^-13^ (or 2^-43^) are more likely to be produced by RAM bit errors than entropy, so at some point we can ignore small values.
> Regarding the ldexp() approach, we could save operations by doing them in 32-bit chunks (because I think we can assume at least 32 bits of mantissa for a double).
As long as ldexp() supports 32 bits of input.
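A minimal sketch of the 32-bit-chunk idea (the function name is ours, not tor's): combine two 32-bit random words with ldexp(). Note the caveat in the comment about rounding at the top of the range.

```c
#include <stdint.h>
#include <math.h>

/* Combine two 32-bit random words into a double in [0, 1].
 * Caveat: under the default round-to-nearest mode, inputs near the top
 * of the range round up to exactly 1.0 (the true sum 1 - 2^-64 is not
 * representable), so a real implementation must still exclude or mask
 * the endpoint. */
double
two_chunks_to_double(uint32_t hi, uint32_t lo)
{
  return ldexp((double)hi, -32) + ldexp((double)lo, -64);
}
```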
And we might need to modify the ldexp() approach, because the current one only gives 64 bits of entropy. The algorithm could be:
Generate a random 64-bit double using ldexp()
If it's zero, generate a random 64-bit double and scale down using ldexp() to the next 64 bits
Terminate when we reach our desired entropy (the contract), which should be based on:
the RAM bit error rate,
the maximum subnormal double bits (DBL_EXP_DIG + ceil(log2(DBL_MANT_DIG))).
Unless we expect really good RAM bit error rates in future RAM, I suggest we go for 64 or 128 bits of entropy, and leave it there. (Also, the likelihood of any smaller numbers is minuscule, and some calculations will turn them into zeroes or infinities anyway.)
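Unrolled to the 128-bit case, and taking the random words as parameters for clarity, the scale-down algorithm above might look like this. This is a sketch (name and signature are ours), not the branch's code, and the usual top-of-range caveat applies:

```c
#include <stdint.h>
#include <math.h>

/* Sketch of the scale-down algorithm with 128 bits of entropy.
 * first64/next64 stand in for two successive 64-bit random outputs.
 * Caveat: (double)first64 rounds to 53 mantissa bits, so values near
 * UINT64_MAX can round up to exactly 1.0 and need special handling. */
double
scaled_random_double(uint64_t first64, uint64_t next64)
{
  if (first64 != 0)
    return ldexp((double)first64, -64);  /* the overwhelmingly likely case */
  /* First 64 bits were all zero: scale the next 64 bits into [0, 2^-64). */
  return ldexp((double)next64, -128);
}
```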
We have a lot of options here, depending on what we want! Let's try to collect the possible goals, and see which we care about.
Here are some goals I think we probably care about, but I could be wrong:
We should return a number uniformly at random in the range [0, 1.0). (That is, for all "suitable" x<y in [0,1.0), we should return a value in [x,y] with probability very close to y-x. Defining "suitable" and "very close" will be important, and might not include every possible double.)
Return outputs with at least some minimum granularity. (i.e, for some granularity delta, if x is a possible output, and x ± delta is in [0.0, 1.0), then there exists a possible output between x and x ± delta other than x.)
Run with reasonable efficiency.
Run in constant time.
Use the whole mantissa, or almost the whole mantissa.
Provide at least some number of bits of entropy in the output.
Work at least to a minimal degree on all C99 platforms.
Here are some goals I think we do not care about, but I could be wrong:
Work perfectly on systems where FLT_RADIX is not 2.
Provide identical output on all architectures regardless of floating-point implementation.
Return every possible output with some probability. (For example, values less than 1e-300 are possible doubles. But they have cumulative probability of 1e-300, which is less likely than just guessing the RNG seed on the first try.)
Possibly return subnormal values.
Perfect behavior on corner cases with total probability less than some epsilon (maybe 2^-96)?
I think one way we could choose between our goals is to look at how the function is used (and could be used for privcount-in-tor's Gaussians, and could be used in other places where we synthesise a random double using a similar method).
For example, if we used the naïve algorithm that divides [0, UINT64_MAX] by (UINT64_MAX+1), I think we get a pattern like:
0, 1/2^64^, 2/2^64^, 3/2^64^, ... , 2^53^/2^64^, 2^53^/2^64^, (2^53^+2)/2^64^, ... , (2^64^ - 2^11^ - 2^10^)/2^64^ (~2^11^ times), 1.0 (~2^10^ times)
due to representation limits (the details would vary depending on the rounding mode and possibly the hardware).
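For concreteness, the naïve conversion and the two artifacts described above (collisions past 2^53^, and rounding up to 1.0 at the top) can be demonstrated directly; the helper name is ours, for illustration only:

```c
#include <stdint.h>

/* Naïve conversion: x / 2^64. The implicit uint64_t -> double
 * conversion keeps only 53 mantissa bits, which is what causes the
 * collisions and the rounding to 1.0; the division by a power of two
 * is itself exact. */
double
naive_u64_to_double(uint64_t x)
{
  return (double)x / 18446744073709551616.0; /* 2^64 */
}
```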
I wonder if this satisfies the requirements for our random noise distributions (which is where we mainly use this function) after being passed through the Laplace and Gaussian transforms.
We should document their range and granularity in a similar level of detail, too.
> Here are some goals I think we probably care about, but I could be wrong:
If that's what you want:
{{{
double
uint64_to_dbl_0_1(uint64_t x)
{
  /* Can't merely check for __STDC_IEC_559__ because not every compiler we
   * care about defines it. */
#if FLT_RADIX != 2
#error FLT_RADIX != 2, your system is the most special of them all.
#endif
  x >>= (sizeof(double) * CHAR_BIT) - DBL_MANT_DIG;
  return (DBL_EPSILON/2) * x;
}
}}}
> We should return a number uniformly at random in the range [0, 1.0). (That is, for all "suitable" x<y in [0,1.0), we should return a value in [x,y] with probability very close to y-x. Defining "suitable" and "very close" will be important, and might not include every possible double.)
Check.
> Return outputs with at least some minimum granularity. (i.e, for some granularity delta, if x is a possible output, and x ± delta is in [0.0, 1.0), then there exists a possible output between x and x ± delta other than x.)
Check.
> Run with reasonable efficiency.
Check.
> Run in constant time.
Check.
> Use the whole mantissa, or almost the whole mantissa.
Check.
> Provide at least some number of bits of entropy in the output.
Check.
> Work at least to a minimal degree on all C99 platforms.
If people want to run tor on something that is exotic to the point where this sort of approach breaks, they can send patches.
Yes, this still leaves out "possible" values, but it trivially accomplishes uniform, fast, and constant time.