Investigate building ed25519-donna with SSE2 support.

Follow-up to legacy/trac#9663 (moved) and legacy/trac#16467 (moved): now that we include ed25519-donna, we should look into building it with SSE2 support where appropriate.

Adding something like this to ed25519-donna-portable.h should do the trick:

/* Tor: Build with SSE2 where it makes sense to do so. */
#if defined(__SSE2__) && !(defined(__x86_64__) || defined(__amd64__)) 
#define ED25519_SSE2
#endif
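
To check what a given toolchain and set of flags would actually pick up, the same condition can be wrapped in a small helper. This is only a sketch: `donna_would_use_sse2` is a hypothetical name for illustration, not something that exists in ed25519-donna or Tor.

```c
/* Same condition as the proposed snippet: SSE2 is available at compile
 * time, and the target is not 64-bit x86. */
#if defined(__SSE2__) && !(defined(__x86_64__) || defined(__amd64__))
#define ED25519_SSE2
#endif

/* Hypothetical helper (not part of ed25519-donna): returns 1 if this
 * build would select the SSE2 code path, 0 otherwise. */
static int
donna_would_use_sse2(void)
{
#if defined(ED25519_SSE2)
  return 1;
#else
  return 0;
#endif
}
```

With GCC or Clang targeting 32-bit x86, `__SSE2__` is only defined when SSE2 code generation is enabled (for example via `-msse2` or a `-march=` value that implies it), so the helper returns 0 for a plain `-m32` build and for any x86_64 build.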

Potential pitfalls:

  • SSE2 builds benchmark worse than non-SSE2 builds on x86_64, at least on Haswell.
  • This opens us up to potentially really obnoxious compiler bugs/pathologically bad code generation.
  • Most distribution packages probably don't build for an architecture that will define __SSE2__, so this would only get picked up by people doing custom builds.

Open questions:

  • Is this actually faster on 32-bit Intel?
  • Is doing something like "always building SSE2 and non-SSE2 and using CPUID to pick at runtime" sensible? (I personally think "No").
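
For reference, the runtime-selection alternative from the last question would look roughly like the sketch below. It assumes GCC or Clang on x86, where `__builtin_cpu_init` and `__builtin_cpu_supports` are available. The backend functions and `pick_backend` are hypothetical stand-ins, not ed25519-donna's API. In a real build the SSE2 variant would also have to live in its own translation unit compiled with `-msse2`, which is part of what makes this approach costly.

```c
/* Hypothetical backend identifiers, for illustration only. */
typedef enum { BACKEND_REF = 0, BACKEND_SSE2 = 1 } backend_id;

/* Stand-ins for the two builds of the library; each just reports
 * which variant it is. */
static backend_id backend_ref(void)  { return BACKEND_REF; }
static backend_id backend_sse2(void) { return BACKEND_SSE2; }

typedef backend_id (*backend_fn)(void);

/* Returns 1 if the running CPU reports SSE2, 0 otherwise (including
 * on compilers or architectures where we cannot ask). */
static int
cpu_has_sse2(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
  __builtin_cpu_init();
  return __builtin_cpu_supports("sse2") ? 1 : 0;
#else
  return 0;
#endif
}

/* Pick a backend once at startup based on the CPUID result. */
static backend_fn
pick_backend(void)
{
  return cpu_has_sse2() ? backend_sse2 : backend_ref;
}
```

Note that this picks SSE2 whenever the CPU supports it, which per the benchmark pitfall above would be the wrong choice on x86_64. A real dispatcher would need the same architecture exclusion as the compile-time check.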