Parallel Crypto: Design a good crypto parallelization plan and architecture
Lest we repeat the design mistakes of cryptography programs past ("Ready! Fire! Aim!"), we should come up with a good design for how to split our crypto across multiple CPUs (legacy/trac#1749) before we get too deeply involved in the coding.
This should at the very least include figuring out what new data structures we need, what new code we need, what runs in subthreads, and how to minimize the number of calls from subthreads back to the main thread.
(It's okay for this to be threaded and not multiprocess, BTW: Everybody who matters besides OpenBSD has kernel threads nowadays.)
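To make the questions above concrete, here is a minimal sketch of one possible shape for the design: a job queue feeding worker threads and a reply queue coming back, so that workers never call into the main thread and only hand finished results over. Everything in it is an assumption for illustration and not a proposal for the actual code: the names CryptoJob, CryptoReply, CryptoPool, and worker_loop are hypothetical, SHA-256 stands in for the real crypto operation, and Python is used only to keep the sketch short.

```python
import hashlib
import queue
import threading
from dataclasses import dataclass


@dataclass
class CryptoJob:
    """Hypothetical unit of work handed to a subthread."""
    job_id: int
    payload: bytes


@dataclass
class CryptoReply:
    """Hypothetical result handed back to the main thread."""
    job_id: int
    digest: bytes


def worker_loop(jobs: queue.Queue, replies: queue.Queue) -> None:
    """Runs in a subthread: touches only the two queues, never main-thread state."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: time to exit
            break
        # Stand-in for the expensive crypto operation.
        digest = hashlib.sha256(job.payload).digest()
        replies.put(CryptoReply(job.job_id, digest))


class CryptoPool:
    """Owns the worker threads and both queues (illustrative only)."""

    def __init__(self, n_workers: int = 4) -> None:
        self.jobs: queue.Queue = queue.Queue()
        self.replies: queue.Queue = queue.Queue()
        self.threads = [
            threading.Thread(
                target=worker_loop, args=(self.jobs, self.replies), daemon=True
            )
            for _ in range(n_workers)
        ]
        for t in self.threads:
            t.start()

    def submit(self, job: CryptoJob) -> None:
        """Called from the main thread to queue a job."""
        self.jobs.put(job)

    def drain_replies(self, expected: int) -> dict:
        """Called from the main thread; blocks until `expected` replies arrive."""
        results = {}
        while len(results) < expected:
            reply = self.replies.get()
            results[reply.job_id] = reply.digest
        return results

    def shutdown(self) -> None:
        """Send one sentinel per worker, then wait for all of them to exit."""
        for _ in self.threads:
            self.jobs.put(None)
        for t in self.threads:
            t.join()


if __name__ == "__main__":
    pool = CryptoPool(n_workers=3)
    for i in range(5):
        pool.submit(CryptoJob(i, bytes([i]) * 64))
    done = pool.drain_replies(5)
    pool.shutdown()
    print(sorted(done))
```

In this sketch the only cross-thread traffic is the two queues, which is one way to keep calls from subthreads back to the main thread at zero. A real main loop would presumably poll the reply queue without blocking, or be woken by a notification, rather than waiting in drain_replies.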