Use faster AES_CTR, SHA1 implementations?
If we can make these any faster, the gain will show up in our profiles: they are two of the largest contributors to our CPU usage.
Both are already optimized and use assembly, but the numbers below suggest that we could do better.
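To compare candidate implementations, a small throughput harness can help. The sketch below is only an illustration, not the benchmark used for this ticket: it times SHA-1 through Python's standard `hashlib`. The function name `sha1_throughput`, the buffer sizes, and the 512-byte default chunk (chosen to approximate small, cell-sized inputs) are all assumptions.

```python
import hashlib
import time


def sha1_throughput(total_bytes=16 * 1024 * 1024, chunk_size=512):
    """Return approximate SHA-1 throughput in MiB/s.

    Hypothetical helper: feeds `total_bytes` of zero bytes to SHA-1 in
    `chunk_size`-byte updates. Small chunks are an assumption meant to
    mimic hashing many short inputs rather than one large buffer.
    """
    chunk = b"\x00" * chunk_size
    n_chunks = total_bytes // chunk_size
    h = hashlib.sha1()

    start = time.perf_counter()
    for _ in range(n_chunks):
        h.update(chunk)
    # Guard against a zero interval on very coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-9)

    return (n_chunks * chunk_size) / (1024 * 1024) / elapsed


if __name__ == "__main__":
    # Compare small and large update sizes; results vary by machine.
    for size in (64, 512, 8192):
        print(f"chunk={size:5d} bytes: {sha1_throughput(chunk_size=size):8.1f} MiB/s")
```

With small chunks, Python's per-call overhead dominates the result, so the figures are only useful for relative comparisons on the same machine. An AES-CTR equivalent would need a third-party library and is left out here.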
Edited by Nick Mathewson