Split relay and link crypto across multiple CPU cores

Right now, Tor does nearly all of its work in one main thread. We have a basic "CPUWorker" implementation that we use for doing server-side onionskin crypto in a separate thread, but thanks to improvements made long ago, server-side onionskin crypto no longer dominates. If we could split the work of relay AES-CTR crypto and SSL crypto across multiple threads, that would go a long way toward letting high-performance servers saturate their connections. (Blutmagie has wanted this for some time.)
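
As a rough illustration of the idea (not a proposed design), here is a minimal sketch of sharding relay-cell crypto across pthreads. Everything in it is hypothetical: `demo_cell_t`, `toy_crypt_cell`, and `crypt_cells_parallel` are names invented for this example and are not part of Tor, and the XOR step is only a stand-in for the real AES-CTR operation. The one point it tries to show is that cells are assigned to workers by circuit ID, so all cells of a given circuit are handled by a single worker in their original order, which a stateful per-circuit counter mode would need.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAYLOAD_LEN 509  /* payload length used in this sketch */
#define MAX_WORKERS 16        /* arbitrary cap for this sketch */

/* Hypothetical cell type for this example; not Tor's cell_t. */
typedef struct {
  uint32_t circ_id;
  uint8_t payload[DEMO_PAYLOAD_LEN];
} demo_cell_t;

/* Per-thread view of the shared batch of cells. */
typedef struct {
  demo_cell_t *cells;
  size_t n_cells;
  unsigned worker_idx;
  unsigned n_workers;
} worker_arg_t;

/* Placeholder for the AES-CTR step. This is NOT real crypto: it XORs the
 * payload with bytes derived from the circuit ID. */
static void toy_crypt_cell(demo_cell_t *cell)
{
  uint8_t k = (uint8_t)(cell->circ_id * 31u + 7u);
  for (size_t i = 0; i < DEMO_PAYLOAD_LEN; ++i)
    cell->payload[i] ^= (uint8_t)(k + i);
}

/* Each worker walks the batch in order and handles only the cells whose
 * circuit ID maps to it, so per-circuit ordering is preserved and no two
 * workers ever touch the same cell. */
static void *worker_main(void *arg_)
{
  worker_arg_t *arg = arg_;
  for (size_t i = 0; i < arg->n_cells; ++i) {
    if (arg->cells[i].circ_id % arg->n_workers == arg->worker_idx)
      toy_crypt_cell(&arg->cells[i]);
  }
  return NULL;
}

/* Process a batch of cells on n_workers threads. Returns 0 on success, or
 * -1 if n_workers is out of range or a thread could not be created (in the
 * latter case the batch may be only partially processed). */
int crypt_cells_parallel(demo_cell_t *cells, size_t n_cells,
                         unsigned n_workers)
{
  pthread_t threads[MAX_WORKERS];
  worker_arg_t args[MAX_WORKERS];
  unsigned started = 0;
  int rv = 0;

  if (n_workers == 0 || n_workers > MAX_WORKERS)
    return -1;

  for (; started < n_workers; ++started) {
    args[started] = (worker_arg_t){ cells, n_cells, started, n_workers };
    if (pthread_create(&threads[started], NULL, worker_main,
                       &args[started]) != 0) {
      rv = -1;
      break;
    }
  }
  /* Join only the threads that were actually started. */
  for (unsigned i = 0; i < started; ++i)
    pthread_join(threads[i], NULL);
  return rv;
}
```

A caller would fill an array of `demo_cell_t` and invoke something like `crypt_cells_parallel(cells, n, 4)`. A real implementation would presumably keep long-lived workers fed by a queue rather than spawning threads per batch, and would have to deal with handing results back to the main thread; this sketch ignores both.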

Child Tickets: [[TicketQuery(parent=legacy/trac#1749 (moved))]]
