Rust implementation of Equi-X and dependencies
Summary
This issue covers a pure Rust re-implementation of the algorithms used in proposal 327 proof-of-work DoS mitigation for onion services, as an alternative to wrapping the existing C implementation.
What is the expected behavior?
Arti should be able to use the proof-of-work DoS mitigations too, and this should be as unobtrusive as possible for users, developers, and integrators.
The code generation feature in HashX should be supported but optional. When this feature is disabled, the remainder of the implementation should be safe Rust code.
Implementation Notes
These algorithms form a layered stack; see section 2.2
of proposal 327 for more, but I'll go over the important layers for this issue quickly:
Equi-X is an asymmetric client puzzle that forms the main external module within tor's hs_pow implementation. Equi-X is a solver for a particular modification of Equihash. There's a lot of text here, but the core of the algorithm is a simple hash table, and tevador's implementation weighs in at only about 600 lines. This layer would be the most straightforward to reimplement in Rust.
Under Equi-X is HashX, which is the hash function that's intended to be easy to compute on a CPU but somewhat harder for GPUs and much harder for ASICs. This layer has the most currently-undocumented complexity in it, which we'd also be documenting in the process of a Rust reimplementation.
There is currently no specification for HashX, we'd be doing the job of documenting it by writing an independent implementation. Nearly all the complexity in HashX is in mapping a pseudorandom number stream (from siphash) to a small "program" (consisting of register ALU operations and up to one taken branch). A new implementation would be an opportunity to condense as much of this as possible into a tabular description that could also serve as a specification.
There are some integration challenges both with this approach and with an approach that simply wraps the C library.
On the Rust side, HashX uses both siphash and blake2 and it would be nice to depend on an external crate (even one we're already using). On the blake2 side I think this is possible so long as we jettison a HashX feature we weren't using anyway (block mode). On the siphash side, HashX uses siphash to generate the pseudorandom number stream that drives program generation, but it's not quite siphash proper. The state initialization is done a little differently, directly from the output of blake2 (four random words) instead of the typical way it's initialized using two words. HashX also uses one internal round from siphash as part of the finalization stage where the register values from the "program" are converted into the output of the hash function. This is all to say we'd either need to fork a siphash crate and add some features or just include a whole copy inside our hashx crate.
On the C library side, there are several features that are customizable as compile-time variables. This means that there's no single ABI for the library, and none of those settings could be modified by typical application code if the library were to be actually packaged and distributed in any way. For a Rust wrapper to be practical, these settings would need to be hidden, and the (non-default) values needed by Equi-X would need to be locked in.
In a Rust re-implementation it would be natural to solve these same problems by writing one generic implementation without the need for compile time macros.
License
For the C implementation of proposal 327, we added an optional GPL build mode in order to straightforwardly build with tevador's LGPL'ed equix and hashx libraries.
My understanding is that in the above plan to construct a Rust crate implementing the same algorithm in a compatible way but with somewhat different approaches and construction, we wouldn't technically be creating a derivative work as far as the GPL is concerned. But of course I'm not a lawyer, and I'm also interested in respecting tevador's wishes here since this is their algorithm and their contribution really.