Commit bbcbf479 authored by Nick Mathewson 🌻

Fresh git repository for work on "arti"

Arti is a Rust Tor implementation.  It's a project I've been working
on for a few months now, in weekends and in spare time.  It doesn't
speak the tor protocol yet, and it doesn't connect to the network at

It needs much more documentation and testing, but I'm just about
ready to show it to others.  See the for a description of
what is there and what isn't.
The Tor Project is committed to fostering an inclusive community
where people feel safe to engage, share their points of view, and
participate. For the latest version of our Code of Conduct, please
Though this is not an official Tor software project, it is
nonetheless governed by the Code of Conduct above.
members = [
# arti: A Rust Tor Implementation
(I'm choosing this name for temporary purposes, but with a nod
to the fact that temporary names sometimes stick. Its name is
a reference to Tor's semi-acronymic origins as "the onion
router", and to the Latin dative singular for "art". It is
also short for "artifact". It has nothing to do with the
"rt" issue tracker or the "rt" media outlet.)
## What's here and what isn't.
So far the code has untested or under-tested implementations of:
* the ntor protocol
* the relay crypto algorithm
* parsing and encoding all the cell types (except for hs-related
* parsing and validating ed25519 certificates
* parsing and validating router descriptors
Before I share it, I think it needs more work on:
* parsing the other kinds of network documents
* link authentication
* a sensible api for cell types
* a toy client that builds a circuit through the network
and uses it to exit.
There is no support yet for:
* Actually connecting to the network in a reasonable way
* choosing paths through the network in a reasonable way
* doing anything with the network in a reasonable way
* actually building circuits
* creating network documents
* v2 onion service anything
* v3 onion service anything
* the directory protocol
* lots of optimizations that Tor does
* working with no_std
I do not plan to implement full versions of any of those before I
share this code for more comment, though I might do a little. Who
## Caveat haxxor: what to watch out for
This is a work in progress. It doesn't "do Tor" yet, and what parts
of Tor it does "do" it probably doesn't do securely.
I'm learning Rust here as I go along. There are probably aspects of
the language or its ecosystem that I'm getting wrong.
Almost nothing about this code should be taken as "final" -- I
expect that if anybody wants to make this work for real purposes,
we'll need to refactor and move around a whole bunch of code, add a
bunch of APIs, split crates, merge crates, and so on.
There are some places where I am deviating from the existing
protocol under the assumption that certain proposals will be
accepted. I'll try to document those.
This code does not try to be indistinguishable from the current Tor
## Structure
To try to keep dependency relationships reasonable, and to follow
what I imagine to be best practice, I'm splitting this
implementation into a bunch of little crates within a workspace.
Crates that are tor-specific start with "tor-"; others don't.
I expect that the list of crates will have to be reorganized quite a
lot by the time we're done.
The current crates are:
: A utility for generating enumerations with helpful trait
: Wrappers and re-imports of cryptographic code that Tor needs in
various ways. Other crates should use this crate, and not actually
use any crypto crates directly.
: Byte-by-byte encoder and decoder functions and traits. We use
this to safely parse cells, certs, and other byte-oriented things.
: Decoding and checking signatures on Tor's ed25519 certificates.
: Minimal implementation of the Tor subprotocol versioning system.
Less complete than the one in Tor's current src/rust, but more
: Parsing for Tor's network documents. Currently only handles
routerdescs. Underdocumented and too big. Needs splitting.
: Functions to work with cell types, handshakes, and other aspects
of the Tor protocol. Underdocumented, too big, needs
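
As a rough illustration of the byte-by-byte encoding and decoding mentioned in the crate list above, here is a minimal sketch of a bounds-checked reader. Everything in it is an assumption made for the example: the `Reader`, `Error`, `encode_record`, and `decode_record` names are hypothetical, and the record layout (one command byte, a big-endian `u16` length, then the body) is invented for illustration and is not a Tor cell format.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum Error {
    /// The input ended before the object was complete.
    Truncated,
}

/// A cursor over a byte slice that never reads out of bounds.
struct Reader<'a> {
    buf: &'a [u8],
    pos: usize,
}

impl<'a> Reader<'a> {
    fn new(buf: &'a [u8]) -> Self {
        Reader { buf, pos: 0 }
    }

    /// Return the next `n` bytes, or `Error::Truncated` if fewer remain.
    fn take(&mut self, n: usize) -> Result<&'a [u8], Error> {
        let buf: &'a [u8] = self.buf;
        let end = self.pos.checked_add(n).ok_or(Error::Truncated)?;
        let bytes = buf.get(self.pos..end).ok_or(Error::Truncated)?;
        self.pos = end;
        Ok(bytes)
    }

    fn take_u8(&mut self) -> Result<u8, Error> {
        Ok(self.take(1)?[0])
    }

    /// Read a big-endian (network order) u16.
    fn take_u16(&mut self) -> Result<u16, Error> {
        let b = self.take(2)?;
        Ok(u16::from_be_bytes([b[0], b[1]]))
    }
}

/// Encode the made-up record: command, u16 length, body.
fn encode_record(cmd: u8, body: &[u8]) -> Vec<u8> {
    assert!(body.len() <= u16::MAX as usize, "body too long for a u16 length");
    let mut out = Vec::with_capacity(3 + body.len());
    out.push(cmd);
    out.extend_from_slice(&(body.len() as u16).to_be_bytes());
    out.extend_from_slice(body);
    out
}

/// Decode the made-up record, failing cleanly on short input.
fn decode_record(bytes: &[u8]) -> Result<(u8, Vec<u8>), Error> {
    let mut r = Reader::new(bytes);
    let cmd = r.take_u8()?;
    let len = r.take_u16()? as usize;
    let body = r.take(len)?.to_vec();
    Ok((cmd, body))
}

fn main() {
    let encoded = encode_record(7, b"hello");
    assert_eq!(decode_record(&encoded), Ok((7, b"hello".to_vec())));
    // Cutting the input short yields an error instead of a panic.
    assert_eq!(decode_record(&encoded[..4]), Err(Error::Truncated));
}
```

The point of funnelling every read through `take` is that length checks live in one place, so higher-level parsers cannot index past the end of the input by accident.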
- Not done
. Partially done
o complete
X Won't do.
- Decisions
- Which protocols to support?
- How far up the stack to go?
- How speculative to get?
- Specs
- Test vectors
- Add test vectors for ntor
- Add test vectors for relay crypto
- Add test vectors for hs-ntor
- Add test vectors for hs-relay crypto
- Add test vectors for TAP
- Clarity
- END cell format
- Directory consistency
- "-----BEGIN" should not be a valid keyword
- Whitespace at start of line, y/n? Mixed whitespace, y/n? CR, y/n?
- UTF-8.
- Primitive crypto
- Wrap x25519 in a trait
- Use signature trait for ed25519.
- Add RSA-pkcs1 signature support
- Add RSA-pem encode/decode support
- RSA-oaep, if supported.
- test vectors for sha1
- test vectors for sha2
- test vectors for sha3/shake
- RSA test vectors as needed
- Higher level crypto
- Test vectors for hmac
- Test vectors for tap-kdf
- Test vectors for hkdf
- Test vectors for other kdfs
- Main Protocol functionality
o encode and decode regular cell types.
. handshakes
o ntor
. relay crypto
o implement
- tests
- Internals:
- Consider using a safer thing instead of current bytereader. Like the
one rustls has? Like "untrusted"?
- Consider using a writer trait that's agnostic about whether it's
writing into an expanding Vec or a fixed slice.
- Good API for "make this cell and encrypt it and write it"
- Good API for "take a cell out of a Reader" and stuff that comes after.
- Async variant of that API?
- Tests
- For all cell types
- for all relay cell types
- For all handshakes
- State for multiplexing circuits on a connection
- State for sending sendme cells, both versions.
- V1 sendmes
- State for managing streams
- Initial protocol handshake for client/relay authentication
- Initial protocol handshake for relay/relay authentication
- Directory parsing stuff
. Parsing backend
o Get tokens
- Match tor's actual token behavior?
o Remove extraneous hoohaw.
o Get a "parse into a vector of maybe-tokens" thing.
o Get a "validate that every must-token is there" thing.
- Macro for making a Keyword type.
- maybe, macro for making a sectionrules
- Parse descriptors
- Parse into a reasonable routerdesc object.
- Parse a pile of them.
- Check ed signatures on router descriptors
- Check rsa signatures on descriptors
- Check additional invariants?
- Parse consensus directories, both variants.
- Parse microdescs
- Apply consensus diffs
- Directory encoding stuff
- Encoding/signing backend
- Encode descriptors
- Additional small functionality, protocol level
- Relay padding
- HS functionality
- encode and decode hs cell types
- State as needed for hs lookup
- hs cell types
- hs directory stuff
- HSv3 directory objects, encode
- HSv3 directory objects, decode
- crypto variants
- hsv3 variant of relay crypto
- hsv3 variant of ntor
- tests and vectors for the above.
X Not currently planning to do:
X Link protocol v1 (multicert)
X Link protocol v2 (renegotiation)
- Unsure if planning to do:
- Link protocol v3 (short circuit IDs, PK comparison)
- Linkauth 1 (RSA-SHA256-TLSSecret)
- Parsing votes
- HSv2 directory support
- Supporting relays without ed25519 keys.
- Compute consensus diffs
- Waiting on RSA-OAEP:
- Handshakes
- HSv2 handshakes
rust itself:
* specialization for u8.
* generic arrays via associated consts or whatever it's called.
* when an enum is c-like, implement try_into for numbers. (but see num_enum and many similar crates.)
* stabilize trailing_ones()
* let me go from a Montgomery point to a compressed Edwards point.
* let me use the right rand trait kthx
* defer signature verification without prehash mode. A signature
future would be great.
* multiline support.
* much cheaper hash function
* Support for RSA key without OID
* should use multiline support or at least not copy when decoding
base64 multiline.
* should be stricter?
* oaep
* if not oaep, raw?
* get raw signed data to allow multiple signed formats.
* static curve25519
* key agreement trait
ed25519 trait:
* batch support.
* implement digest trait.
* more efficient bitwise operations.
name = "caret"
version = "0.0.0"
authors = ["Nick Mathewson <>"]
edition = "2018"
license = "MIT OR Apache-2.0"
publish = false
//! Crikey! Another Rust Enum Tool?
//! This set of macros combines ideas from several other crates that
//! build C-like enums together with appropriate conversion functions to
//! convert to and from associated integers and associated constants.
//! It's inspired by features from enum_repr, num_enum, primitive_enum,
//! enum_primitive, enum_from_str, enum_str, enum-utils-from-str, and
//! numeric-enum-macro. I'm not sure it will be useful to anybody but
//! me.
//! To use it, write something like:
//! ```
//! use caret::caret_enum;
//! caret_enum! {
//!     pub enum EnumType as u16 {
//!         Variant1,
//!         Variant2,
//!         Variant3,
//!     }
//! }
//! ```
//! When you define an enum using `caret_enum!`, it automatically gains
//! conversion methods:
//! * to_int(),
//! * to_str(),
//! * from_int(),
//! * from_string().
//! The macro will also implement several traits for you.
//! * From<EnumType> for u16,
//! * From<EnumType> for &str,
//! * Display for EnumType,
//! * FromStr for EnumType,
//! * TryFrom<u16> for EnumType.
//! Finally, EnumType will have derived implementations for Eq,
//! PartialEq, Copy, and Clone, as you'd expect from a fancy alias for
//! u16.
//! If you specify some other integer type instead of `u16`, that type
//! will be used as a representation instead.
//! You can specify specific values for the enumerated elements:
//! ```
//! # use caret::*;
//! # caret_enum!{ pub enum Example as u8 {
//! Variant1 = 1,
//! Variant2 = 5,
//! Variant3 = 9,
//! # } }
//! ```
//! You can also override the string representation for enumerated elements:
//! ```
//! # use caret::*;
//! # caret_enum!{ pub enum Example as u8 {
//! Variant1 ("first"),
//! Variant2 ("second"),
//! Variant3 ("third") = 9,
//! # } }
//! ```
macro_rules! caret_enum {
$v:vis enum $name:ident as $numtype:ident {
$id:ident $( ( $as_str:literal ) )? $( = $num:literal )?
$( , )?
} => {
#[repr( $numtype )]
$v enum $name {
$( $( #[$item_meta] )* $id $( = $num )? , )*
impl $name {
pub fn to_int(self) -> $numtype {
match self {
$( $name::$id => $name::$id as $numtype , )*
pub fn to_str(self) -> &'static str {
match self {
$( $name::$id => $crate::caret_enum!(@impl string_for $id $($as_str)?) , )*
pub fn from_int(val: $numtype) -> Option<Self> {
$( const $id : $numtype = $name::$id as $numtype; )*
match val {
$( $id => Some($name::$id) , )*
_ => None
fn from_string(val: &str) -> Option<Self> {
match val {
$( $crate::caret_enum!(@impl string_for $id $($as_str)?) => Some($name::$id) , )*
_ => None
impl std::convert::From<$name> for $numtype {
fn from(val: $name) -> $numtype {
impl std::convert::From<$name> for &'static str {
fn from(val: $name) -> &'static str {
impl std::fmt::Display for $name {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
write!(f, "{}", self.to_str())
impl std::convert::TryFrom<$numtype> for $name {
type Error = &'static str; // this is not the best error type XXXX
fn try_from(val: $numtype) -> std::result::Result<Self, Self::Error> {
$name::from_int(val).ok_or("Unrecognized value")
impl std::str::FromStr for $name {
type Err = &'static str; // this is not the best error type XXXX
fn from_str(s: &str) -> std::result::Result<Self, Self::Err> {
$name::from_string(s).ok_or("Unrecognized value")
// Internal helpers
[ @impl string_for $id:ident $str:literal ] => ( $str );
[ @impl string_for $id:ident ] => ( stringify!($id) );
use caret::caret_enum;
use std::convert::TryInto;

caret_enum! {
    // Debug is required by assert_eq!; this assumes the macro forwards
    // outer attributes to the generated enum.
    #[derive(Debug)]
    enum Demo as u16 {
        A = 8,
        B ("TheLetterB") = 10,
        // No explicit value: C follows B, so it is 11.
        C,
        Dee = 999,
    }
}

#[test]
fn test_int_ops() {
    assert_eq!(Demo::A.to_int(), 8u16);
    assert_eq!(Demo::B.to_int(), 10);
    assert_eq!(Demo::C.to_int(), 11);
    assert_eq!(Demo::Dee.to_int(), 999);

    let t: u16 = Demo::A.into();
    assert_eq!(t, 8);

    let t: Demo = 999.try_into().unwrap();
    assert_eq!(t, Demo::Dee);

    assert_eq!(Demo::from_int(11), Some(Demo::C));
    assert_eq!(Demo::from_int(2), None);

    // 6 is not a value of any variant, so the conversion must fail.
    let t: Result<Demo, _> = 6.try_into();
    assert!(t.is_err());
}

#[test]
fn test_str_ops() {
    assert_eq!(Demo::A.to_str(), "A");
    assert_eq!(Demo::B.to_str(), "TheLetterB");

    let t: &str = Demo::C.into();
    assert_eq!(t, "C");

    assert_eq!(format!("Hello {}", Demo::Dee), "Hello Dee");

    let t: Demo = "TheLetterB".parse().unwrap();
    assert_eq!(t, Demo::B);

    let t: Result<Demo, _> = "XYZ".parse();
    assert!(t.is_err());

    let t: Demo = "Dee".parse().unwrap();
    assert_eq!(t, Demo::Dee);

    let t: Result<Demo, _> = "Foo".parse();
    assert!(t.is_err());
}
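
For comparison with the tests above, the conversions that `caret_enum!` is documented to generate can be written out by hand for a small enum. This is only a sketch under assumptions: it uses the standard library alone, trims the `Demo` enum to three variants, and is not the macro's literal expansion. It does borrow the macro's trick of declaring constants so that `from_int` can match on discriminants.

```rust
use std::convert::TryFrom;
use std::fmt;
use std::str::FromStr;

#[derive(Debug, PartialEq, Eq, Copy, Clone)]
#[repr(u16)]
enum Demo {
    A = 8,
    B = 10,
    C, // implicit discriminant: previous value plus one, so 11
}

impl Demo {
    fn to_int(self) -> u16 {
        self as u16
    }

    fn to_str(self) -> &'static str {
        match self {
            Demo::A => "A",
            Demo::B => "TheLetterB", // overridden string form
            Demo::C => "C",
        }
    }

    fn from_int(val: u16) -> Option<Self> {
        // Constants let us match on discriminants without repeating numbers.
        const A: u16 = Demo::A as u16;
        const B: u16 = Demo::B as u16;
        const C: u16 = Demo::C as u16;
        match val {
            A => Some(Demo::A),
            B => Some(Demo::B),
            C => Some(Demo::C),
            _ => None,
        }
    }

    fn from_string(val: &str) -> Option<Self> {
        match val {
            "A" => Some(Demo::A),
            "TheLetterB" => Some(Demo::B),
            "C" => Some(Demo::C),
            _ => None,
        }
    }
}

impl From<Demo> for u16 {
    fn from(val: Demo) -> u16 {
        val.to_int()
    }
}

impl fmt::Display for Demo {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.to_str())
    }
}

impl TryFrom<u16> for Demo {
    type Error = &'static str;
    fn try_from(val: u16) -> Result<Self, Self::Error> {
        Demo::from_int(val).ok_or("Unrecognized value")
    }
}

impl FromStr for Demo {
    type Err = &'static str;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        Demo::from_string(s).ok_or("Unrecognized value")
    }
}

fn main() {
    let n: u16 = Demo::A.into();
    assert_eq!(n, 8);
    assert_eq!(Demo::try_from(11u16), Ok(Demo::C));
    assert_eq!("TheLetterB".parse::<Demo>(), Ok(Demo::B));
    println!("{}", Demo::B);
}
```

Writing this out once makes it clear why a macro is attractive: each new variant would otherwise need edits in four separate `match` expressions.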
be agile in use of other crates, but kinda flexible in main coding.
guess those mean the same.
instead, "make sure our dependencies on particular implementations of things
are not hard to replace", and "assume we will refactor or rewrite everything
we do."
what architecture am I aiming for? I'm hoping for a layer that is
no-network, that doesn't even have a "network" as a thing in it. That means
it's going to have to have lots of encoding/decoding stuff, and maybe the
crypto, and possibly it can handle a little state, but big state will be hard
Hm. This could actually help though. There are circuits and streams and
channels, and none of them need to actually "connect" until a different
layer. That would kinda rock, even.
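
One way to picture the no-network layer described in these notes is a state machine that only ever sees byte slices, leaving sockets to some other layer. The sketch below is hypothetical throughout: `Framer`, `FRAME_LEN`, and the method names are invented for illustration, and the 4-byte frame size is arbitrary rather than anything from the Tor protocol.

```rust
/// Arbitrary frame size for this sketch (not a Tor cell size).
const FRAME_LEN: usize = 4;

/// Protocol state with no sockets in it: bytes in, frames out.
#[derive(Default)]
struct Framer {
    /// Bytes received but not yet forming a whole frame.
    inbuf: Vec<u8>,
    /// Bytes queued for whoever owns the real connection.
    outbuf: Vec<u8>,
}

impl Framer {
    /// Feed bytes that another layer read from somewhere.
    /// Returns every complete frame now available.
    fn handle_input(&mut self, bytes: &[u8]) -> Vec<Vec<u8>> {
        self.inbuf.extend_from_slice(bytes);
        let mut frames = Vec::new();
        while self.inbuf.len() >= FRAME_LEN {
            // split_off keeps the first FRAME_LEN bytes in inbuf and
            // returns the remainder; swap so the remainder stays buffered.
            let rest = self.inbuf.split_off(FRAME_LEN);
            frames.push(std::mem::replace(&mut self.inbuf, rest));
        }
        frames
    }

    /// Queue a frame for sending; nothing is written to a network here.
    fn queue_frame(&mut self, frame: [u8; FRAME_LEN]) {
        self.outbuf.extend_from_slice(&frame);
    }

    /// Hand pending output to the layer that actually does I/O.
    fn take_output(&mut self) -> Vec<u8> {
        std::mem::take(&mut self.outbuf)
    }
}

fn main() {
    let mut f = Framer::default();
    // Three bytes are not enough for a frame yet.
    assert!(f.handle_input(&[1, 2, 3]).is_empty());
    // Six more bytes complete two frames and leave one byte buffered.
    let frames = f.handle_input(&[4, 5, 6, 7, 8, 9]);
    assert_eq!(frames, vec![vec![1, 2, 3, 4], vec![5, 6, 7, 8]]);

    f.queue_frame([0xaa; FRAME_LEN]);
    assert_eq!(f.take_output(), vec![0xaa; FRAME_LEN]);
}
```

Because nothing here blocks or touches a socket, the same logic can be driven by a test, a blocking loop, or an async runtime without changes.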
name = "tor-bytes"
version = "0.0.0"
authors = ["Nick Mathewson <>"]
edition = "2018"
license = "MIT OR Apache-2.0"
publish = false
tor-llcrypto = { path="../tor-llcrypto" }
arrayref = "*"
# XXXX why did I have to downgrade?
generic-array = "0.12"
crypto-mac = "*"
thiserror = "*"
hex-literal = "*"
use thiserror::Error;
/// Error type for decoding Tor objects from bytes.
#[derive(Error, Debug, PartialEq, Eq)]
pub enum Error {
#[error("object truncated (or not fully present)")]
#[error("extra bytes at end of object")]
#[error("bad object: {0}")]
BadMessage(&'static str),
#[error("internal programming error")]
//! Implementations of Writeable and Readable for several items that
//! we use in Tor.
use super::*;
use generic_array::GenericArray;
// ----------------------------------------------------------------------
impl Writer for Vec<u8> {
fn write_all(&mut self, bytes: &[u8]) {
fn write_zeros(&mut self, n: usize) {
let new_len = self.len() + n;
self.resize(new_len, 0);
// ----------------------------------------------------------------------
impl<'a> Writeable for [u8] {
fn write_onto<B: Writer + ?Sized>(&self, b: &mut B) {
impl Writeable for Vec<u8> {
fn write_onto<B: Writer + ?Sized>(&self, b: &mut B) {
/* There is no specialization in Rust yet, or we would make an implementation
for this.
impl<N> Readable for GenericArray<u8, N>
N: generic_array::ArrayLength<u8>,
fn take_from(b: &mut Reader) -> Result<Self> {
// safety -- "take" returns the requested bytes or error.
impl<N> Writeable for GenericArray<u8, N>
N: generic_array::ArrayLength<u8>,
fn write_onto<B: Writer + ?Sized>(&self, b: &mut B) {
impl<T, N> Readable for GenericArray<T, N>
T: Readable + Clone,
N: generic_array::ArrayLength<T>,
fn take_from(b: &mut Reader<'_>) -> Result<Self> {
let mut v: Vec<T> = Vec::new();
for _ in 0..N