|
|
[[PageOutline]]
|
|
|
|
|
|
= Tor Relay Security and Best Practices =
|
|
|
|
|
|
This document aims to motivate and describe some best practices for Tor Relay security, grounded in a realistic threat model.
|
|
|
|
|
|
This document is written only from the point of view of protecting your Tor Relay from that threat model. It assumes that Tor is the most important process on the machine, and that protecting other things from Tor is therefore out of scope. For general information on securing your machine against attacks through the Tor daemon itself, see the [wiki:doc/OperationalSecurity Operational Security] page.
|
|
|
|
|
|
For information on running an exit relay, see [https://blog.torproject.org/blog/tips-running-exit-node-minimal-harassment Tips for Running an Exit Node with Minimal Harassment] as well as the set of [wiki:doc/TorAbuseTemplates Tor Abuse Templates].
|
|
|
|
|
|
As of Tor 0.2.7, relays use ed25519 identities, with an optional feature to generate the ed25519 master identity secret key offline and keep it there. The relay then uses a temporary signing key with a limited lifetime, which requires periodic renewal; this ensures the relay keeps the same identity regardless of what happens to the temporary signing keys. [https://trac.torproject.org/projects/tor/wiki/doc/TorRelaySecurity/OfflineKeys Read this guide to learn more about offline relay identity keys].
|
|
|
|
|
|
= Adversary Goals and Threat Model =
|
|
|
|
|
|
There is a significant difference between adversaries that can see inside of router-to-router TLS and those that cannot. I believe this capability distinction governs the adversary's goals in terms of compromising relays, as opposed to merely observing them externally.
|
|
|
|
|
|
Adversaries that can unwrap router TLS can perform every attack that an actual node can perform, at any location between the user and the node, and/or between the node and other nodes.
|
|
|
|
|
|
In particular, adversaries that can see inside router TLS can perform [https://lists.torproject.org/pipermail/tor-dev/2012-March/003361.html tagging attacks] as well as perform circuit-specific active and passive timing analysis.
|
|
|
|
|
|
These attacks can be quite severe. An adversary that is able to obtain Guard identity keys is free to perform a tagging attack anywhere on the Internet. In other words, if the adversary is interested in monitoring a particular user, it need only obtain the identity keys for that user's 3 guard nodes. From that point on, it can transparently monitor everything that user does, using tagging to bias the user's paths so that they connect only to surveilled exit nodes whose identity keys have also been compromised.
|
|
|
|
|
|
= Attack Vectors =
|
|
|
|
|
|
There are two high-level vectors towards seeing inside node-to-node TLS (which uses ephemeral keys that are rotated daily and authenticated via the node's identity key). Both high-level vectors therefore revolve around node identity key theft.
|
|
|
|
|
|
== Attack Vector !#1: One-Time Key Theft ==
|
|
|
|
|
|
The one-time adversary is interested in performing a single grab of keys and then operating transparently upstream afterwards. This adversary will take the form of a coercive request at a datacenter/ISP to extract node identity key material; from then on, it operates externally as a transparent upstream MITM, creating fake ephemeral TLS keys authenticated with the stolen identity key. Tor nodes that encounter this adversary will likely see it in the form of unexplained reboots or mysterious downtime, which are inevitable in the lifespan of any Tor node.
|
|
|
|
|
|
== Attack Vector !#2: Persistent Key Theft ==
|
|
|
|
|
|
If one-time methods fail or are beyond reach, the adversary has to resort to persistent machine compromise to retain access to node key material.
|
|
|
|
|
|
The persistent attacker can use the same vector as !#1, or perhaps an external vector such as daemon compromise, but must then also plant a backdoor that would do something like trawl through the RAM of the machine, sniff out the keys (perhaps even grabbing the ephemeral TLS keys directly), and transmit them offsite for collection.
|
|
|
|
|
|
This is a significantly more expensive position for the adversary to maintain, because the backdoor may be noticed during a thorough forensic investigation of a perhaps unrelated incident, and it may inadvertently trigger firewall warnings or other common least-privilege defense alarms.
|
|
|
|
|
|
Unfortunately, it is also a more expensive attack to defend against, because defense requires extensive auditing and assurance mechanisms on the part of the relay operator.
|
|
|
|
|
|
= Defenses =
|
|
|
|
|
|
The above makes clear that, at minimum, relays should protect against one-time key compromise. Some further thought shows that it is possible to make the persistent adversary's task harder as well, albeit with significantly more effort.
|
|
|
|
|
|
Let's deal with defending against each vector in turn.
|
|
|
|
|
|
== Vector !#1: Deploy Ephemeral Identity Keys ==
|
|
|
|
|
|
The simplest way to defend against an adversary who attempts to extract relay keys through a reboot is to take advantage of the fact that even node identity keys can be ephemeral, and do not need to persist long term (certainly not past a reboot). This can be achieved with a boot script that wipes your keys (they live in /var/lib/tor/keys) at startup, or by [http://www.cyberciti.biz/faq/howto-create-linux-ram-disk-filesystem/ using a ramdisk].
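As a sketch of the boot-script approach (the key directory path is the common Debian layout; the function name and the early-boot hook that calls it are assumptions, not a shipped Tor mechanism):

{{{
# Sketch: wipe relay identity keys at boot, before tor starts.
# Call from an early boot script; default path is the usual Debian layout.
wipe_tor_keys() {
    keydir="${1:-/var/lib/tor/keys}"
    # Remove identity and onion key material; leave the rest of the
    # DataDirectory (state, descriptors) alone.
    rm -f "$keydir"/secret_id_key "$keydir"/secret_onion_key "$keydir"/secret_onion_key_ntor
    echo "wiped key material in $keydir"
}
# e.g. wipe_tor_keys    # or: wipe_tor_keys /path/to/DataDirectory/keys
}}}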
|
|
|
|
|
|
Periodically (on the order of every 12 to 18 months), you should completely wipe your node identity keys and start fresh, even in the absence of suspicious reboots. The cost to the network of tossing node keys away is small: it only takes about 2 weeks for your node to regain the Guard flag, for example.
|
|
|
|
|
|
Additionally, ssh server key theft is another one-time vector that can be used to quickly bootstrap into node key theft. For this reason, node admins should always use ssh key authentication for tor node administration accounts, since it [http://www.gremwell.com/ssh-mitm-public-key-authentication prevents ssh server key theft from allowing continuous server compromise].
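In sshd_config terms, a minimal fragment for key-only logins might look like the following (adjust to your distribution's defaults):

{{{
# /etc/ssh/sshd_config (fragment): key-only logins, so a stolen server
# host key cannot be leveraged via MITM to capture reusable credentials.
PasswordAuthentication no
ChallengeResponseAuthentication no
PubkeyAuthentication yes
PermitRootLogin prohibit-password
}}}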
|
|
|
|
|
|
== Vector !#1: Offline Master Keys ==
|
|
|
|
|
|
Even stronger than ephemeral identity keys are [https://trac.torproject.org/projects/tor/wiki/doc/TorRelaySecurity/OfflineKeys offline master keys] that are never exposed to the relay at all. This makes the regular reputation resets caused by key resets unnecessary, and provides stronger assurance for the identity key than keys that are deleted on reboot or every 12 to 18 months.
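A sketch of the key-generation half of that workflow (assumes tor 0.2.7 or later, run on an offline machine; the directory name is arbitrary, and `--no-passphrase` is used here only to keep the sketch non-interactive — on a real offline machine you would likely want a passphrase):

{{{
# Sketch: generate an offline ed25519 master key (tor >= 0.2.7).
# Run this on an offline machine, not on the relay itself.
KEYDIR="${KEYDIR:-./offline-keys}"       # arbitrary scratch directory
mkdir -p "$KEYDIR"
if command -v tor >/dev/null 2>&1; then
    # Creates ed25519_master_id_secret_key plus a medium-term signing
    # key/cert under $KEYDIR/keys; only the signing material is copied
    # to the relay's DataDirectory.
    tor --keygen --DataDirectory "$KEYDIR" --no-passphrase
else
    echo "tor not installed here; run this on your offline key machine" >&2
fi
}}}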
|
|
|
|
|
|
== Vector !#2: Isolation Hardening and Readonly Runtime ==
|
|
|
|
|
|
Once one-time key theft has been dealt with, you can begin to consider how to deal with the persistent threat.
|
|
|
|
|
|
The effort required to defend against this adversary is considerable, and it is not expected that all operators will devote the effort to do so.
|
|
|
|
|
|
To limit scope, we are not going to deal with the daemon compromise vector; for that, see your [wiki:doc/OperationalSecurity Operating System's least-privilege mechanisms] (such as SELinux, AppArmor, Grsec RBAC, Seatbelt, etc). Instead, we will deal with how you can attempt to protect your identity keys once an adversary already has root access.
|
|
|
|
|
|
=== Disabling the ptrace syscall ===
|
|
|
|
|
|
If you are serious about defending against this adversary, the first thing you will want to do is disable access to the 'ptrace' system call from userland, which otherwise allows easy Tor key theft using debugging tools such as gdb. Note that all currently deployed mechanisms to do this still allow root users to use ptrace on arbitrary processes. In order to disable ptrace for root users as well, you need to load a [https://gist.github.com/1216637 kernel module that deletes the ptrace call from the syscall table].
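On kernels built with the Yama LSM there is also a lighter-weight option: setting `kernel.yama.ptrace_scope=3` disables ptrace attach for all users, including root, and cannot be lowered again without a reboot. A sketch (the fragment filename is arbitrary, and the target directory is an argument so the file can be previewed outside /etc):

{{{
# Sketch: disable ptrace via the Yama LSM (assumes CONFIG_SECURITY_YAMA).
# Pass /etc/sysctl.d as the argument on a real system.
SYSCTL_DIR="${1:-.}"
cat > "$SYSCTL_DIR/10-yama-ptrace.conf" <<'EOF'
# 3 = no ptrace attach at all, even for root; sticky until reboot
kernel.yama.ptrace_scope = 3
EOF
# Apply immediately if possible (needs root and a Yama-enabled kernel).
command -v sysctl >/dev/null 2>&1 && sysctl -p "$SYSCTL_DIR/10-yama-ptrace.conf" || true
}}}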
|
|
|
|
|
|
Once access to the ptrace system call is removed, you need to disable module loading to prevent it from being restored. On Linux, this is accomplished via 'sysctl kernel.modules_disabled=1'. You should perform this operation as early in the boot process as possible. One technique that works on Redhat-based systems is to place a shell script in /etc/rc.modules that loads the modules you need for operation, inserts the ptrace-removal module, and then issues the sysctl to disable further module loading. Redhat derivatives launch /etc/rc.modules first thing at the top of /etc/rc.sysinit.
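A sketch of such an /etc/rc.modules (module names are illustrative placeholders — substitute whatever your hardware and your ptrace-removal module actually require; the script writes to an argument path so it can be previewed without touching /etc):

{{{
# Sketch: generate an example /etc/rc.modules.
# Pass /etc/rc.modules as the argument on a real system.
TARGET="${1:-./rc.modules}"
cat > "$TARGET" <<'EOF'
#!/bin/sh
# Load everything the machine needs *before* locking out module loading.
modprobe e1000e                              # example NIC driver
modprobe dm_crypt                            # for the encrypted loopbacks
insmod /lib/modules/misc/remove_ptrace.ko    # hypothetical ptrace-removal module
# From here on, no further modules can be loaded until reboot.
sysctl -w kernel.modules_disabled=1
EOF
chmod +x "$TARGET"
}}}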
|
|
|
|
|
|
=== Ensuring Runtime Integrity ===
|
|
|
|
|
|
After that comes ensuring runtime integrity. There are several ways to achieve this, but most are easily subverted by an attacker with direct access to the hardware. The most robust approach seems to be to create a small encrypted loopback filesystem that contains all of the libraries required to run the 'tor' process, as well as all of the requisite configuration files. This wiki page has several scripts attached to aid in collecting these files.
|
|
|
|
|
|
The root filesystem itself doesn't need to be more than ~25M in size, but you will also need an auxiliary var loopback of a hundred megs or so. You should only have to authenticate and update the root filesystem, not the var filesystem, but both should be encrypted, since node keys are stored in var.
|
|
|
|
|
|
Here are the commands for creating the root loopback filesystem:
|
|
|
|
|
|
{{{
dd if=/dev/urandom of=./tor-root.img bs=1k count=25k
losetup /dev/loop1 ./tor-root.img
cryptsetup luksFormat /dev/loop1
cryptsetup luksOpen /dev/loop1 tor-root
mkfs.ext4 /dev/mapper/tor-root
}}}
|
|
|
|
|
|
When you use this loopback, you will mount it readonly, mount the var volume inside of it, and mount a ramdisk for your keys inside of that. For now, leave it readwrite for setup.
|
|
|
|
|
|
{{{
dd if=/dev/urandom of=./tor-var.img bs=1k count=200k
losetup /dev/loop2 ./tor-var.img
cryptsetup luksFormat /dev/loop2
cryptsetup luksOpen /dev/loop2 tor-var
mkfs.ext4 /dev/mapper/tor-var

mkdir /mnt/tor-root
mount /dev/mapper/tor-root /mnt/tor-root
mkdir /mnt/tor-root/var
mount /dev/mapper/tor-var /mnt/tor-root/var
}}}
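Once setup is finished, the production mount sequence might look like the sketch below (readonly root, ramdisk for the keys; the tmpfs size and the DRY_RUN guard are assumptions for illustration — DRY_RUN defaults to printing the commands so the sketch can be previewed without root):

{{{
# Sketch: production mount sequence (run as root with DRY_RUN=0).
# DRY_RUN defaults to 1, which only prints each command.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

run mount -o ro /dev/mapper/tor-root /mnt/tor-root
run mount /dev/mapper/tor-var /mnt/tor-root/var
# Keys live on a ramdisk so they never touch persistent storage.
run mount -t tmpfs -o size=4m,mode=0700 tmpfs /mnt/tor-root/var/lib/tor/keys
}}}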
|
|
|
|
|
|
=== Volume Setup and Authentication ===
|
|
|
|
|
|
Once you've got your volumes set up, you can run the scripts attached to this wiki page to copy your known-good Tor runtime into the volume.
|
|
|
|
|
|
{{{
XXX: demo script use
}}}
|
|
|
|
|
|
The attached scripts have been tested to work on RHEL/CentOS and Ubuntu systems, and may work on Fedora and Debian systems as well.
|
|
|
|
|
|
Don't forget to periodically update the libraries stored on your loopback root using a trusted offsite source, as they won't receive security updates from your distribution. You also want to avoid static tor binaries: they suffer from the same update problem, and additionally do not receive the benefit of per-library [https://en.wikipedia.org/wiki/Address_space_layout_randomization ASLR].
|
|
|
|
|
|
|
|
|
=== Identity Key Management ===
|
|
|
|
|
|
Once you start your tor process(es), you will want to copy your identity key offsite and then remove it. Tor does not need it to remain on disk after startup, and removing it ensures that an attacker must deploy a kernel exploit to obtain it from memory. While you should not re-use the identity key after unexplained reboots, you may want to retain a copy for planned reboots and tor maintenance.
|
|
|
|
|
|
{{{
scp /mnt/tor-root/var/lib/tor/keys/secret_id_key offsite_backup:/mnt/usb/tor_key
rm /mnt/tor-root/var/lib/tor/keys/secret_id_key
}}}
|
|
|
|
|
|
Upon suspicious reboots, you can verify the integrity of your tor image by simply calculating its sha1sum (perhaps copying the image offsite first). You do not need to do anything special with the var loopback.
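A sketch of that record-and-verify step (the function name and record layout are assumptions; keeping the recorded hash offsite is what makes the check meaningful, since an attacker who alters the image could otherwise also fix the record):

{{{
# Sketch: baseline-and-verify for the root image checksum.
verify_image() {
    img="$1"; record="$img.sha1"
    if [ -f "$record" ]; then
        sha1sum -c "$record"            # non-zero exit => image changed
    else
        sha1sum "$img" > "$record"      # first run: record the baseline
        echo "baseline recorded in $record"
    fi
}
# e.g. verify_image ./tor-root.img   (run with the volume unmounted)
}}}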
|
|
|
|
|
|
These steps should prevent even adversaries who compromise the root account on your system (by rebooting it, for example) from obtaining your identity keys directly, forcing them to resort to kernel exploits and memory gymnastics in order to do so.
|
|
|
|
|
|
=== Auditing the Kernel and Boot Scripts ===
|
|
|
|
|
|
After suspicious reboots, you should audit your initrd, kernel image, modules, and init scripts as best you can.
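One way to make that audit concrete is a baseline-and-diff over the boot artifacts; the sketch below parameterizes the paths so the baseline can live offsite (the function name and layout are assumptions):

{{{
# Sketch: snapshot-and-compare for boot artifacts. Take the baseline from
# a known-good state and store it offsite; after a suspicious reboot,
# diff the current state against it.
audit_boot() {
    bootdir="$1"; baseline="$2"
    current=$(mktemp)
    # Hash every file under the boot directory (kernel, initrd, modules).
    find "$bootdir" -type f -exec sha256sum {} + | sort > "$current"
    if [ -f "$baseline" ]; then
        diff -u "$baseline" "$current" && echo "boot artifacts unchanged"
    else
        mv "$current" "$baseline"; echo "baseline saved to $baseline"
    fi
}
# e.g. audit_boot /boot /mnt/usb/boot.sha256
}}}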