


The Design and Implementation of Tor Browser [DRAFT]

Previous version:

Mike Perry

<mikeperry@torproject.org>

Erinn Clark

<erinn@torproject.org>

Steven Murdoch

<sjmurdoch@torproject.org>

Georg Koppen

<gk@torproject.org>

Richard Pospesel

<richard@torproject.org>

Morgan

<morgan@torproject.org>

August 12, 2024


Table of Contents

  1. Introduction

    1.1 Tor Browser Component Overview

  2. Design Requirements and Philosophy

    2.1 Security Requirements

    2.2 Privacy Requirements

    2.3 Philosophy

  3. Adversary Model

    3.1 Adversary Goals

    3.2 Adversary Positioning

    3.3 Adversary Attacks

    3.4 Limitations

  4. Implementation

  5. Build Security and Package Integrity

    5.1 Achieving Binary Reproducibility

    5.2 Package Signatures and Verification

    5.3 Anonymous Verification

    5.4 Update Safety

1. Introduction

This document describes the design requirements, adversary model, and implementation of a browser which defends against active network adversaries and passive local forensic adversaries.

For more practical information regarding Tor Browser development, please consult the Applications Team Wiki.

1.1 Tor Browser Component Overview

The browser is based on Mozilla's Extended Support Release (ESR) Firefox branch. We maintain a series of patches atop ESR Firefox which:

  • Backport surgical privacy features, security fixes and bug fixes from Mozilla's Rapid Release (RR) Firefox branch
  • Implement non-Tor related privacy and security features
  • Integrate Tor Network connectivity into the browser
  • Implement Tor-specific privacy and security features

To provide network anonymity, we integrate the legacy Tor daemon (aka little-t tor or c-tor) into the browser and drive all network communications through the daemon's SOCKS5 proxy functionality.
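
As a concrete illustration of the SOCKS5 layer described above, the sketch below performs the SOCKS5 CONNECT handshake against a local tor daemon by hand. This is a minimal sketch only: the 127.0.0.1:9150 proxy address is just the conventional Tor Browser default and is assumed here, and the browser itself routes traffic through its internal proxy configuration rather than through code like this.

    import socket
    import struct

    def socks5_connect(host: str, port: int, proxy=("127.0.0.1", 9150)) -> socket.socket:
        """Open a TCP stream to (host, port) through tor's SOCKS5 port."""
        s = socket.create_connection(proxy)
        s.sendall(b"\x05\x01\x00")                 # SOCKS5, offer "no authentication"
        ver, method = s.recv(2)
        if ver != 0x05 or method != 0x00:
            raise ConnectionError("proxy rejected the SOCKS5 greeting")
        # CONNECT with a domain-name address type (0x03): the hostname is
        # resolved inside the Tor network, so no DNS request leaks locally.
        name = host.encode("idna")
        s.sendall(b"\x05\x01\x00\x03" + bytes([len(name)]) + name + struct.pack(">H", port))
        reply = s.recv(10)
        if len(reply) < 2 or reply[1] != 0x00:
            raise ConnectionError("SOCKS5 CONNECT failed")
        return s                                   # bytes written here now leave via a Tor circuit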

To provide censorship circumvention in areas where the public Tor network is blocked either by IP or by protocol fingerprint, we include several pluggable transports in the distribution. For an up-to-date list of the currently included pluggable transports, please refer to the tor-expert-bundle project in tor-browser-build.
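
To sketch how a pluggable transport is wired into a tor client, the configuration below launches tor with an obfs4 bridge using the third-party stem library. This is only an illustration under assumptions: the obfs4proxy path and the Bridge line are placeholders rather than real bridge data, and the transports actually shipped are determined by tor-expert-bundle, not by anything shown here.

    import stem.process   # third-party library, assumed to be available

    tor_process = stem.process.launch_tor_with_config(config={
        "SocksPort": "9150",
        "UseBridges": "1",
        # Path to the transport binary is a placeholder for this sketch.
        "ClientTransportPlugin": "obfs4 exec /usr/bin/obfs4proxy",
        # Placeholder bridge line; a real line carries a relay fingerprint
        # and the bridge's obfs4 certificate.
        "Bridge": "obfs4 192.0.2.1:443 0123456789ABCDEF0123456789ABCDEF01234567 cert=EXAMPLE iat-mode=0",
    })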

2. Design Requirements and Philosophy

These browser design requirements are meant to describe the properties of a Private Browsing Mode that defends against both network and local forensic adversaries.

There are two main categories of requirements: Security Requirements, and Privacy Requirements.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

2.1 Security Requirements

The security requirements are primarily concerned with ensuring the safe use of Tor. Violations of these properties typically result in serious risk for the user in terms of immediate deanonymization and/or observability.

  1. Proxy Obedience

    Prior to connecting to the Tor Network, the browser MUST NOT bypass Tor proxy settings for any content. User consent is REQUIRED when the browser needs to access remote services to facilitate connecting to the Tor Network (for example, to acquire bridges from the rdsys service).

  2. State Separation

    The browser MUST NOT provide the content window with any state from any other browsers or any non-Tor browsing modes. This includes shared state from independent plugins, and shared state from operating system implementations of TLS and other support libraries.

  3. Disk Avoidance

    By default, the browser and any built-in extensions MUST NOT write any information that is derived from or that reveals browsing activity to the disk, or store it in memory beyond the duration of one browsing session. This requirement MAY be ignored if the user has explicitly opted to store their browsing history information to disk.

  4. Least Privilege

    The browser MUST NOT run with permissions or capabilities it does not need to function.

2.2 Privacy Requirements

The privacy requirements are primarily concerned with reducing linkability: the ability for a user's activity on one site to be linked with their activity on another site without their knowledge or explicit consent.

For the purposes of the unlinkability requirements of this section as well as the descriptions in the implementation section, a URL bar origin means at least the second-level DNS name. For example, for mail.google.com, the origin would be google.com.
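
A naive sketch of deriving the URL bar origin from a URL is shown below. The two-label heuristic is only illustrative; in practice browsers consult the Public Suffix List so that, for example, hosts under co.uk are not all collapsed into one origin.

    from urllib.parse import urlsplit

    def url_bar_origin(url: str) -> str:
        """Approximate the URL bar origin (second-level DNS name) of a URL."""
        host = urlsplit(url).hostname or ""
        labels = host.split(".")
        return ".".join(labels[-2:]) if len(labels) >= 2 else host

    assert url_bar_origin("https://mail.google.com/mail/") == "google.com"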

  1. Cross-Origin Identifier Unlinkability

    User activity on one URL bar origin MUST NOT be linkable to their activity in any other URL bar origin by any third party automatically or without user interaction or approval. This requirement specifically applies to linkability from stored browser identifiers, authentication tokens, and shared state. The requirement does not apply to linkable information the user manually submits to sites, or due to information submitted during manual link traversal. This functionality SHOULD NOT interfere with interactive, click-driven federated login in a substantial way.

  2. Cross-Origin Fingerprinting Unlinkability

    User activity on one URL bar origin MUST NOT be linkable to their activity in any other URL bar origin by any third party. This property specifically applies to linkability from fingerprinting browser behavior.

  3. Long-Term Unlinkability

    The browser MUST provide an obvious, easy way for the user to remove all of its authentication tokens and browser state and obtain a fresh identity. Additionally, the browser SHOULD clear linkable state by default automatically upon browser restart, except at user option.

2.3 Philosophy

In addition to the above design requirements, the technology decisions about the browser are also guided by some philosophical positions about technology.

  1. Preserve existing user model

    The existing way that the user expects to use a browser must be preserved. If the user has to maintain a different mental model of how the sites they are using behave depending on tab, browser state, or anything else that would not normally be what they experience in their default browser, the user will inevitably be confused. They will make mistakes and reduce their privacy as a result. Worse, they may just stop using the browser, assuming it is broken.

  2. Favor the implementation mechanism least likely to break sites

    In general, we try to find solutions to privacy issues that will not induce site breakage, though this is not always possible.

  3. Minimize global privacy-affecting features and settings

    User-customizable settings which measurably alter browser behavior can be used by adversaries as a fingerprinting tool. Therefore, such settings SHOULD NOT be made available to users. However, browser features which affect accessibility MUST be configurable by the user.

    Browser features which affect privacy and are not required for general web browsing but are required for particular website functionality should be allowed with explicit user consent on a per-first-party-domain basis, to eliminate the possibility of linkability between domains. Such functionality includes webcam and microphone access, geolocation, and canvas read access. Built-in extensions MUST follow these same guidelines.

    Browser or extension features which affect privacy MUST be kept in lock-step with the browser version to maximize user privacy. Out-of-band updates affecting such functionality MUST NOT be enabled.

  4. Distrust closed-source and proprietary APIs

    Proprietary operating systems provide APIs which are closed-source and cannot be audited. The browser MUST minimize usage of such OS-provided functionality where possible. Any functionality which uses such APIs but is not required for core browser functionality MUST be disabled.

  5. Stay Current

    We believe that if we do not stay current with the support of new web technologies, we cannot hope to substantially influence or be involved in their proper deployment or privacy realization. However, we will likely disable high-risk features pending analysis, audit, and mitigation.

3. Adversary Model

The browser's adversaries have a number of possible goals, capabilities, and attack types that can be used to illustrate the design requirements for the browser.

3.1 Adversary Goals

  1. User identification

    The adversary's primary goal is to de-anonymise the user by compromising or bypassing Tor, causing the user to directly connect to an IP of the adversary's choosing.

  2. Correlation of Tor vs non-Tor activities

    If direct proxy bypass is not possible, the adversary will likely settle for the ability to correlate something a user did via Tor with their non-Tor activity. In some cases, merely establishing that a user uses Tor at all may be enough for some authorities.

  3. History disclosure

    The adversary may also be interested in a user's browsing history. They may wish to determine if and when a user has visited a particular site. They may wish to learn search queries or other contents of a user's browsing session.

  4. Correlation of activity across multiple sites or services

    The adversary may want to correlate user identities or sessions across multiple remote services. For instance, advertising networks may wish to know that a user who visited site-x.com is the same user that visited site-y.com in order to serve them targeted ads, while law enforcement may wish to associate anonymous activity on site-b.com with a known identity on site-a.com to build a criminal case.

  5. Censorship

    The adversary may wish to block access to particular websites or to the entire Tor Network.

3.2 Adversary Positioning

Adversaries may position themselves at a number of possible locations in order to execute their attacks.

  1. 1st party websites

    Adversaries may run websites, either on the clearnet (requiring access via an Exit relay) or as an Onion Service within the Tor Network.

  2. 3rd party services

    Adversaries may host and serve content intended to be embedded in other 1st party websites, either on the clearnet or as an Onion Service within the Tor Network. This content includes scripts, images, video, fonts, and similar resources which may be downloaded and run by the browser.

  3. Exit relays or upstream routers

    Adversaries may run Tor exit relays or they may control routers upstream of exit relays. They may observe and modify the contents and destination of traffic exiting from and returning to the Tor Network.

  4. Middle relays or upstream routers

    Adversaries may run Tor middle relays or control routers upstream of middle relays. They may observe metadata around the connections to their peers.

  5. Guard relays or upstream routers

    Adversaries may run Tor guard nodes or they may control routers upstream of guard nodes. They may observe metadata around the connections to the user and their circuit's middle relays. They also know the user's public IP address.

  6. Local network, ISP, or upstream routers

    Adversaries may also inject malicious content at the user's upstream router while the user has Tor disabled, in an attempt to correlate their Tor and non-Tor activity.

    Additionally, at this position the adversary may block Tor, or attempt to recognize the traffic patterns of specific web pages at the entrance to the Tor network.

  7. Physical or remote access

    Adversaries may have intermittent or constant access to a target's computer hardware. Such adversaries would include law enforcement, system administrators, other users of a shared system, or domestic partners. Adversaries may also be able to compel targets to surrender their passwords or encryption keys.

    We assume these adversaries do not have the ability to run arbitrary code on the target's computer during a browsing session. Rather, we assume only passive forensic access after browsing has taken place.

  8. Release infrastructure

    Adversaries may have access to release infrastructure such as build servers, source code repositories, or developer computers. They may attempt to modify the contents of files or communications on the affected machines.

3.3 Adversary Attacks

The adversary can perform the following attacks from a number of possible positions or combinations of positions to accomplish various aspects of their goals.

  1. Read and write identifiers

    • Positioning
      • 1st party websites
      • 3rd party services
      • Exit relays or upstream routers

    The browser contains multiple facilities for storing identifiers that the adversary creates for the purposes of tracking users. These identifiers are most obviously cookies, but also include HTTP auth, DOM storage, cached scripts and other elements with embedded identifiers, client certificates, and even TLS Session IDs.

    An adversary in a position to perform machine-in-the-middle content alteration can inject document content elements to both read and inject cookies for arbitrary domains. Such an adversary may also steal or alter document content.

  2. Fingerprint browser properties

    • Positioning
      • 1st party websites
      • 3rd party services
      • Exit relays or upstream routers

    By default, modern web browsers expose quite a large number of stable properties about the user's operating system, physical hardware, customisations, and personal information.

    In isolation, most of these properties are typically not sufficient to uniquely identify and thus track a user across domains or deanonymise them. However, such properties can be bucketed and combined to generate a stable identifier which can be used to track users across colluding 1st and 3rd party domains (a short sketch following the list of attack vectors below illustrates this combination step).

    Some examples of fingerprintable features available to adversaries in modern browsers (not an exhaustive list):

    • Operating system version
    • System fonts
    • Device CPU
    • Screen size
    • Installed web-extensions
    • Accessibility customisations
    • User's timezone
    • User's preferred locale

    Despite the apparent diversity of properties available to adversaries, each one individually is ultimately derived through one of these attack vectors:

    • Observing request behaviour

      Properties of the user's request behavior comprise the bulk of low-hanging fingerprinting targets. These include: User agent, Accept-* headers, pipeline usage, and request ordering. Additionally, the use of custom filters such as ad-blockers and other privacy filters can be used to fingerprint request patterns.

    • JavaScript

      JavaScript can reveal a great deal of fingerprinting information. It provides DOM objects such as window.screen and window.navigator to extract information about the user agent. JavaScript can also be used to query the user's timezone via the Date() object, WebGL can reveal information about the video card in use, and high-precision timing information can be used to fingerprint the CPU and interpreter speed. JavaScript features such as Resource Timing may leak an unknown amount of network timing related information. Moreover, JavaScript is able to extract the fonts available on a device with high precision.

    • CSS media queries

      CSS media queries can be inserted to gather information about the desktop size, widget size, display type, DPI, user agent type, and other information that was formerly available only to JavaScript.
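
    To make the combination step described earlier in this item concrete, the sketch below hashes a handful of observed properties into one stable identifier. The property names and values are stand-ins for what a script would actually read through the DOM; any single value is unremarkable on its own, but the combined digest can be stable enough to recognise the same browser across colluding sites.

      import hashlib
      import json

      # Illustrative values only; a real script would read these through
      # navigator, screen, Date, canvas, and similar interfaces.
      observed = {
          "userAgent": "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) ...",
          "screen": "1920x1080x24",
          "timezone": "Europe/Berlin",
          "language": "de-DE",
          "fonts": ["Arial", "DejaVu Sans", "Noto Sans"],
      }

      fingerprint = hashlib.sha256(
          json.dumps(observed, sort_keys=True).encode()
      ).hexdigest()
      print(fingerprint)   # identical wherever the same inputs are collected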

  3. Fingerprint network traffic

    • Positioning
      • Guard relays or upstream routers
      • Local network, ISP, or upstream routers

    Network traffic fingerprinting is an attempt by the adversary to recognize the encrypted traffic patterns of specific websites. In the case of Tor, this attack would take place between the user and the guard relay, or at the guard relay itself.

  4. Exploit the browser or operating system

    • Positioning
      • 1st party websites
      • 3rd party services
      • Exit relays or upstream routers
      • Guard relays or upstream routers
      • Release infrastructure

    An adversary may develop exploit chains targeting vulnerabilities in the browser or the operating system to install malware and surveillance software.

    For example, an adversary running a website may serve users with JavaScript capable of breaking out of the browser's sandbox. They could also serve specially crafted files (such as images or documents) which exploit bugs in parser or rendering implementations found on users' systems. Adversaries running exit relays may inject such exploits into unencrypted data streams, while adversaries running guard relays may target the tor daemon itself using specially crafted messages which take advantage of undefined behaviour to gain arbitrary code execution.

    An adversary may also target release infrastructure to potentially compromise browser releases themselves.

    For example, an adversary may compromise the source code of a library which the browser depends on, resulting in malware being built and shipped in official browser releases. An adversary may compromise build or release infrastructure resulting in back-doors being inserted into official browser releases. An adversary may compromise update servers, allowing them to ship compromised browser updates to users. An adversary may infiltrate the project itself and apply their own malicious patches during the browser release process.

  5. Read the local disk

    • Positioning
      • Physical or remote access

    Adversaries with access to a user's machine may try to learn a user's browsing history by inspecting persisted artifacts stored on disk.

    Such artifacts could include:

    • Browser history
    • Cookies
    • Per-site permissions
    • Site exceptions
    • Saved authentication credentials
    • Cached data
    • System logs
    • Recent files lists

3.4 Limitations

  1. Application data isolation

    In the past, we have made application data isolation an explicit goal, whereby all evidence of the existence of Tor Browser usage can be removed via secure deletion of the installation folder. This is not generally achievable.

    To hypothetically solve this problem in the general case, we would need to modify the browser to either work around any data-leaking external API calls or implement cleanup functionality for each platform to wipe the offending data from disk. Some of this cleanup would necessarily require elevated privileges (e.g. Admin or root) to clean up leaks made by the operating system itself, which goes against our principle of least privilege. We would also need continual audits to identify all of the conditions under which the user's operating system itself leaks information about their browsing session, for each supported operating system and CPU architecture.

    Practically speaking, it is not possible to provide this functionality with the level of confidence required for cases where physical access is a concern. The majority of deployed Tor Browser installs run on platforms which either explicitly disrespect user agency and privacy (for-profit platforms such as Android, macOS, and Windows) or whose threat model may be less extreme than that of some of our users (the various flavours of Linux and BSD).

    Users whose threat model does include the need to hide evidence of their usage of Tor Browser should use Tor Browser with the Tails operating system. Tails is a purpose-built Linux-based operating system which is ephemeral by default, and also supports full-disk encryption for optional persistent storage if needed. It essentially provides whole operating system level data isolation to its users with a level of confidence unachievable for Tor Browser on its own.

  2. Arbitrary code execution

    In the general case, we must also presume the adversary does not have the ability to run arbitrary code outside of the browser's sandbox. That is to say, we presume the user's system has not been exploited and is free of malware, keyloggers, rootkits, etc. For the purposes of our adversary model, we presume that the user's operating system is not compromised or otherwise working against the user's own interests.

    This assumption is most likely not true in the general case, particularly in the case of the aforementioned for-profit platforms or for computers which the user shares with others. However, the browser is ultimately just another process running with limited privileges within a larger ecosystem which it has no control over. We are therefore unable to make promises about the browser's capabilities or protections in such environments.

    We again direct users whose threat model includes being unable to trust their computer to use the Tails operating system.

4. Implementation

TODO: Re-write this section based on the current Tor Browser implementation. Each subsection should include mitigations provided by:

  • preference
  • build-flag
  • tor-browser patch

5. Build Security and Package Integrity

TODO Re-write this section with up-to-date information about the current build-system, annoyances around code-signing requirements for verifying reproducibility, etc. We can probably outsource the background on reproducible builds elsewhere.

5.1 Achieving Binary Reproducibility

TODO No longer using gitian, use this section for rbm/tor-browser-build

5.2 Package Signatures and Verification

TODO review/edit

The build process generates a single sha256sums-unsigned-build.txt file that contains a sorted list of the SHA-256 hashes of every package produced for that build version. Each official builder uploads this file and a GPG signature of it to a directory on a Tor Project web server. The build scripts have an optional matching step that downloads these signatures, verifies them, and ensures that the local builds match this file.
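
A minimal sketch of that matching step is shown below: it recomputes the SHA-256 hash of each locally built package and compares it against the published list. The two-column "hash  filename" layout assumed here follows standard sha256sum output; the real build scripts may differ in detail.

    import hashlib
    from pathlib import Path

    def matches_published_hashes(sums_file: str, build_dir: str) -> bool:
        """Compare local package hashes against sha256sums-unsigned-build.txt."""
        ok = True
        for line in Path(sums_file).read_text().splitlines():
            expected, name = line.split(maxsplit=1)
            digest = hashlib.sha256((Path(build_dir) / name).read_bytes()).hexdigest()
            if digest != expected:
                print(f"hash mismatch for {name}")
                ok = False
        return ok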

When builds are published officially, the single sha256sums-unsigned-build.txt file is accompanied by a detached GPG signature from each official builder that produced a matching build. The packages are additionally signed with detached GPG signatures from an official signing key.
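
Verifying one builder's detached signature over the hash file could look like the sketch below, which simply shells out to GnuPG. The .asc filename is an assumption made for illustration, and the builder's public key is assumed to already be imported into the local keyring.

    import subprocess

    # gpg exits non-zero (so check=True raises) if the signature does not
    # verify against a key in the local keyring.
    subprocess.run(
        ["gpg", "--verify",
         "sha256sums-unsigned-build.txt.asc",    # detached signature (assumed name)
         "sha256sums-unsigned-build.txt"],
        check=True,
    )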

The fact that the entire set of packages for a given version can be authenticated by a single hash of the sha256sums-unsigned-build.txt file will also allow us to create a number of auxiliary authentication mechanisms for our packages, beyond just trusting a single offline build machine and a single cryptographic key's integrity. Interesting examples include providing multiple independent cryptographic signatures for packages, listing the package hashes in the Tor consensus, and encoding the package hashes in the Bitcoin blockchain.

The Windows releases are also signed by a hardware token provided by Digicert. In order to verify package integrity, the signature must be stripped off using the osslsigncode tool, as described on the Signature Verification page.
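
Stripping the Authenticode signature before comparing hashes might look like the following sketch. The remove-signature subcommand and the filenames here are illustrative assumptions; the authoritative steps are those documented on the Signature Verification page.

    import subprocess

    # Produce an unsigned copy of the Windows installer so its hash can be
    # compared against a locally built, unsigned package.
    subprocess.run(
        ["osslsigncode", "remove-signature",
         "-in", "torbrowser-install.exe",
         "-out", "torbrowser-install-unsigned.exe"],
        check=True,
    )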

5.3 Anonymous Verification

TODO Update this section, detail our current build verification process, probably rename the section to 'Build Verification'

5.4 Update Safety

TODO Verify this is accurate and update with any new changes

We use the Firefox updater to provide automatic updates to users. We use certificate pinning to ensure that update checks cannot be tampered with, by setting security.cert_pinning.enforcement_level to 2, and we sign the individual MAR update files with keys that are rotated every year.

The Firefox updater also has code to ensure that it can reliably access the update server to prevent availability attacks, and complains to the user after 48 hours go by without a successful response from the server. Additionally, we use Tor's SOCKS username and password isolation to ensure that every new request to the updater (provided the former got issued more than 10 minutes ago) traverses a separate circuit, to avoid holdback attacks by exit nodes.