<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN"
     "file:///usr/share/sgml/docbook/xml-dtd-4.4-1.0-30.1/docbookx.dtd">

<article id="design">
 <articleinfo>
  <title>The Design and Implementation of the Tor Browser [DRAFT]</title>
   <author>
    <firstname>Mike</firstname><surname>Perry</surname>
    <affiliation>
     <address><email>mikeperry#torproject org</email></address>
    </affiliation>
   </author>
   <author>
    <firstname>Erinn</firstname><surname>Clark</surname>
    <affiliation>
     <address><email>erinn#torproject org</email></address>
    </affiliation>
   </author>
   <author>
    <firstname>Steven</firstname><surname>Murdoch</surname>
    <affiliation>
     <address><email>sjmurdoch#torproject org</email></address>
    </affiliation>
   </author>
   <author>
    <firstname>Georg</firstname><surname>Koppen</surname>
    <affiliation>
     <address><email>gk#torproject org</email></address>
    </affiliation>
   </author>
   <pubdate>January 25th, 2018</pubdate>
 </articleinfo>

<sect1>
  <title>Introduction</title>
  <para>

This document describes the <link linkend="adversary">adversary model</link>,
<link linkend="DesignRequirements">design requirements</link>, and <link
linkend="Implementation">implementation</link> <!-- and <link
linkend="Packaging">packaging</link> and <link linkend="Testing">testing
procedures</link> --> of the Tor Browser. It is current as of Tor Browser
7.0.11.

  </para>
  <para>

This document is also meant to serve as a set of design requirements and to
describe a reference implementation of a Private Browsing Mode that defends
against active network adversaries, in addition to the passive forensic local
adversary currently addressed by the major browsers.

  </para>
  <para>

For more practical information regarding Tor Browser development, please
consult the <ulink
url="https://trac.torproject.org/projects/tor/wiki/doc/TorBrowser/Hacking">Tor
Browser Hacking Guide</ulink>.

  </para>

  <sect2 id="components">
   <title>Browser Component Overview</title>
   <para>

The Tor Browser is based on <ulink
url="https://www.mozilla.org/en-US/firefox/organizations/">Mozilla's Extended
Support Release (ESR) Firefox branch</ulink>. We have a <ulink
url="https://gitweb.torproject.org/tor-browser.git">series of patches</ulink>
against this browser to enhance privacy and security. Browser behavior is
additionally augmented through the <ulink
url="https://gitweb.torproject.org/torbutton.git/tree/">Torbutton
extension</ulink>, though we are in the process of moving this functionality
into direct Firefox patches. We also <ulink
url="https://gitweb.torproject.org/tor-browser.git/tree/browser/app/profile/000-tor-browser.js?h=tor-browser-52.5.2esr-7.0-2">change
a number of Firefox preferences</ulink> from their defaults.

   </para>
   <para>
Tor process management and configuration are accomplished through the <ulink
url="https://gitweb.torproject.org/tor-launcher.git">Tor Launcher</ulink>
addon, which provides the initial Tor configuration splash screen and
bootstrap progress bar. Tor Launcher is also compatible with Thunderbird,
Instantbird, and XULRunner.

   </para>
   <para>

To help protect against potential Tor Exit Node eavesdroppers, we include
<ulink url="https://www.eff.org/https-everywhere">HTTPS-Everywhere</ulink>. To
provide users with optional defense-in-depth against JavaScript and other
potential exploit vectors, we also include <ulink
url="https://noscript.net/">NoScript</ulink>. We also modify <ulink
url="https://gitweb.torproject.org/builders/tor-browser-bundle.git/tree/Bundle-Data/linux/Data/Browser/profile.default/preferences/extension-overrides.js">several
extension preferences</ulink> from their defaults.

   </para>
   <para>

To provide censorship circumvention in areas where the public Tor network is
blocked either by IP, or by protocol fingerprint, we include several <ulink
url="https://trac.torproject.org/projects/tor/wiki/doc/AChildsGardenOfPluggableTransports">Pluggable
Transports</ulink> in the distribution. As of this writing, we include <ulink
url="https://gitweb.torproject.org/pluggable-transports/obfs4.git">Obfs3proxy,
Obfs4proxy</ulink>,
<ulink
url="https://trac.torproject.org/projects/tor/wiki/doc/meek">meek</ulink>,
and <ulink url="https://fteproxy.org/">FTE</ulink>.

   </para>

  </sect2>
</sect1>

<!--
- Design overview and philosophy
  - Security requirements [Torbutton]
    + local leaks?
    - state issues
  - Privacy Requirements [Mostly blog post]
    - Avoid Cross-Domain Linkability
      - Identifiers
      - Fingerprinting
    - 100% self-contained
      - Does not share state with other modes/browsers
      - Easy to remove + wipe with external tools
    - click-to-play for "troublesome" features
   - Philosophy
    - No filters
-->

<sect1 id="DesignRequirements">
  <title>Design Requirements and Philosophy</title>
  <para>

The Tor Browser Design Requirements are meant to describe the properties of a
Private Browsing Mode that defends against both network and local forensic
adversaries.

  </para>
  <para>

There are two main categories of requirements: <link
linkend="security">Security Requirements</link>, and <link
linkend="privacy">Privacy Requirements</link>. Security Requirements are the
minimum properties in order for a browser to be able to support Tor and
similar privacy proxies safely. Privacy requirements are the set of properties
that cause us to prefer one browser over another.

  </para>
  <para>

While we will endorse the use of browsers that meet the security requirements,
it is primarily the privacy requirements that cause us to maintain our own
browser distribution.
  </para>
  <para>
      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
      NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and
      "OPTIONAL" in this document are to be interpreted as described in
      <ulink url="https://www.ietf.org/rfc/rfc2119.txt">RFC 2119</ulink>.

  </para>
  <sect2 id="security">
   <title>Security Requirements</title>
   <para>

The security requirements are primarily concerned with ensuring the safe use
of Tor. Violations of these properties typically result in serious risk for
the user in terms of immediate deanonymization and/or observability. With
respect to browser support, security requirements are the minimum properties
in order for Tor to support the use of a particular browser.

   </para>

<orderedlist>
 <listitem><link linkend="proxy-obedience"><command>Proxy
Obedience</command></link>
 <para>The browser
MUST NOT bypass Tor proxy settings for any content.</para></listitem>

 <listitem><link linkend="state-separation"><command>State
Separation</command></link>
 <para>

The browser MUST NOT provide the content window with any state from any other
browsers or any non-Tor browsing modes. This includes shared state from
independent plugins, and shared state from operating system implementations of
TLS and other support libraries.

</para></listitem>

 <listitem><link linkend="disk-avoidance"><command>Disk
Avoidance</command></link>

<para>
The browser MUST NOT write any information that is derived from or that
reveals browsing activity to the disk, or store it in memory beyond the
duration of one browsing session, unless the user has explicitly opted to
store their browsing history information to disk.

</para></listitem>
 <listitem><link linkend="app-data-isolation"><command>Application Data
Isolation</command></link>

<para>

The components involved in providing private browsing MUST be self-contained,
or MUST provide a mechanism for rapid, complete removal of all evidence of the
use of the mode. In other words, the browser MUST NOT write or cause the
operating system to write <emphasis>any information</emphasis> about the use
of private browsing to disk outside of the application's control. The user
must be able to ensure that secure deletion of the software is sufficient to
remove evidence of the use of the software. All exceptions and shortcomings
due to operating system behavior MUST be wiped by an uninstaller. However, due
to permissions issues with access to swap, implementations MAY choose to leave
it out of scope, and/or leave it to the operating system/platform to implement
ephemeral-keyed encrypted swap.

</para></listitem>
</orderedlist>
  </sect2>

  <sect2 id="privacy">
   <title>Privacy Requirements</title>
   <para>

The privacy requirements are primarily concerned with reducing linkability:
the ability for a user's activity on one site to be linked with their activity
on another site without their knowledge or explicit consent. With respect to
browser support, privacy requirements are the set of properties that cause us
to prefer one browser over another.

   </para>

   <para>

For the purposes of the unlinkability requirements of this section as well as
the descriptions in the <link linkend="Implementation">implementation
section</link>, a <command>URL bar origin</command> means at least the
second-level DNS name.  For example, for mail.google.com, the origin would be
google.com. Implementations MAY, at their option, restrict the URL bar origin
to be the entire fully qualified domain name.

   </para>
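   <para>

As a rough illustration of the definition above, the following Python sketch
reduces a hostname to its last two DNS labels. The function name
url_bar_origin and its strict flag are hypothetical and exist only for this
example. The sketch is deliberately naive: it takes "second-level DNS name"
literally, so it would mishandle multi-label public suffixes such as co.uk,
which a real implementation would probably need a public suffix list to
handle.

   </para>
<programlisting language="python"><![CDATA[
```python
def url_bar_origin(hostname, strict=False):
    """Return the URL bar origin for a hostname (illustrative only).

    strict=False: second-level DNS name (the last two labels).
    strict=True:  the entire fully qualified domain name, which the
                  text permits as a more restrictive option.
    """
    # Normalize case and drop a trailing root dot ("example.com.").
    host = hostname.strip().lower().rstrip(".")
    if strict:
        return host
    labels = host.split(".")
    # Assumed simplification: keep only the last two labels.
    return ".".join(labels[-2:])


print(url_bar_origin("mail.google.com"))
print(url_bar_origin("mail.google.com", strict=True))
```
]]></programlisting>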

<orderedlist>
 <listitem><link linkend="identifier-linkability"><command>Cross-Origin
Identifier Unlinkability</command></link>
  <para>

User activity on one URL bar origin MUST NOT be linkable to their activity in
any other URL bar origin by any third party automatically or without user
interaction or approval. This requirement specifically applies to linkability
from stored browser identifiers, authentication tokens, and shared state. The
requirement does not apply to linkable information the user manually submits
to sites, or to information submitted during manual link traversal. This
functionality SHOULD NOT interfere with interactive, click-driven federated
login in a substantial way.

  </para>
 </listitem>
 <listitem><link linkend="fingerprinting-linkability"><command>Cross-Origin
Fingerprinting Unlinkability</command></link>
  <para>

User activity on one URL bar origin MUST NOT be linkable to their activity in
any other URL bar origin by any third party. This property specifically applies to
linkability from fingerprinting browser behavior.

  </para>
 </listitem>
 <listitem><link linkend="new-identity"><command>Long-Term
Unlinkability</command></link>
  <para>

The browser MUST provide an obvious, easy way for the user to remove all of
its authentication tokens and browser state and obtain a fresh identity.
Additionally, the browser SHOULD clear linkable state by default automatically
upon browser restart, except at user option.

  </para>
 </listitem>
</orderedlist>
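   <para>

One way to picture the Cross-Origin Identifier Unlinkability and Long-Term
Unlinkability requirements listed above is a store in which every identifier
is keyed by the URL bar origin as well as by the host that set it. The Python
sketch below is an illustration under that assumption, not a description of
any actual browser code: the class IsolatedIdentifierStore and its methods are
hypothetical names.

   </para>
<programlisting language="python"><![CDATA[
```python
class IsolatedIdentifierStore:
    """Toy identifier jar keyed by (URL bar origin, setting host)."""

    def __init__(self):
        self._jar = {}

    def set(self, url_bar_origin, host, name, value):
        # The URL bar origin is part of the key, so the same third
        # party gets a separate jar under each first-party site.
        self._jar.setdefault((url_bar_origin, host), {})[name] = value

    def get(self, url_bar_origin, host, name):
        return self._jar.get((url_bar_origin, host), {}).get(name)

    def new_identity(self):
        # Long-Term Unlinkability: drop all stored state at once.
        self._jar.clear()


store = IsolatedIdentifierStore()
store.set("sitex.com", "ads.example", "uid", "abc123")

# The same ad host embedded under another URL bar origin finds nothing.
print(store.get("sitex.com", "ads.example", "uid"))
print(store.get("sitey.com", "ads.example", "uid"))
```
]]></programlisting>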

  </sect2>
  <sect2 id="philosophy">
  <title>Philosophy</title>
   <para>

In addition to the above design requirements, the technology decisions about
Tor Browser are also guided by some philosophical positions about technology.

   </para>
   <orderedlist>
     <listitem><command>Preserve existing user model</command>
      <para>

The existing way that the user expects to use a browser must be preserved. If
the user has to maintain a different mental model of how the sites they are
using behave depending on tab, browser state, or anything else that would not
normally be what they experience in their default browser, the user will
inevitably be confused. They will make mistakes and reduce their privacy as a
result. Worse, they may just stop using the browser, assuming it is broken.

      </para>
      <para>

User model breakage was one of the <ulink
url="https://blog.torproject.org/blog/toggle-or-not-toggle-end-torbutton">failures
of Torbutton</ulink>: Even if users managed to install everything properly,
the toggle model was too hard for the average user to understand, especially
in the face of accumulating tabs from multiple states crossed with the current
Tor-state of the browser.

      </para>
     </listitem>
     <listitem><command>Favor the implementation mechanism least likely to
break sites</command>
      <para>

In general, we try to find solutions to privacy issues that will not induce
site breakage, though this is not always possible.

      </para>
     </listitem>
     <listitem><command>Plugins must be restricted</command>
      <para>

Even if plugins always properly used the browser proxy settings (which none of
them do) and could not be induced to bypass them (which all of them can), the
activities of closed-source plugins are very difficult to audit and control.
They can obtain and transmit all manner of system information to websites,
often have their own identifier storage for tracking users, and also
contribute to fingerprinting.

      </para>
      <para>

Therefore, if plugins are to be enabled in private browsing modes, they must
be restricted from running automatically on every page (via click-to-play
placeholders), and/or be sandboxed to restrict the types of system calls they
can execute. If the user agent allows the user to craft an exemption to allow
a plugin to be used automatically, it must only apply to the top level URL bar
domain, and not to all sites, to reduce cross-origin fingerprinting
linkability.

       </para>
     </listitem>
     <listitem><command>Minimize Global Privacy Options</command>
      <para>

<ulink url="https://trac.torproject.org/projects/tor/ticket/3100">Another
failure of Torbutton</ulink> was the options panel. Each option
that detectably alters browser behavior can be used as a fingerprinting tool.
Similarly, all extensions <ulink
url="https://blog.chromium.org/2010/06/extensions-in-incognito.html">should be
disabled in the mode</ulink> except on an opt-in basis. We should not load
system-wide and/or operating system provided addons or plugins.

     </para>
     <para>
Instead of global browser privacy options, privacy decisions should be made
<ulink
url="https://wiki.mozilla.org/Privacy/Features/Site-based_data_management_UI">per
URL bar origin</ulink> to eliminate the possibility of linkability
between domains. For example, when a plugin object (or a JavaScript access of
window.plugins) is present in a page, the user should be given the choice of
allowing that plugin object for that URL bar origin only. The same
goes for exemptions to third party cookie policy, geolocation, and any other
privacy permissions.
     </para>
     <para>
If the user has indicated they wish to record local history storage, these
permissions can be written to disk. Otherwise, they should remain memory-only.
     </para>
     </listitem>
     <listitem><command>No filters</command>
      <para>

Site-specific or filter-based addons such as <ulink
url="https://addons.mozilla.org/en-US/firefox/addon/adblock-plus/">AdBlock
Plus</ulink>, <ulink url="https://requestpolicy.com/">Request Policy</ulink>,
<ulink url="https://www.ghostery.com/about-ghostery/">Ghostery</ulink>, <ulink
url="http://priv3.icsi.berkeley.edu/">Priv3</ulink>, and <ulink
url="https://sharemenot.cs.washington.edu/">Sharemenot</ulink> are to be
avoided. We believe that these addons do not add any real privacy to a proper
<link linkend="Implementation">implementation</link> of the above <link
linkend="privacy">privacy requirements</link>, and that development efforts
should be focused on general solutions that prevent tracking by all third
parties, rather than a list of specific URLs or hosts.
     </para>
     <para>
Implementing filter-based blocking directly into the browser, as is done with
<ulink
url="https://ieee-security.org/TC/SPW2015/W2SP/papers/W2SP_2015_submission_32.pdf">
Firefox's Tracking Protection</ulink>, does not alleviate the concerns mentioned
in the previous paragraph. There is still just a list containing specific
URLs and hosts which, in this case, are
<ulink url="https://services.disconnect.me/disconnect-plaintext.json">
assembled</ulink> by <ulink url="https://disconnect.me/trackerprotection">
Disconnect</ulink> and <ulink url="https://github.com/mozilla-services/shavar-list-exceptions">adapted</ulink> by Mozilla.
     </para>
     <para>
Resorting to <ulink
url="https://jonathanmayer.org/papers_data/bau13.pdf">filter methods based on
machine learning</ulink> does not solve the problem either: because they work
probabilistically, they do not provide a general solution to the tracking
problem. Even with a precision rate of 99% and a false positive rate of 0.1%,
trackers would be missed and sites would be wrongly blocked.
     </para>
     <para>
Filter-based solutions in general can also introduce strange breakage and cause
usability nightmares. For instance, websites are increasingly
<ulink url="https://petsymposium.org/2017/papers/issue3/paper25-2017-3-source.pdf">
detecting filter extensions and blocking access to their content</ulink>. Coping
with this fallout easily leads to just <ulink
url="https://github.com/mozilla-services/shavar-list-exceptions">whitelisting
</ulink>
the affected domains, hoping that this helps, defeating the purpose of the
filter in the first place. Filters will also fail to do their job if an
adversary simply registers a new domain or <ulink
url="https://ieee-security.org/TC/SPW2015/W2SP/papers/W2SP_2015_submission_24.pdf">
creates a new URL path</ulink>. Worse still, the unique filter sets that each
user creates or installs will provide a wealth of fingerprinting targets.
      </para>
      <para>

As a general matter, we are also opposed to shipping an always-on ad
blocker with Tor Browser. We feel that this would damage our credibility in
terms of demonstrating that we are providing privacy through a sound design
alone, as well as damage the acceptance of Tor users by sites that support
themselves through advertising revenue.

      </para>
      <para>
Users are free to install these addons if they wish, but doing
so is not recommended, as it will alter the browser request fingerprint.
      </para>
     </listitem>
     <listitem><command>Stay Current</command>
      <para>
We believe that if we do not stay current with the support of new web
technologies, we cannot hope to substantially influence or be involved in
their proper deployment or privacy realization. However, we will likely disable
high-risk features pending analysis, audit, and mitigation.
      </para>
     </listitem>
<!--
     <listitem><command>Transparency in Navigation Tracking</command>
      <para>

While we believe it is possible to restrict third party tracking with only
minimal site breakage, it is our long-term goal to further reduce cross-origin
click navigation tracking to mechanisms that are detectable by experts and
attentive users, so they can alert the general public if cross-origin
click navigation tracking is happening where it should not be.

      </para>
      <para>

However, the entrenched nature of certain archaic web features make it
impossible for us to achieve this wider goal by ourselves without substantial
site breakage. So, instead we maintain a <link linkend="deprecate">Deprecation
Wishlist</link> of archaic web technologies that are currently being (ab)used
to facilitate federated login and other legitimate click-driven cross-domain
activity but that can one day be replaced with more privacy friendly,
auditable alternatives.
      </para>
     </listitem>
-->
   </orderedlist>
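   <para>

The per-origin permission model described under "Minimize Global Privacy
Options" above can be sketched roughly as follows. This is a minimal
illustration under stated assumptions, not actual browser code: the class
OriginPermissions, the record_history flag, and the JSON file layout are all
hypothetical. Grants are scoped to a single URL bar origin and stay in memory
unless the user has opted to record history.

   </para>
<programlisting language="python"><![CDATA[
```python
import json


class OriginPermissions:
    """Toy permission store scoped to a single URL bar origin."""

    def __init__(self, record_history=False, path=None):
        self.record_history = record_history  # user opted in to disk
        self.path = path                      # hypothetical file path
        self._grants = {}                     # origin -> set of names

    def allow(self, url_bar_origin, permission):
        self._grants.setdefault(url_bar_origin, set()).add(permission)
        # Persist only if the user chose to record local history.
        if self.record_history and self.path:
            data = {o: sorted(p) for o, p in self._grants.items()}
            with open(self.path, "w") as fh:
                json.dump(data, fh)

    def is_allowed(self, url_bar_origin, permission):
        # No global switch: a grant never carries over to another origin.
        return permission in self._grants.get(url_bar_origin, set())


perms = OriginPermissions()  # memory-only by default
perms.allow("example.com", "plugin")
print(perms.is_allowed("example.com", "plugin"))
print(perms.is_allowed("example.org", "plugin"))
```
]]></programlisting>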
  </sect2>
</sect1>

<!--
- Implementation
  - Section Template
    - Sub Section
      - "Design Goal":
      - "Implementation Status"
  - Local Privacy
  - Linkability
    - Stored State
      - Cookies
      - Cache
      - DOM Storage
      - HTTP Auth
      - SSL state
    - Plugins
    - Fingerprinting
      - Location + timezone is part of this
  - Patches?
-->
  <sect1 id="adversary">
   <title>Adversary Model</title>
   <para>

A Tor web browser adversary has a number of goals, capabilities, and attack
types that can be used to illustrate the design requirements for the
Tor Browser. Let's start with the goals.

   </para>
   <sect2 id="adversary-goals">
    <title>Adversary Goals</title>
    <orderedlist>
<!-- These aren't really commands.. But it's the closest I could find in an
acceptable style.. Don't really want to make my own stylesheet -->
     <listitem><command>Bypassing proxy settings</command>
     <para>The adversary's primary goal is direct compromise and bypass of
Tor, causing the user to directly connect to an IP of the adversary's
choosing.</para>
     </listitem>
     <listitem><command>Correlation of Tor vs Non-Tor Activity</command>
     <para>If direct proxy bypass is not possible, the adversary will likely
happily settle for the ability to correlate something a user did via Tor with
their non-Tor activity. This can be done with cookies, cache identifiers,
JavaScript events, and even CSS. Sometimes the fact that a user uses Tor may
be enough for some authorities.</para>
     </listitem>
     <listitem><command>History disclosure</command>
     <para>
The adversary may also be interested in history disclosure: the ability to
query a user's history to see if they have issued certain censored search
queries, or visited censored sites.
     </para>
     </listitem>
     <listitem><command>Correlate activity across multiple sites</command>
     <para>

The primary goal of the advertising networks is to know that the user who
visited siteX.com is the same user that visited siteY.com to serve them
targeted ads. The advertising networks become our adversary insofar as they
attempt to perform this correlation without the user's explicit consent.

     </para>
     </listitem>
     <listitem><command>Fingerprinting/anonymity set reduction</command>
     <para>

Fingerprinting (more generally: "anonymity set reduction") is used to attempt
to gather identifying information on a particular individual without the use
of tracking identifiers. If the dissident's or whistleblower's timezone is
available, and they are using a rare build of Firefox for an obscure operating
system, and they have a specific display resolution only used on one type of
laptop, this can be very useful information for tracking them down, or at
least <link linkend="fingerprinting">tracking their activities</link>.

     </para>
     </listitem>
     <listitem><command>History records and other on-disk
information</command>
     <para>
In some cases, the adversary may opt for a heavy-handed approach, such as
seizing the computers of all Tor users in an area (especially after narrowing
the field by the above two pieces of information). History records and cache
data are the primary goals here. Secondary goals may include confirming
on-disk identifiers (such as hostname and disk-logged spoofed MAC address
history) obtained by other means.

     </para>
     </listitem>
    </orderedlist>
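    <para>

The anonymity set reduction described in the fingerprinting goal above can be
put in rough numbers. The Python sketch below computes the self-information,
in bits, of each observed attribute value and sums the results, which assumes
the attributes are statistically independent. The population fractions are
invented purely for illustration and are not measured data.

    </para>
<programlisting language="python"><![CDATA[
```python
import math


def surprisal_bits(fraction):
    """Bits of identifying information carried by a value that is
    shared by `fraction` of the population: -log2(fraction)."""
    return -math.log2(fraction)


# Hypothetical fractions of users sharing each attribute value.
observed = {
    "timezone": 1 / 24,
    "rare browser build": 1 / 1000,
    "display resolution": 1 / 200,
}

# Summing assumes independence, which real attributes rarely have.
total_bits = sum(surprisal_bits(f) for f in observed.values())

# 2**total_bits is the population size within which this combination
# would be expected to single out one user.
print(f"{total_bits:.1f} bits")
print(f"about 1 in {2 ** total_bits:,.0f} users")
```
]]></programlisting>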
   </sect2>

   <sect2 id="adversary-positioning">
    <title>Adversary Capabilities - Positioning</title>
    <para>
The adversary can position themselves at a number of different locations in
order to execute their attacks.
    </para>
    <orderedlist>
     <listitem><command>Exit Node or Upstream Router</command>
     <para>
The adversary can run exit nodes, or alternatively, they may control routers
upstream of exit nodes. Both of these scenarios have been observed in the
wild.
     </para>
     </listitem>
     <listitem><command>Ad servers and/or Malicious Websites</command>
     <para>
The adversary can also run websites, or more likely, they can contract out
ad space from a number of different ad servers and inject content that way. For
some users, the adversary may be the ad servers themselves. It is not
inconceivable that ad servers may try to subvert or reduce a user's anonymity
through Tor for marketing purposes.
     </para>
     </listitem>
     <listitem><command>Local Network/ISP/Upstream Router</command>
     <para>
The adversary can also inject malicious content at the user's upstream router
when they have Tor disabled, in an attempt to correlate their Tor and Non-Tor
activity.
     </para>
     <para>

Additionally, at this position the adversary can block Tor, or attempt to
recognize the traffic patterns of specific web pages at the entrance to the Tor
network.

     </para>
     </listitem>
     <listitem><command>Physical Access</command>
     <para>
Some users face adversaries with intermittent or constant physical access.
Users in Internet cafes, for example, face such a threat. In addition, in
countries where simply using tools like Tor is illegal, users may face
confiscation of their computer equipment for excessive Tor usage or just
general suspicion.
     </para>
     </listitem>
    </orderedlist>
   </sect2>

   <sect2 id="attacks">
    <title>Adversary Capabilities - Attacks</title>
    <para>

The adversary can perform the following attacks from a number of different
positions to accomplish various aspects of their goals. It should be noted
that many of these attacks (especially those involving IP address leakage) are
often performed by accident by websites that simply have JavaScript, dynamic
CSS elements, and plugins. Others are performed by ad servers seeking to
correlate users' activity across different IP addresses, and still others are
performed by malicious agents on the Tor network and at national firewalls.

    </para>
    <orderedlist>
     <listitem><command>Read and insert identifiers</command>
     <para>

The browser contains multiple facilities for storing identifiers that the
adversary creates for the purposes of tracking users. These identifiers are
most obviously cookies, but also include HTTP auth, DOM storage, cached
scripts and other elements with embedded identifiers, client certificates, and
even TLS Session IDs.

     </para>
     <para>

An adversary in a position to perform MITM content alteration can inject
document content elements to both read and inject cookies for arbitrary
domains. In fact, even many "SSL secured" websites are vulnerable to this sort of
<ulink url="http://seclists.org/bugtraq/2007/Aug/0070.html">active
sidejacking</ulink>. In addition, the ad networks of course perform tracking
with cookies as well.

     </para>
     <para>

These types of attacks are attempts at subverting our <link
664
linkend="identifier-linkability">Cross-Origin Identifier Unlinkability</link> and <link
Mike Perry's avatar
Mike Perry committed
665
linkend="new-identity">Long-Term Unlinkability</link> design requirements.

     </para>
     </listitem>
     <listitem id="fingerprinting"><command>Fingerprint users based on browser
attributes</command>
<para>

There is an absurd amount of information available to websites via attributes
of the browser. This information can be used to reduce the anonymity set, or
even uniquely fingerprint individual users. Attacks of this nature are
typically aimed at tracking users across sites without their consent, in an
attempt to subvert our <link linkend="fingerprinting-linkability">Cross-Origin
Fingerprinting Unlinkability</link> and <link
linkend="new-identity">Long-Term Unlinkability</link> design requirements.

</para>

<para>

Fingerprinting is an intimidating problem to attempt to tackle, especially
without a metric to determine or at least intuitively understand and estimate
which features will most contribute to linkability between visits.

</para>

<para>

The <ulink url="https://panopticlick.eff.org/about">Panopticlick study
done</ulink> by the EFF uses the <ulink
url="https://en.wikipedia.org/wiki/Entropy_%28information_theory%29">Shannon
entropy</ulink> - the number of identifying bits of information encoded in
browser properties - as this metric. Their <ulink
url="https://wiki.mozilla.org/Fingerprinting#Data">result data</ulink> is
definitely useful, and the metric is probably the appropriate one for
determining how identifying a particular browser property is. However, some
quirks of their study mean that they do not extract as much information as
they could from display information: they only use desktop resolution and do
not attempt to infer the size of toolbars. In the other direction, they may be
over-counting in some areas, as they did not compute joint entropy over
multiple attributes that may exhibit a high degree of correlation. Also, new
browser features are added regularly, so the data should not be taken as
final.

      </para>
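      <para>

As a rough illustration of how this metric behaves, the following Python
sketch computes the Shannon entropy of two browser attributes from a small,
entirely made-up sample of browsers, and compares the sum of the per-attribute
entropies with their joint entropy. The attribute names and values are
hypothetical and are not taken from the Panopticlick data; the point is only
that summing marginal entropies over-counts when attributes are correlated.

      </para>
<programlisting language="python"><![CDATA[
```python
import math
from collections import Counter


def shannon_entropy(observations):
    """Return the Shannon entropy, in bits, of a list of hashable observations."""
    counts = Counter(observations)
    total = len(observations)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical sample: (platform, font set) pairs for eight browsers.
# The attributes are strongly correlated: the font set largely follows the platform.
sample = [
    ("Win", "fonts-A"), ("Win", "fonts-A"), ("Win", "fonts-A"), ("Win", "fonts-B"),
    ("Mac", "fonts-C"), ("Mac", "fonts-C"),
    ("Linux", "fonts-D"), ("Linux", "fonts-D"),
]

platform_bits = shannon_entropy([platform for platform, _ in sample])
font_bits = shannon_entropy([fonts for _, fonts in sample])
joint_bits = shannon_entropy(sample)

print(f"platform:         {platform_bits:.3f} bits")
print(f"fonts:            {font_bits:.3f} bits")
print(f"sum of marginals: {platform_bits + font_bits:.3f} bits")
print(f"joint entropy:    {joint_bits:.3f} bits")
```
]]></programlisting>
      <para>

In this toy sample the font set already determines the platform, so the joint
entropy equals the font entropy alone and adding the platform bits on top
would overstate how identifying the pair is.

      </para>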
     <para>

Despite the uncertainty, all fingerprinting attacks leverage the following
attack vectors:

     </para>
     <orderedlist>
     <listitem><command>Observing Request Behavior</command>
     <para>

Properties of the user's request behavior comprise the bulk of low-hanging
fingerprinting targets. These include: User agent, Accept-* headers, pipeline
usage, and request ordering. Additionally, the use of custom filters such as
AdBlock and other privacy filters can be used to fingerprint request patterns
(as an extreme example).

     </para>
     </listitem>

     <listitem><command>Inserting JavaScript</command>
     <para>

JavaScript can reveal a lot of fingerprinting information. It provides DOM
objects such as window.screen and window.navigator to extract information
about the user agent.

Also, JavaScript can be used to query the user's timezone via the
<function>Date()</function> object, <ulink
url="https://www.khronos.org/registry/webgl/specs/1.0/#5.13">WebGL</ulink> can
reveal information about the video card in use, and high precision timing
information can be used to <ulink
url="https://cseweb.ucsd.edu/~hovav/dist/jspriv.pdf">fingerprint the CPU and
interpreter speed</ulink>. JavaScript features such as
<ulink url="https://www.w3.org/TR/resource-timing/">Resource Timing</ulink>
may leak an unknown amount of network-timing-related information. Moreover,
JavaScript is able to
<ulink url="https://seclab.cs.ucsb.edu/media/uploads/papers/sp2013_cookieless.pdf">
extract</ulink>
<ulink url="https://www.cosic.esat.kuleuven.be/fpdetective/">available</ulink>
<ulink url="https://hal.inria.fr/hal-01285470v2/document">fonts</ulink> on a
device with high precision.

     </para>
     </listitem>

     <listitem><command>Inserting Plugins</command>
     <para>

The Panopticlick project found that the mere list of installed plugins (in
navigator.plugins) was sufficient to provide a large degree of
fingerprintability. Additionally, plugins are capable of extracting font lists,
interface addresses, and other machine information that is beyond what the
browser would normally provide to content. In addition, plugins can be used to
store unique identifiers that are more difficult to clear than standard
cookies.  <ulink url="https://epic.org/privacy/cookies/flash.html">Flash-based
cookies</ulink> fall into this category, but there are likely numerous other
examples. Beyond fingerprinting, plugins are also abysmal at obeying the proxy
settings of the browser.


     </para>
     </listitem>
     <listitem><command>Inserting CSS</command>
     <para>

<ulink url="https://developer.mozilla.org/En/CSS/Media_queries">CSS media
queries</ulink> can be inserted to gather information about the desktop size,
widget size, display type, DPI, user agent type, and other information that
was formerly available only to JavaScript.

     </para>
     </listitem>
     </orderedlist>
     </listitem>
     <listitem id="website-traffic-fingerprinting"><command>Website traffic fingerprinting</command>
     <para>

Website traffic fingerprinting is an attempt by the adversary to recognize the
encrypted traffic patterns of specific websites. In the case of Tor, this
attack would take place between the user and the Guard node, or at the Guard
node itself.
     </para>

     <para> The most comprehensive study of the statistical properties of this
attack against Tor was done by <ulink
url="https://lorre.uni.lu/~andriy/papers/acmccs-wpes11-fingerprinting.pdf">Panchenko
et al.</ulink>. Unfortunately, the publication bias in academia has encouraged
the production of
<ulink url="https://blog.torproject.org/blog/critique-website-traffic-fingerprinting-attacks">a
number of follow-on attack papers claiming "improved" success rates</ulink>, in
some cases even claiming to completely invalidate any attempt at defense. These
"improvements" are actually enabled primarily by taking a number of shortcuts
(such as classifying only very small numbers of web pages, neglecting to publish
ROC curves or at least false positive rates, and/or omitting the effects of
dataset size on their results). Despite these subsequent "improvements", we are
skeptical of the efficacy of this attack in a real world scenario,
<emphasis>especially</emphasis> in the face of any defenses.

     </para>
     <para>

In general, with machine learning, as you increase the <ulink
url="https://en.wikipedia.org/wiki/VC_dimension">number and/or complexity of
categories to classify</ulink> while maintaining a limit on reliable feature
information you can extract, you eventually run out of descriptive feature
information, and either true positive accuracy goes down or the false positive
rate goes up. This error is called the <ulink
url="https://www.cs.washington.edu/education/courses/csep573/98sp/lectures/lecture8/sld050.htm">bias
in your hypothesis space</ulink>. In fact, even for unbiased hypothesis
spaces, the number of training examples required to achieve a reasonable error
bound is <ulink
url="https://en.wikipedia.org/wiki/Probably_approximately_correct_learning#Equivalence">a
function of the complexity of the categories</ulink> you need to classify.

     </para>
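     <para>

For intuition, the dependence on hypothesis-space complexity can be seen in
the textbook PAC bound for the simplest case: a finite hypothesis class H and
a learner that is consistent with its training data. There, a number of
examples of at least (ln|H| + ln(1/delta)) / epsilon suffices for error at
most epsilon with probability at least 1 - delta. The Python sketch below
evaluates this bound. The class sizes, epsilon, and delta are arbitrary values
chosen for illustration, and real traffic classifiers do not fit this
idealized setting.

     </para>
<programlisting language="python"><![CDATA[
```python
import math


def pac_sample_bound(hypothesis_count, epsilon, delta):
    """Examples sufficient for a consistent learner over a finite hypothesis
    class to reach error at most epsilon with probability at least 1 - delta."""
    return math.ceil((math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon)


# Illustrative class sizes only; they are not derived from any traffic model.
for hypothesis_count in (10**3, 10**6, 10**9):
    needed = pac_sample_bound(hypothesis_count, epsilon=0.01, delta=0.05)
    print(f"|H| = {hypothesis_count:>10}: {needed} examples")
```
]]></programlisting>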
      <para>


In the case of this attack, the key factors that increase the classification
complexity (and thus hinder a real world adversary who attempts this attack)
are large numbers of dynamically generated pages, partially cached content,
and also the non-web activity of the entire Tor network. This yields an
effective number of "web pages" many orders of magnitude larger than even <ulink
url="https://lorre.uni.lu/~andriy/papers/acmccs-wpes11-fingerprinting.pdf">Panchenko's
"Open World" scenario</ulink>, which suffered continuous near-constant decline
in the true positive rate as the "Open World" size grew (see figure 4). This
large level of classification complexity is further confounded by a noisy and
low resolution featureset - one which is also relatively easy for the defender
to manipulate at low cost.

     </para>
     <para>

To make matters worse for a real-world adversary, the ocean of Tor Internet
activity (at least, when compared to a lab setting) makes it a certainty that
an adversary attempting to examine large amounts of Tor traffic will ultimately
be overwhelmed by false positives (even after making heavy tradeoffs on the
ROC curve to minimize false positives to below 0.01%). This problem is known
in the IDS literature as the <ulink
url="http://www.raid-symposium.org/raid99/PAPERS/Axelsson.pdf">Base Rate
Fallacy</ulink>, and it is the primary reason that anomaly and activity
classification-based IDS and antivirus systems have failed to materialize in
the marketplace (despite early success in academic literature).

     </para>
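     <para>

The base rate effect can be made concrete with a short calculation. The Python
sketch below applies Bayes' theorem to a classifier with the 0.01% false
positive rate mentioned above. The true positive rate and the fraction of
traces that actually belong to a monitored page are assumed values chosen
purely for illustration, not measurements.

     </para>
<programlisting language="python"><![CDATA[
```python
def precision(true_positive_rate, false_positive_rate, base_rate):
    """Probability that a flagged trace really is a monitored page (Bayes' theorem)."""
    true_alarms = true_positive_rate * base_rate
    false_alarms = false_positive_rate * (1.0 - base_rate)
    return true_alarms / (true_alarms + false_alarms)


FALSE_POSITIVE_RATE = 0.0001     # 0.01%, the figure quoted in the text
TRUE_POSITIVE_RATE = 0.90        # assumed for illustration
BASE_RATES = [1e-2, 1e-4, 1e-6]  # assumed fractions of traces hitting a monitored page

for base_rate in BASE_RATES:
    share = precision(TRUE_POSITIVE_RATE, FALSE_POSITIVE_RATE, base_rate)
    print(f"base rate {base_rate:.0e}: {share:.1%} of alarms are correct")
```
]]></programlisting>
     <para>

Under these assumptions, once only about one trace in ten thousand belongs to
a monitored page, more than half of all alarms are false, and the share of
correct alarms keeps shrinking as the base rate falls.

     </para>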
     <para>

Still, we do not believe that these issues are enough to dismiss the attack
outright. But we do believe these factors make it both worthwhile and
effective to <link linkend="traffic-fingerprinting-defenses">deploy
light-weight defenses</link> that reduce the accuracy of this attack by
further contributing noise to hinder successful feature extraction.

     </para>
     </listitem>
     <listitem><command>Remotely or locally exploit browser and/or
OS</command>
     <para>

Last, but definitely not least, the adversary can exploit either general
browser vulnerabilities, plugin vulnerabilities, or OS vulnerabilities to
install malware and surveillance software. An adversary with physical access
can perform similar actions.

    </para>
    <para>

For the purposes of the browser itself, we limit the scope of this adversary
to one that has passive forensic access to the disk after browsing activity
has taken place. This adversary motivates our
<link linkend="disk-avoidance">Disk Avoidance</link> defenses.

    </para>
    <para>

An adversary with arbitrary code execution typically has more power, though.
It can be quite hard to significantly limit the capabilities of such an
adversary. <ulink
url="https://tails.boum.org/contribute/design/">The Tails system</ulink> can
provide some defense against this adversary through the use of read-only media
and frequent reboots, but even this can be circumvented on machines without
Secure Boot through the use of BIOS rootkits.

     </para>
     </listitem>
    </orderedlist>
   </sect2>

</sect1>

<sect1 id="Implementation">
  <title>Implementation</title>
  <para>

The Implementation section is divided into subsections, each of which
corresponds to a <link linkend="DesignRequirements">Design Requirement</link>.
Each subsection is divided into specific web technologies or properties. The
implementation is then described for that property.

  </para>
  <para>

In some cases, the implementation meets the design requirements in a non-ideal
way (for example, by disabling features). In rare cases, there may be no
implementation at all. Both of these cases are denoted by differentiating
between the <command>Design Goal</command> and the <command>Implementation
Status</command> for each property. Corresponding bugs in the <ulink
url="https://trac.torproject.org/projects/tor/report">Tor bug tracker</ulink>
are typically linked for these cases.

  </para>
  <sect2 id="proxy-obedience">
   <title>Proxy Obedience</title>
   <para>

Proxy obedience is assured through the following:
   </para>
<orderedlist>
 <listitem><command>Firefox proxy settings, patches, and build flags</command>
 <para>

Our <ulink
url="https://gitweb.torproject.org/tor-browser.git/tree/browser/app/profile/000-tor-browser.js?h=tor-browser-52.5.2esr-7.0-2">Firefox
preferences file</ulink> sets the Firefox proxy settings to use Tor directly
as a SOCKS proxy. It sets <command>network.proxy.socks_remote_dns</command>,
<command>network.proxy.socks_version</command>,
<command>network.proxy.socks_port</command>, and
<command>network.dns.disablePrefetch</command>.

 </para>
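 <para>

As a sketch of how these settings could be checked mechanically, the Python
snippet below parses <command>pref("name", value);</command> lines and reports
which of the four preferences named above are missing or differ from an
expected value. The helper names, the sample text, and the expected values
(remote DNS enabled, SOCKS version 5, port 9150, prefetch disabled) are
assumptions made for this illustration; the preferences file itself remains
the authoritative source.

 </para>
<programlisting language="python"><![CDATA[
```python
import re

# Expected values are assumptions for this sketch; the text names the
# preferences but does not state their values.
EXPECTED = {
    "network.proxy.socks_remote_dns": "true",
    "network.proxy.socks_version": "5",
    "network.proxy.socks_port": "9150",
    "network.dns.disablePrefetch": "true",
}

# Matches lines of the form: pref("some.name", value);
PREF_LINE = re.compile(r'^\s*pref\("([^"]+)",\s*([^)]+?)\s*\);')


def parse_prefs(text):
    """Collect pref("name", value); lines into a dict of name -> raw value string."""
    prefs = {}
    for line in text.splitlines():
        match = PREF_LINE.match(line)
        if match:
            prefs[match.group(1)] = match.group(2)
    return prefs


def audit(text):
    """Return the expected pref names that are missing or set to another value."""
    prefs = parse_prefs(text)
    return sorted(name for name, value in EXPECTED.items() if prefs.get(name) != value)


# Hypothetical excerpt that deliberately omits network.dns.disablePrefetch.
SAMPLE = '''
pref("network.proxy.socks_remote_dns", true);
pref("network.proxy.socks_version", 5);
pref("network.proxy.socks_port", 9150);
'''

print(audit(SAMPLE))
```
]]></programlisting>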
 <para>

To prevent proxy bypass by WebRTC calls, we disable WebRTC at compile time
with the <command>--disable-webrtc</command> configure switch and also set
the pref <command>media.peerconnection.enabled</command> to false.

 </para>
 <para>

We also patch Firefox in order to provide several defense-in-depth mechanisms
for proxy safety. Notably, we <ulink
url="https://gitweb.torproject.org/tor-browser.git/commit/?h=tor-browser-52.5.2esr-7.0-2&amp;id=35ce9974e034c0374fb3c8e00e9eb0231c4f3378">patch
the DNS service</ulink> to prevent any browser or addon DNS resolution, and we
also <ulink url="https://gitweb.torproject.org/tor-browser.git/commit/?h=tor-browser-52.5.2esr-7.0-2&amp;id=ee28d8f27fdb1e47481987535c7da70095042ee2">
remove the DNS lookup for the profile lock signature</ulink>. Furthermore, we
<ulink
url="https://gitweb.torproject.org/tor-browser.git/commit/?h=tor-browser-52.5.2esr-7.0-2&amp;id=ffba8d1b84431b4024d5012b326cbcb986047f27">patch
OCSP and PKIX code</ulink> to prevent any use of the non-proxied command-line
tool utility functions from being functional while linked in to the browser.
In both cases, we could find no direct paths to these routines in the browser,
but it seemed better safe than sorry.

 </para>

 <para>

For further defense-in-depth we disable WebIDE because it can bypass proxy
settings for remote debugging, and also because it downloads extensions we
have not reviewed. We do this by setting
<command>devtools.webide.autoinstallADBHelper</command>,
<command>devtools.webide.autoinstallFxdtAdapters</command>,
<command>devtools.webide.enabled</command>, and
<command>devtools.appmanager.enabled</command> to <command>false</command>.
Moreover, we removed the Roku Screen Sharing and screencaster code with a
<ulink url="https://gitweb.torproject.org/tor-browser.git/commit/?h=tor-browser-52.5.2esr-7.0-2&amp;id=055bdffbef68bc8d5e8005b3c7dd2f5d99da1163">
Firefox patch</ulink> as these features can bypass proxy settings as well.
 </para>

 <para>
Further down on our road to proxy safety we <ulink url="https://gitweb.torproject.org/tor-browser.git/commit/?h=tor-browser-52.5.2esr-7.0-2&amp;id=7222d02638689a64d7297b8e5c202f9c37547523">
disable the network tickler</ulink> as it has the capability to send UDP
traffic and we <ulink url="https://gitweb.torproject.org/tor-browser.git/commit/?h=tor-browser-52.5.2esr-7.0-2&amp;id=5bc957b4f635a659f9aecaa374972ecca7f770a8">
disable mDNS support</ulink>, since mDNS uses UDP packets as well. We also disable
Mozilla's TCPSocket by setting
<command>dom.mozTCPSocket.enabled</command> to <command>false</command>. We
<ulink url="https://trac.torproject.org/projects/tor/ticket/18866">intend to
rip out</ulink> the TCPSocket code in the future to have an even more solid
guarantee that it won't be used by accident.
 </para>

 <para>
Finally, we <ulink url="https://gitweb.torproject.org/tor-browser.git/commit/?h=tor-browser-52.5.2esr-7.0-2&amp;id=55bd129f081bd37ae9e72ae32434fbb56ff4e446">
remove</ulink> potentially unsafe Rust code.
 </para>

 <para>
During every Extended Support Release transition, we perform <ulink
url="https://gitweb.torproject.org/tor-browser-spec.git/tree/audits">in-depth
code audits</ulink> to verify that there are no system calls or XPCOM
activity in the source tree that do not use the browser proxy settings.