# OnionPerf

  * [Overview](#overview)
    + [What does OnionPerf do?](#what-does-onionperf-do-)
    + [What does OnionPerf *not* do?](#what-does-onionperf--not--do-)
  * [Installation](#installation)
    + [Tor](#tor)
    + [TGen](#tgen)
    + [OnionPerf](#onionperf-1)
  * [Measurement](#measurement)
    + [Starting and stopping measurements](#starting-and-stopping-measurements)
    + [Output directories and files](#output-directories-and-files)
    + [Changing Tor configurations](#changing-tor-configurations)
    + [Changing the TGen traffic model](#changing-the-tgen-traffic-model)
    + [Sharing measurement results](#sharing-measurement-results)
    + [Troubleshooting](#troubleshooting)
  * [Analysis](#analysis)
    + [Analyzing measurement results](#analyzing-measurement-results)
    + [Filtering measurement results](#filtering-measurement-results)
    + [Visualizing measurement results](#visualizing-measurement-results)
    + [Interpreting the PDF output format](#interpreting-the-pdf-output-format)
    + [Interpreting the CSV output format](#interpreting-the-csv-output-format)
    + [Visualizations on Tor Metrics](#visualizations-on-tor-metrics)
  * [Contributing](#contributing)

## Overview

### What does OnionPerf do?

OnionPerf measures performance of bulk file downloads over Tor. Together with its predecessor, Torperf, OnionPerf has been used to measure long-term performance trends in the Tor network since 2009. It is also being used to perform short-term performance experiments to compare different Tor configurations or implementations.

OnionPerf uses multiple processes and threads to download random data through Tor while tracking the performance of those downloads. The data is served and fetched on localhost using two TGen (traffic generator) processes, and is transferred through Tor using Tor client processes and an ephemeral Tor onion service. Tor control information and TGen performance statistics are logged to disk and analyzed once per day to produce a JSON analysis file that can later be used to visualize changes in Tor client performance over time.

### What does OnionPerf *not* do?

OnionPerf does not attempt to simulate complex traffic patterns like a web-browsing user or a voice-chatting user. It measures a very specific user model: a bulk 5 MiB file download over Tor.

OnionPerf does not interfere with how Tor selects paths and builds circuits, other than setting configuration values as specified by the user. As a result, it cannot be used to measure specific relays or to scan the entire Tor network.

## Installation

OnionPerf requires several dependencies to perform measurements and to analyze and visualize measurement results. These dependencies include Tor, TGen (traffic generator), and a few Python packages.

The following description was written with a Debian system in mind but should be transferable to other Linux distributions and possibly even other operating systems.

### Tor

OnionPerf relies on the `tor` binary to start a Tor process on the client side to make client requests and another Tor process on the server side to host onion services.

The easiest way to satisfy this dependency is to install the `tor` package, which puts the `tor` binary into the `PATH` where OnionPerf will find it. Optionally, systemd can be instructed to make sure that `tor` is never started as a service:

```shell
sudo apt install tor
sudo systemctl stop tor.service
sudo systemctl mask tor.service
```

Alternatively, Tor can be built from source:

```shell
sudo apt install automake build-essential libevent-dev libssl-dev zlib1g-dev
cd ~/
git clone https://git.torproject.org/tor.git
cd tor/
./autogen.sh
./configure --disable-asciidoc
make
```

In this case the resulting `tor` binary can be found in `~/tor/src/app/tor` and needs to be passed to OnionPerf's `--tor` parameter when doing measurements.

### TGen

OnionPerf uses TGen to generate traffic on client and server side for its measurements. Installing dependencies, cloning TGen to a subdirectory in the user's home directory, and building TGen is done as follows:

```shell
sudo apt install cmake libglib2.0-dev libigraph0-dev make
cd ~/
git clone https://github.com/shadow/tgen.git
cd tgen/
mkdir build
cd build/
cmake ..
make
```

The TGen binary will be contained in `~/tgen/build/src/tgen`, which is also the path that needs to be passed to OnionPerf's `--tgen` parameter when doing measurements.

### OnionPerf

OnionPerf is written in Python 3. The following instructions assume that a Python virtual environment is being used, although installation is also possible without one.

The virtual environment is created, activated, and tested using:

```shell
sudo apt install python3-venv
cd ~/
python3 -m venv venv
source venv/bin/activate
which python3
```

The last command should output something like `~/venv/bin/python3` as the path to the `python3` binary used in the virtual environment.

The next step is to clone the OnionPerf repository and install its requirements:

```shell
git clone https://git.torproject.org/onionperf.git
pip3 install --no-cache -r onionperf/requirements.txt
```

The final step is to install OnionPerf and print out the usage information to see if the installation was successful:

```shell
cd onionperf/
python3 setup.py install
cd ~/
onionperf --help
```

The virtual environment is deactivated with the following command:

```shell
deactivate
```

However, in order to perform measurements or analyses, the virtual environment needs to be activated first. This will ensure all the paths are found.

If needed, unit tests are run with the following command:

```shell
cd ~/onionperf/
python3 -m nose --with-coverage --cover-package=onionperf
```

## Measurement

Performing measurements with OnionPerf is done by starting an `onionperf` process that itself starts several other processes and keeps running until it is interrupted by the user. During this time it performs new measurements every 5 minutes and logs measurement results to files.

Except for the simplest test runs, OnionPerf is ideally run detached from the terminal session using tmux, systemd, or similar. The specifics for using these tools are not covered in this document.

### Starting and stopping measurements

The most trivial configuration is to measure onion services only. In that case, OnionPerf runs without needing any additional configuration. For direct measurements via exit nodes, firewall rules or port forwarding may be required to allow inbound connections to the TGen server.

Starting these measurements is as simple as:

```shell
cd ~/
onionperf measure --onion-only --tgen ~/tgen/build/src/tgen --tor ~/tor/src/app/tor
```

OnionPerf logs its main output on the console and then waits indefinitely until the user presses `CTRL-C` for graceful shutdown. It does not, however, print out measurement results or progress on the console, just a heartbeat message every hour.

OnionPerf's `measure` mode has several command-line parameters for customizing measurements. See the following command for usage information:

```shell
onionperf measure --help
```

### Output directories and files

OnionPerf writes several files to two subdirectories in the current working directory while doing measurements:

- `onionperf-data/` is the main directory containing measurement results.
  - `htdocs/` is created at the first UTC midnight after starting and contains measurement analysis result files that can be shared via a local web server.
    - `$date.onionperf.analysis.json.xz` contains extracted metrics in OnionPerf's analysis JSON format.
    - `index.xml` contains a directory index with file names, sizes, last-modified times, and SHA-256 digests.
  - `tgen-client/` is the working directory of the client-side `tgen` process.
    - `log_archive/` is created at the first UTC midnight after starting and contains compressed log files from previous UTC days.
    - `onionperf.tgen.log` is the current log file.
    - `tgen.graphml.xml` is the traffic model file generated by OnionPerf and used by TGen.
  - `tgen-server/` is the working directory of the server-side `tgen` process with the same structure as `tgen-client/`.
  - `tor-client/` is the working directory of the client-side `tor` process.
    - `log_archive/` is created at the first UTC midnight after starting and contains compressed log files from previous UTC days.
    - `onionperf.tor.log` is the current log file containing log messages by the client-side `tor` process.
    - `onionperf.torctl.log` is the current log file containing controller events obtained by OnionPerf connecting to the control port of the client-side `tor` process.
    - `[...]` (several other files written by the client-side `tor` process to its data directory)
  - `tor-server/` is the working directory of the server-side `tor` process with the same structure as `tor-client/`.
- `onionperf-private/` contains private keys of the onion services used for measurements and potentially other files that are not meant to be published together with measurement results.
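
When scripting around a measurement host, it can help to check that the working directories listed above are present. The following is a minimal sketch and not part of OnionPerf; the function name `missing_subdirs` is hypothetical, and `htdocs/` is deliberately left out because it only appears after the first UTC midnight.

```python
from pathlib import Path

# Working directories named in the listing above. htdocs/ is omitted because
# it is only created at the first UTC midnight after starting.
EXPECTED_SUBDIRS = ["tgen-client", "tgen-server", "tor-client", "tor-server"]


def missing_subdirs(data_dir):
    """Return the expected subdirectories that are absent from data_dir."""
    base = Path(data_dir)
    return [name for name in EXPECTED_SUBDIRS if not (base / name).is_dir()]
```

Calling `missing_subdirs("onionperf-data")` from the directory in which `onionperf measure` was started returns an empty list if all four working directories exist.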

### Changing Tor configurations

OnionPerf generates Tor configurations for both client-side and server-side `tor` processes. There are a few ways to add Tor configuration lines:

- If the `BASETORRC` environment variable is set, OnionPerf appends its own configuration options to the contents of that variable. Example:

  ```shell
  BASETORRC=$'Option1 Foo\nOption2 Bar\n' onionperf ...
  ```

- If the `--torclient-conf-file` and/or `--torserver-conf-file` command-line arguments are given, the contents of those files are appended to the configurations of the client-side and/or server-side `tor` processes.
- If the `--additional-client-conf` command-line argument is given, its content is appended to the configuration of the client-side `tor` process.

These options can be used, for example, to change the default measurement setup to use bridges (or pluggable transports) by passing bridge addresses as additional client configuration lines as follows:

```shell
onionperf measure --additional-client-conf="UseBridges 1\nBridge 72.14.177.231:9001 AC0AD4107545D4AF2A595BC586255DEA70AF119D\nBridge 195.91.239.8:9001 BA83F62551545655BBEBBFF353A45438D73FD45A\nBridge 148.63.111.136:35577 768C8F8313FF9FF8BBC915898343BC8B238F3770"
```

### Changing the TGen traffic model

OnionPerf is a relatively simple tool that can be adapted to do more complex measurements beyond what can be configured on the command line.

For example, the hard-coded traffic model generated by OnionPerf and executed by the TGen processes is to send a small request from client to server and receive a relatively large response of 5 MiB of random data back. This model can be changed by editing `~/onionperf/onionperf/model.py`, rebuilding, and restarting measurements. For specifics, see the [TGen documentation](https://github.com/shadow/tgen/blob/master/doc/TGen-Overview.md) and [TGen traffic model examples](https://github.com/shadow/tgen/blob/master/tools/scripts/generate_tgen_config.py).

### Sharing measurement results

Measurement results can be further analyzed and visualized on the measuring host. In many cases, however, it is more convenient to do analysis and visualization on another host, for example to compare measurements from different hosts with each other.

There are at least two common ways of sharing measurement results:

1. Creating a tarball of the `onionperf-data/` directory; and
2. Using a local web server to serve the contents of the `onionperf-data/` directory.

The details of either method are not covered in this document.
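
As a rough illustration of the first option only, a tarball of the results directory could be created with Python's standard `tarfile` module. This is a sketch under assumptions, not an OnionPerf feature: the function name `make_results_tarball` and the choice of xz compression are hypothetical.

```python
import tarfile
from pathlib import Path


def make_results_tarball(data_dir, tarball_path):
    """Pack data_dir (for example onionperf-data/) into an xz-compressed tarball."""
    data_dir = Path(data_dir)
    with tarfile.open(tarball_path, "w:xz") as tar:
        # arcname keeps only the directory name, so the archive does not
        # contain the absolute path of the measuring host.
        tar.add(str(data_dir), arcname=data_dir.name)
    return tarball_path
```

For example, `make_results_tarball("onionperf-data", "onionperf-data.tar.xz")` packs the results directory. Because `onionperf-private/` is a sibling directory rather than a subdirectory, it is not included in the archive.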

### Troubleshooting

If anything goes wrong while doing measurements, OnionPerf typically informs the user in its console output. This is also the first place to look for investigating any issues.

The second place would be to check the log files in `~/onionperf-data/tgen-client/` or `~/onionperf-data/tor-client/`.

The most common configuration problems are probably related to firewall and port forwarding for doing direct (non onion-service) measurements. The specifics for setting up the firewall are out of scope for this document.

Another common class of issues in long-running measurements is that one of the `tgen` or `tor` processes dies. The reasons, or at least hints, can hopefully be found in the respective log files.

In order to avoid extended downtimes it is recommended to deploy monitoring tools that check whether measurement results produced by OnionPerf are fresh. The specifics are, again, out of scope for this document.
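
One possible shape for such a freshness check, sketched in Python, is to compare the newest file modification time under a results directory against a maximum age. The function names and the threshold are assumptions made for illustration; OnionPerf itself does not ship this check.

```python
import time
from pathlib import Path


def newest_mtime(directory):
    """Return the newest modification time (epoch seconds) of any file
    below directory, or None if there are no files."""
    mtimes = [p.stat().st_mtime for p in Path(directory).rglob("*") if p.is_file()]
    return max(mtimes) if mtimes else None


def results_are_fresh(directory, max_age_seconds, now=None):
    """Return True if some file below directory was modified within
    the last max_age_seconds."""
    newest = newest_mtime(directory)
    if newest is None:
        return False
    now = time.time() if now is None else now
    return (now - newest) <= max_age_seconds
```

A monitoring job could, for instance, call `results_are_fresh("onionperf-data/htdocs", 26 * 3600)`. The 26-hour threshold is only a guess that allows some slack around the once-per-day analysis at UTC midnight.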

## Analysis

The next steps after performing measurements are to analyze and optionally visualize measurement results.

### Analyzing measurement results

While performing measurements, OnionPerf writes quite verbose log files to disk. The first step in the analysis is to parse these log files, extract key metrics, and write smaller and more structured measurement results to disk. This is done with OnionPerf's `analyze` mode.

For example, the following command analyzes current log files of a running (or stopped) OnionPerf instance (as opposed to log-rotated, compressed files from previous days):

```shell
onionperf analyze --tgen ~/onionperf-data/tgen-client/onionperf.tgen.log --torctl ~/onionperf-data/tor-client/onionperf.torctl.log
```

The output analysis file is written to `onionperf.analysis.json.xz` in the current working directory. The file format is described in more detail in `schema/onionperf-3.0.json`.
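
Because the analysis file is xz-compressed JSON, it can be inspected with Python's standard library alone. The sketch below only decompresses and parses the file; it makes no assumptions about the keys inside, which are defined by the schema file mentioned above. The function name `load_analysis` is hypothetical.

```python
import json
import lzma


def load_analysis(path):
    """Decompress and parse an xz-compressed JSON analysis file."""
    with lzma.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)
```

For example, `load_analysis("onionperf.analysis.json.xz")` returns the parsed document, which can then be explored interactively.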

The same analysis files are written automatically as part of ongoing measurements once per day at UTC midnight and can be found in `onionperf-data/htdocs/`.

OnionPerf's `analyze` mode has several command-line parameters for customizing the analysis step:

```shell
onionperf analyze --help
```
### Filtering measurement results

The `filter` subcommand is typically used in combination with the `visualize` subcommand. The workflow is to filter out any Tor streams/circuits that are not desired, and then to visualize only those measurements with an existing mapping between TGen transfers/streams and Tor streams/circuits.

Currently, OnionPerf measurement results can be filtered based on Tor relay fingerprints, although support for filtering based on TGen transfers/streams may be added in the future.

The `filter` mode takes a list of fingerprints and one or more existing analysis files as inputs, and outputs new analysis files. Tor results obtained over a circuit path that includes (or excludes) fingerprints from the input list are retained unchanged. All other Tor results are also included in the file, but are marked as 'filtered\_out'.
Filter metadata detailing the filter type and the path to the input list used is also included in the analysis file.

For example, the analysis file produced above can be filtered with the following command, which retains measurements based on fingerprints contained in the file `fingerprints.txt`:

```shell
onionperf filter -i onionperf.analysis.json.xz -o filtered.onionperf.analysis.json.xz --include-fingerprints fingerprints.txt
```

The output analysis file is written to the path specified with `-o`. If processing a directory of analysis files, its structure and filenames are preserved under the path specified with `-o`.
Note that while the subcommand filters `tgen` measurements, it leaves `tgen` and `tor` summaries in the original analysis file unchanged.

OnionPerf's `filter` command usage can be inspected with:

```shell
onionperf filter --help
```


### Visualizing measurement results

Step two in the analysis is to process analysis files with OnionPerf's `visualize` mode, which produces CSV and PDF files as output.

For example, the analysis file produced above can be visualized with the following command, using "Test Measurements" as label for the data set:

```shell
onionperf visualize --data onionperf.analysis.json.xz "Test Measurements"
```

As a result, two files are written to the current working directory:

- `onionperf.viz.$datetime.csv` contains visualized data in a CSV file format; and
- `onionperf.viz.$datetime.pdf` contains visualizations in a PDF file format.

For analysis files containing Tor circuit filters, only measurements with an existing mapping between TGen transfers/streams and Tor streams/circuits which have not been marked as 'filtered\_out' are visualized.

Similar to the other modes, OnionPerf's `visualize` mode has command-line parameters for customizing the visualization step:

```shell
onionperf visualize --help
```

### Interpreting the PDF output format

The PDF output file contains visualizations of the following metrics:

- Time to download first (last) byte, which is defined as elapsed time between starting a measurement and receiving the first (last) byte of the HTTP response.
- Throughput, which is computed from the elapsed time between receiving 0.5 and 1 MiB of the response.
- Number of downloads.
- Number and type of failures.
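
The throughput definition above amounts to dividing the 0.5 MiB transferred between the two marks by the time that transfer took. A minimal sketch of that arithmetic follows; the function name and the choice of Mbit/s as the output unit are assumptions, and the unit shown in the actual plots may differ.

```python
MIB = 1024 * 1024  # bytes in one MiB


def throughput_mbit_per_s(seconds_between_half_and_one_mib):
    """Throughput over the span between receiving 0.5 MiB and 1 MiB.

    The argument is the elapsed time in seconds between those two marks.
    The result is in Mbit/s (10**6 bits per second), an assumed unit.
    """
    if seconds_between_half_and_one_mib <= 0:
        raise ValueError("elapsed time must be positive")
    bits = 0.5 * MIB * 8  # 0.5 MiB is transferred between the two marks
    return bits / seconds_between_half_and_one_mib / 1e6
```

For example, if the second half-MiB mark is reached 2 seconds after the first, the function returns about 2.1 Mbit/s (4,194,304 bits over 2 seconds).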

### Interpreting the CSV output format

The CSV output file contains the same data that is visualized in the PDF file. It contains the following columns:

- `id` is the identifier used in the TGen client logs which may be useful to look up more details about a specific measurement.
- `error_code` is an optional error code if a measurement did not succeed.
- `filesize_bytes` is the requested file size in bytes.
- `label` is the data set label as given in the `--data/-d` parameter to the `visualize` mode.
- `server` is set to either `onion` for onion service measurements or `public` for direct measurements.
- `start` is the measurement start time.
- `time_to_first_byte` is the time in seconds (with microsecond precision) to download the first byte.
- `time_to_last_byte` is the time in seconds (with microsecond precision) to download the last byte.
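
The columns above are enough for simple custom summaries outside of OnionPerf. The sketch below computes the median `time_to_last_byte` per `server` value using only the standard library. It assumes that the CSV file has a header row with exactly these column names and that unsuccessful measurements have a non-empty `error_code` or an empty `time_to_last_byte`; the function name is hypothetical.

```python
import csv
import statistics
from collections import defaultdict


def median_ttlb_by_server(csv_path):
    """Return the median time_to_last_byte per server type,
    skipping measurements that did not complete."""
    values = defaultdict(list)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            # Assumption: failed measurements carry an error code or lack
            # a last-byte time, so they are excluded from the median.
            if row.get("error_code") or not row.get("time_to_last_byte"):
                continue
            values[row["server"]].append(float(row["time_to_last_byte"]))
    return {server: statistics.median(times) for server, times in values.items()}
```

The result is a dictionary keyed by the values found in the `server` column, for example `onion` and `public`.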

### Visualizations on Tor Metrics

The analysis and visualization steps above can all be done using the OnionPerf tool. In addition, it is possible to visualize OnionPerf analysis files using other tools.

For example, the [Tor Metrics website](https://metrics.torproject.org/torperf.html) contains various graphs based on OnionPerf data.

## Contributing

The OnionPerf code is developed at https://gitlab.torproject.org/tpo/metrics/onionperf.

Contributions to OnionPerf are welcome and encouraged!