A caching service is a set of reverse proxies that keep a smaller cache of content in memory to speed up access to resources on a slower backend web server.
WARNING: This service was retired in early 2022 and this documentation is now outdated. It is kept for historical purposes.
- Tutorial
- How-to
- Reference
- Discussion
Tutorial
To inspect the current cache hit ratio, head over to the cache health dashboard in howto/grafana. It should be at least 75% and generally over or close to 90%.
How-to
Traffic inspection
A quick way to see how much traffic is flowing through the cache is to fire up slurm on the public interface of one of the caching servers (currently cache01 and cache-02):
slurm -i eth0
This will display a realtime graph of the traffic going in and out of the server. It should be below 1Gbit/s (or around 120MB/s).
Another way to see throughput is to use iftop, in a similar way:
iftop -i eth0 -n
This will show per-host traffic statistics, which might allow pinpointing possible abusers. Hit the L key to turn on the logarithmic scale, without which the display quickly becomes unreadable.
Log files are in /var/log/nginx (although those might eventually go away, see ticket #32461). The lnav program can be used to show those log files in a pretty way and do extensive queries on them. Hit the i key to flip to the "histogram" view and z multiple times to zoom all the way into a per-second hit rate view. Hit q to go back to the normal view, which is useful to inspect individual hits and diagnose why they fail to be cached, for example.
An immediate hit ratio can be extracted from lnav thanks to our custom log parser shipped through Puppet. Load the log file in lnav:
lnav /var/log/nginx/ssl.blog.torproject.org.access.log
then hit ; to enter the SQL query mode and issue this query:
SELECT count(*), upstream_cache_status FROM logline WHERE status_code < 300 GROUP BY upstream_cache_status;
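When lnav isn't available, a rough tally of cache statuses can be computed straight from the log file with awk. This is a sketch: it assumes the "cacheprivacy" log format described in the Design section below, where upstream_cache_status is the second-to-last field.

```shell
# Rough hit-ratio tally without lnav. Assumes the "cacheprivacy"
# format, where upstream_cache_status is the second-to-last field.
LOG=${LOG:-/var/log/nginx/ssl.blog.torproject.org.access.log}
awk '{ n[$(NF-1)]++; total++ }
     END { for (s in n) printf "%-10s %6d (%.1f%%)\n", s, n[s], 100 * n[s] / total }' "$LOG"
```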
See also howto/logging for more information about lnav.
Pager playbook
The only monitoring for this service is to ensure the proper number of nginx processes are running. If this gets triggered, the fix might be to just restart nginx:
service nginx restart
... although it might be a sign of a deeper issue requiring further traffic inspection.
Disaster recovery
In case of fire, head to the torproject.org zone in the dns/domains repository and flip the DNS record of the affected service back to the backend. See ticket #32239 for details on that.
TODO: disaster recovery could be improved. How to deal with DDOS? Memory, disk exhaustion? Performance issues?
Reference
Installation
Include roles::cache in Puppet.
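Concretely, assigning the role to a new machine looks something like the following (a sketch; the node name here is hypothetical):

```puppet
# site.pp sketch: assign the cache role to a (hypothetical) new node
node 'cache-03.torproject.org' {
  include roles::cache
}
```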
TODO: document how to add new sites in the cache. See ticket #32462 for that project.
SLA
The service should generally stay online as much as possible, because it fronts critical web sites for the Tor Project, but otherwise it doesn't especially differ from our other SLAs.
Hit ratio should be high enough to reduce costs significantly on the backend.
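To put a number on that: backend traffic scales with the miss rate, so small hit-ratio changes matter. A back-of-envelope sketch, reusing the ~12TB/month estimate from the Cost section below (the hit ratios are just sample values):

```shell
# Backend traffic left over at various hit ratios, assuming the
# ~12TB/month total from the Cost section (sample ratios only).
awk 'BEGIN {
    total = 12                                   # TB/month hitting the cache
    split("0.75 0.90 0.95", h, " ")
    for (i = 1; i <= 3; i++)
        printf "hit ratio %.0f%% -> %.1f TB/month to the backend\n",
               h[i] * 100, total * (1 - h[i])
}'
```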
Design
The cache service generally consists of two or more servers in geographically distinct areas that run a webserver acting as a reverse proxy. In our case, we run the Nginx webserver with the proxy module for the https://blog.torproject.org/ website (and eventually others, see ticket #32462). One server is in the howto/ganeti cluster, and another is a VM in the Hetzner Cloud (2.50EUR/mth).
DNS for the site points to cache.torproject.org, an alias for the caching servers, which are currently two: cache01.torproject.org [sic] and cache-02. An HTTPS certificate for the site was issued through howto/letsencrypt. Like the Nginx configuration, the certificate is deployed by Puppet in the roles::cache class.
When a user hits the cache server, content is served from the cache stored in /var/cache/nginx, with a filename derived from the proxy_cache_key and proxy_cache_path settings. Those files should end up being cached by the kernel in virtual memory, which should make those accesses fast. If the cached entry is present and valid, it is returned directly to the user. If it is missing or invalid, it is fetched from the backend immediately. The backend is configured in Puppet as well.
Requests to the cache are logged to disk in /var/log/nginx/ssl.$hostname.access.log, with the IP address and user agent removed. mtail then parses those log files, increments various counters, and exposes those as metrics that are scraped by howto/prometheus. We use howto/grafana to display that hit ratio which, at the time of writing, is about 88% for the blog.
Puppet architecture
Because the Puppet code isn't public yet (ticket #29387), here's a quick overview of how we set things up for others to follow.
The entry point in Puppet is the roles::cache class, which configures an "Nginx server" (like an Apache vhost) to do the caching of the backend. It also includes our common Nginx configuration in profile::nginx, which in turn delegates most of the configuration to the Voxpupuli Nginx module.
The role essentially consists of:
include profile::nginx
nginx::resource::server { 'blog.torproject.org':
ssl_cert => '/etc/ssl/torproject/certs/blog.torproject.org.crt-chained',
ssl_key => '/etc/ssl/private/blog.torproject.org.key',
proxy => 'https://live-tor-blog-8.pantheonsite.io',
# no serviceable parts below
ipv6_enable => true,
ipv6_listen_options => '',
ssl => true,
# part of HSTS configuration, the other bit is in add_header below
ssl_redirect => true,
# proxy configuration
#
# pass the Host header to the backend (otherwise the proxy URL above is used)
proxy_set_header => ['Host $host'],
# should map to a cache zone defined in the nginx profile
proxy_cache => 'default',
# start caching redirects and 404s. this code is taken from the
# upstream documentation in
# https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_valid
proxy_cache_valid => [
'200 302 10m',
'301 1h',
'any 1m',
],
# allow serving stale content on error, timeout, or refresh
proxy_cache_use_stale => 'error timeout updating',
# allow only first request through backend
proxy_cache_lock => 'on',
# purge headers from backend we will override. X-Served-By and Via
# are merged into the Via header, as per rfc7230 section 5.7.1
proxy_hide_header => ['Strict-Transport-Security', 'Via', 'X-Served-By'],
add_header => {
# this is a rough equivalent to Varnish's Age header: it records
# when the page was cached, instead of its age
'X-Cache-Date' => '$upstream_http_date',
# if this was served from cache
'X-Cache-Status' => '$upstream_cache_status',
# replace the Via header with ours
'Via' => '$server_protocol $server_name',
# cargo-culted from Apache's configuration
'Strict-Transport-Security' => 'max-age=15768000; preload',
},
# cache 304 not modified entries
raw_append => "proxy_cache_revalidate on;\n",
# caches shouldn't log, because it is too slow
#access_log => 'off',
format_log => 'cacheprivacy',
}
There are also firewall (to open the monitoring, HTTP and HTTPS ports) and mtail (to read the log files for hit ratios) configurations, but those are not essential to get Nginx itself working.
The profile::nginx class is our common Nginx configuration that also covers non-caching setups:
# common nginx configuration
#
# @param client_max_body_size max upload size on this server. upstream
# default is 1m, see:
# https://nginx.org/en/docs/http/ngx_http_core_module.html#client_max_body_size
class profile::nginx(
Optional[String] $client_max_body_size = '1m',
) {
include webserver
class { 'nginx':
confd_purge => true,
server_purge => true,
manage_repo => false,
http2 => 'on',
server_tokens => 'off',
package_flavor => 'light',
log_format => {
# built-in, according to: http://nginx.org/en/docs/http/ngx_http_log_module.html#log_format
# 'combined' => '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent"'
# "privacy" censors the client IP address from logs, taken from
# the Apache config, minus the "day" granularity because of
# limitations in nginx. we remove the IP address and user agent
# but keep the original request time, in other words.
'privacy' => '0.0.0.0 - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "-"',
# the "cache" format adds information about the backend, namely:
# upstream_addr - address and port of upstream server (string)
# upstream_response_time - total time spent talking to the backend server, in seconds (float)
# upstream_cache_status - state of the cache (MISS, HIT, UPDATING, etc)
# request_time - total time spent answering this query, in seconds (float)
'cache' => '$server_name:$server_port $remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" $upstream_addr $upstream_response_time $upstream_cache_status $request_time', #lint:ignore:140chars
'cacheprivacy' => '$server_name:$server_port 0.0.0.0 - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "-" $upstream_addr $upstream_response_time $upstream_cache_status $request_time', #lint:ignore:140chars
},
# XXX: doesn't work because a default is specified in the
# class. doesn't matter much because the puppet module reuses
# upstream default.
worker_rlimit_nofile => undef,
accept_mutex => 'off',
# XXX: doesn't work because a default is specified in the
# class. but that doesn't matter because accept_mutex is off so
# this has no effect
accept_mutex_delay => undef,
http_tcp_nopush => 'on',
gzip => 'on',
client_max_body_size => $client_max_body_size,
run_dir => '/run/nginx',
client_body_temp_path => '/run/nginx/client_body_temp',
proxy_temp_path => '/run/nginx/proxy_temp',
proxy_connect_timeout => '60s',
proxy_read_timeout => '60s',
proxy_send_timeout => '60s',
proxy_cache_path => '/var/cache/nginx/',
proxy_cache_levels => '1:2',
proxy_cache_keys_zone => 'default:10m',
# XXX: hardcoded, should just let nginx figure it out
proxy_cache_max_size => '15g',
proxy_cache_inactive => '24h',
ssl_protocols => 'TLSv1 TLSv1.1 TLSv1.2 TLSv1.3',
# XXX: from the apache module see also https://bugs.torproject.org/32351
ssl_ciphers => 'ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:ECDHE-ECDSA-DES-CBC3-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:!DSS', # lint:ignore:140chars
}
# recreate the default vhost
nginx::resource::server { 'default':
server_name => ['_'],
www_root => "/srv/www/${webserver::defaultpage::defaultdomain}/htdocs/",
listen_options => 'default_server',
ipv6_enable => true,
ipv6_listen_options => 'default_server',
# XXX: until we have an anonymous log format
access_log => 'off',
ssl => true,
ssl_redirect => true,
ssl_cert => '/etc/ssl/torproject-auto/servercerts/thishost.crt',
ssl_key => '/etc/ssl/torproject-auto/serverkeys/thishost.key';
}
}
There are lots of config settings there, but they are provided to reduce the diff between the upstream Debian package and the Nginx module from the Puppet Forge. This was filed upstream as a bug.
Issues
Only serious issues, or issues that are not in the cache component but still relevant to the service, are listed here:
- the cipher suite is an old hardcoded copy derived from Apache, see ticket #32351
- the Nginx puppet module diverges needlessly from upstream and Debian package configuration see puppet-nginx-1359
The service was launched as part of improvements to the blog infrastructure, in ticket #32090. The launch checklist and progress were tracked in ticket #32239.
File or search for issues in the services - cache component.
Monitoring and testing
The caching servers are monitored like other servers by the Nagios server. The Nginx cache manager and the blog endpoint are also monitored for availability.
Logs and metrics
Nginx logs are currently kept in a way that violates typical policy (tpo/tpa/team#32461). They do not contain IP addresses, but do contain accurate time records (granularity to the second) which might be exploited for correlation attacks.
Nginx logs are fed into mtail to extract hit rate information, which is exported to Prometheus and, in turn, used to create a Grafana dashboard showing request and hit rates on the caching servers.
Other documentation
- NGINX Alphabetical index of variables
- NGINX Module ngx_http_proxy_module
- NGINX Content Caching
- NGINX Reverse Proxy
- perusio@github.com: Nginx configuration for running Drupal - interesting snippet for cookies handling, not required
- NGINX: Maximizing Drupal 8 Performance with NGINX, Part 2: Caching and Load Balancing
Discussion
This section regroups notes that were gathered during the research, configuration, and deployment of the service. That includes goals, cost, benchmarks and configuration samples.
Launch was done in the first week of November 2019 as part of ticket #32239, to front the https://blog.torproject.org/ site.
Overview
The original goal of this project is to create a pair of caching servers in front of the blog to reduce the bandwidth costs we're being charged there.
Goals
Must have
- reduce the traffic on the blog, hosted at a costly provider (#32090 (closed))
- HTTPS support in the frontend and backend
- deployment through Puppet
- anonymized logs
- hit rate stats
Nice to have
- provide a frontend for our existing mirror infrastructure, a home-made CDN for TBB and other releases
- no on-disk logs
- cute dashboard or grafana integration
- well-maintained upstream Puppet module
Approvals required
- approved and requested by vegas
Non-Goals
- global CDN for users outside of TPO
- geoDNS
Cost
Somewhere between 11EUR and 100EUR/mth for bandwidth and hardware.
We apparently get around 2.2M "page views" per month at Pantheon. That is about 1 hit per second and 12 terabytes per month, or 36Mbit/s on average:
$ qalc
> 2 200 000 ∕ (30d) to hertz
2200000 / (30 * day) = approx. 0.84876543 Hz
> 2 200 000 * 5Mibyte
2200000 * (5 * mebibyte) = 11.534336 terabytes
> 2 200 000 * 5Mibyte/(30d) to megabit / s
(2200000 * (5 * mebibyte)) / (30 * day) = approx. 35.599802 megabits / s
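The same arithmetic, redone in plain awk for those without qalc (the ~5MiB-per-page-view figure is the same assumption used in the qalc session above):

```shell
# Back-of-envelope: 2.2M page views/month at ~5MiB per view.
awk 'BEGIN {
    hits  = 2200000
    page  = 5 * 1048576            # ~5MiB per page view, in bytes
    month = 30 * 24 * 3600         # seconds in 30 days
    printf "rate:      %.2f hits/s\n",   hits / month
    printf "volume:    %.1f TB/month\n", hits * page / 1e12
    printf "bandwidth: %.1f Mbit/s\n",   hits * page * 8 / month / 1e6
}'
```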
Hetzner charges 1EUR/TB/month over our 1TB quota, so bandwidth would cost 11EUR/month on average. If costs become prohibitive, we could switch to a Hetzner VM which includes 20TB of traffic per month at costs ranging from 3EUR/mth to 30EUR/mth depending on the VPS size (between 1 vCPU, 2GB ram, 20GB SSD and 8vCPU, 32GB ram and 240GB SSD).
Dedicated servers start at 34EUR/mth (EX42, 64GB RAM, 2x4TB HDD) for unlimited gigabit.
We first went with a virtual machine in the howto/ganeti cluster and also a VM in the Hetzner Cloud (2.50EUR/mth).
Proposed Solution
Nginx will be deployed on two servers. ATS was found to be somewhat difficult to configure and debug, while Nginx has a more "regular" configuration file format. Furthermore, performance was equivalent or better in Nginx.
Finally, there is the possibility of converging all HTTP services towards Nginx if desired, which would reduce the number of moving parts in the infrastructure.
Benchmark results overview
Hits per second:
Server | AB | Siege | Bombardier | B. HTTP/1 |
---|---|---|---|---|
Upstream | n/a | n/a | 2800 | n/a |
ATS, local | 800 | 569 | n/a | n/a |
ATS, remote | 249 | 241 | 2050 | 1322 |
Nginx | 324 | 269 | 2117 | n/a |
Throughput (megabyte/s):
Server | AB | Siege | Bombardier | B. HTTP/1 |
---|---|---|---|---|
Upstream | n/a | n/a | 145 | n/a |
ATS, local | 42 | 5 | n/a | n/a |
ATS, remote | 13 | 2 | 105 | 14 |
Nginx | 17 | 14 | 107 | n/a |
Launch checklist
See #32239 for a followup on the launch procedure.
Benchmarking procedures
See the benchmark procedures.
Baseline benchmark
Baseline benchmark of the actual blog site, from cache-02:
anarcat@cache-02:~$ ./go/bin/bombardier --duration=2m --latencies https://blog.torproject.org/ -c 100
Bombarding https://blog.torproject.org:443/ for 2m0s using 100 connection(s)
[================================================================================================================================================================] 2m0s
Done!
Statistics Avg Stdev Max
Reqs/sec 2796.01 716.69 6891.48
Latency 35.96ms 22.59ms 1.02s
Latency Distribution
50% 33.07ms
75% 40.06ms
90% 47.91ms
95% 54.66ms
99% 75.69ms
HTTP codes:
1xx - 0, 2xx - 333646, 3xx - 0, 4xx - 0, 5xx - 0
others - 0
Throughput: 144.79MB/s
This is strangely much higher, in terms of throughput, and faster, in terms of latency, than testing against our own servers. Different avenues were explored to explain that disparity with our servers:
- jumbo frames? nope, both connections see packets larger than 1500 bytes
- protocol differences? nope, both go over IPv6 and (probably) HTTP/2 (at least not over UDP)
- different link speeds
The last theory is currently the only one standing. Indeed, 144.79MB/s should not be possible on regular gigabit ethernet (GigE), as it is actually more than 1000Mbit/s (1158.32Mbit/s). Sometimes the above benchmark even gives 152MB/s (1222Mbit/s), way beyond what a regular GigE link should be able to provide.
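The conversion behind that claim, for reference (using 1MB = 10^6 bytes, as bombardier reports):

```shell
# 144.79 MB/s in Mbit/s; anything over 1000 exceeds plain GigE.
awk 'BEGIN { printf "%.2f Mbit/s\n", 144.79 * 8 }'
```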
Alternatives considered
Four alternatives were seriously considered:
- Apache Traffic Server
- Nginx proxying + caching
- Varnish + stunnel
- Fastly
Other alternatives were not:
- Apache HTTPD caching - performance expected to be sub-par
- Envoy - not designed for caching, external cache support planned in 2019
- HAproxy - not designed to cache large objects
- H2O - HTTP/[123], written from scratch for HTTP/2+, presumably faster than Nginx, didn't find out about it until after the project launched
- Ledge - caching extension to Nginx with ESI, Redis, and cache purge support, not packaged in Debian
- Nuster - new project, not packaged in Debian (based on HAproxy), performance comparable with nginx and varnish according to upstream, although impressive improvements
- Polipo - not designed for production use
- Squid - not designed as a reverse proxy
- Traefik - not designed for caching
Apache Traffic Server
Summary of online reviews
Pros:
- HTTPS
- HTTP/2
- industry leader (behind cloudflare)
- out of the box clustering support
Cons:
- load balancing is an experimental plugin (at least in 2016)
- no static file serving? or slower?
- no commercial support
Used by Yahoo, Apple and Comcast.
First impressions
Pros:
- Puppet module available
- no query logging by default (good?)
- good documentation, but a bit lacking in tutorials
- nice little dashboard shipped by default (traffic_top), although it could be more useful (doesn't seem to show hit ratio clearly)
Cons:
- configuration spread out over many different configuration files
- complex and arcane configuration language (e.g. try to guess what this actually does:
  CONFIG proxy.config.http.server_ports STRING 8080:ipv6:tr-full 443:ssl ip-in=192.168.17.1:80:ip-out=[fc01:10:10:1::1]:ip-out=10.10.10.1
  )
- configuration syntax varies across config files and plugins
- couldn't decouple the backend hostname from the passed Host header (following a bad random tutorial found on the internet)
- couldn't figure out how to make HTTP/2 work
- no Prometheus exporters
Configuration
apt install trafficserver
The default Debian config seems sane when compared to the Cicimov tutorial. One thing we will need to change is the default listening port, which is by default:
CONFIG proxy.config.http.server_ports STRING 8080 8080:ipv6
We want something more like this:
CONFIG proxy.config.http.server_ports STRING 80 80:ipv6 443:ssl 443:ssl:ipv6
We also need to tell ATS to keep the original Host header:
CONFIG proxy.config.url_remap.pristine_host_hdr INT 1
It's clearly stated in the upstream tutorial, but Cicimov's gets it wrong.
Then we also need to configure the path to the SSL certs, we use the self-signed certs for benchmarking:
CONFIG proxy.config.ssl.server.cert.path STRING /etc/ssl/torproject-auto/servercerts/
CONFIG proxy.config.ssl.server.private_key.path STRING /etc/ssl/torproject-auto/serverkeys/
Once we have a real certificate issued through Let's Encrypt, we can use:
CONFIG proxy.config.ssl.server.cert.path STRING /etc/ssl/torproject/certs/
CONFIG proxy.config.ssl.server.private_key.path STRING /etc/ssl/private/
Either way, we need to tell ATS about those certs:
#dest_ip=* ssl_cert_name=thishost.crt ssl_key_name=thishost.key
ssl_cert_name=blog.torproject.org.crt ssl_key_name=blog.torproject.org.key
We need to add the trafficserver user to the ssl-cert group so it can read those:
adduser trafficserver ssl-cert
Then we set up this remapping rule:
map https://blog.torproject.org/ https://backend.example.com/
(backend.example.com is the prod alias of our backend.)
And finally curl is able to talk to the proxy:
curl --proxy-cacert /etc/ssl/torproject-auto/servercerts/ca.crt --proxy https://cache01.torproject.org/ https://blog.torproject.org
Troubleshooting
Proxy fails to hit backend
curl: (56) Received HTTP code 404 from proxy after CONNECT
Same with a plain GET:
# curl -s -k -I --resolve *:443:127.0.0.1 https://blog.torproject.org | head -1
HTTP/1.1 404 Not Found on Accelerator
It seems that the backend named on the right side of the remap rule needs to respond correctly, as ATS doesn't reuse the Host header correctly, which is kind of a problem because the backend wants to redirect everything to the canonical hostname for SEO purposes. We could tweak that and make backend.example.com the canonical host, but then disaster recovery would be much harder, and some links could end up pointing there instead of the real canonical host.
I tried the mysterious regex_remap plugin:
map http://cache01.torproject.org/ http://localhost:8000/ @plugin=regex_remap.so @pparam=maps.reg @pparam=host
with this in maps.reg:
.* $s://$f/$P/
... which basically means "redirect everything to the original scheme, host and path", but that (obviously, maybe) fails with:
# curl -I -s http://cache01.torproject.org/ | head -1
HTTP/1.1 400 Multi-Hop Cycle Detected
It feels it really doesn't want to act as a transparent proxy...
I also tried a header rewrite:
map http://cache01.torproject.org/ http://localhost:8000/ @plugin=header_rewrite.so @pparam=rules1.conf
with rules1.conf like:
set-header host cache01.torproject.org
set-header foo bar
... and the Host header is untouched. The rule works, though, because the Foo header appears in the request.
The solution to this is the proxy.config.url_remap.pristine_host_hdr setting documented above.
HTTP/2 support missing
Next hurdle: no HTTP/2 support, even when using proto=http2;http (falls back on HTTP/1.1) or proto=http2 only (fails with WARNING: Unregistered protocol type 0).
Benchmarks
Same host tests
With blog.tpo in /etc/hosts (because proxy-host doesn't work), and running on the same host as the proxy (!), cold cache:
root@cache01:~# siege https://blog.torproject.org/
** SIEGE 4.0.4
** Preparing 100 concurrent users for battle.
The server is now under siege...
Lifting the server siege...
Transactions: 68068 hits
Availability: 100.00 %
Elapsed time: 119.53 secs
Data transferred: 654.47 MB
Response time: 0.18 secs
Transaction rate: 569.46 trans/sec
Throughput: 5.48 MB/sec
Concurrency: 99.67
Successful transactions: 68068
Failed transactions: 0
Longest transaction: 0.56
Shortest transaction: 0.00
Warm cache:
root@cache01:~# siege https://blog.torproject.org/
** SIEGE 4.0.4
** Preparing 100 concurrent users for battle.
The server is now under siege...
Lifting the server siege...
Transactions: 65953 hits
Availability: 100.00 %
Elapsed time: 119.71 secs
Data transferred: 634.13 MB
Response time: 0.18 secs
Transaction rate: 550.94 trans/sec
Throughput: 5.30 MB/sec
Concurrency: 99.72
Successful transactions: 65953
Failed transactions: 0
Longest transaction: 0.62
Shortest transaction: 0.00
And traffic_top looks like this after the second run:
CACHE INFORMATION CLIENT REQUEST & RESPONSE
Disk Used 77.8K Ram Hit 99.9% GET 98.7% 200 98.3%
Disk Total 268.1M Fresh 98.2% HEAD 0.0% 206 0.0%
Ram Used 16.5K Revalidate 0.0% POST 0.0% 301 0.0%
Ram Total 352.3K Cold 0.0% 2xx 98.3% 302 0.0%
Lookups 134.2K Changed 0.1% 3xx 0.0% 304 0.0%
Writes 13.0 Not Cache 0.0% 4xx 2.0% 404 0.4%
Updates 1.0 No Cache 0.0% 5xx 0.0% 502 0.0%
Deletes 0.0 Fresh (ms) 8.6M Conn Fail 0.0 100 B 0.1%
Read Activ 0.0 Reval (ms) 0.0 Other Err 2.8K 1 KB 2.0%
Writes Act 0.0 Cold (ms) 26.2G Abort 111.0 3 KB 0.0%
Update Act 0.0 Chang (ms) 11.0G 5 KB 0.0%
Entries 2.0 Not (ms) 0.0 10 KB 98.2%
Avg Size 38.9K No (ms) 0.0 1 MB 0.0%
DNS Lookup 156.0 DNS Hit 89.7% > 1 MB 0.0%
DNS Hits 140.0 DNS Entry 2.0
CLIENT ORIGIN SERVER
Requests 136.5K Head Bytes 151.6M Requests 152.0 Head Bytes 156.5K
Req/Conn 1.0 Body Bytes 1.4G Req/Conn 1.1 Body Bytes 1.1M
New Conn 137.0K Avg Size 11.0K New Conn 144.0 Avg Size 8.0K
Curr Conn 0.0 Net (bits) 12.0G Curr Conn 0.0 Net (bits) 9.8M
Active Con 0.0 Resp (ms) 1.2
Dynamic KA 0.0
cache01 (r)esponse (q)uit (h)elp (A)bsolute
ab:
# ab -c 100 -n 1000 https://blog.torproject.org/
[...]
Server Software: ATS/8.0.2
Server Hostname: blog.torproject.org
Server Port: 443
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256
Server Temp Key: X25519 253 bits
TLS Server Name: blog.torproject.org
Document Path: /
Document Length: 52873 bytes
Concurrency Level: 100
Time taken for tests: 1.248 seconds
Complete requests: 1000
Failed requests: 0
Total transferred: 53974000 bytes
HTML transferred: 52873000 bytes
Requests per second: 801.43 [#/sec] (mean)
Time per request: 124.776 [ms] (mean)
Time per request: 1.248 [ms] (mean, across all concurrent requests)
Transfer rate: 42242.72 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 8 47 20.5 46 121
Processing: 6 75 16.2 76 116
Waiting: 1 13 6.8 12 49
Total: 37 122 21.6 122 196
Percentage of the requests served within a certain time (ms)
50% 122
66% 128
75% 133
80% 137
90% 151
95% 160
98% 169
99% 172
100% 196 (longest request)
Separate host
Those tests were performed from one cache server to the other, to avoid the benchmarking tool fighting for resources with the server.
In .siege/siege.conf:
verbose = false
fullurl = true
concurrent = 100
time = 2M
url = https://blog.torproject.org/
delay = 1
internet = false
benchmark = true
Siege:
root@cache-02:~# siege
** SIEGE 4.0.4
** Preparing 100 concurrent users for battle.
The server is now under siege...
Lifting the server siege...
Transactions: 28895 hits
Availability: 100.00 %
Elapsed time: 119.73 secs
Data transferred: 285.18 MB
Response time: 0.40 secs
Transaction rate: 241.33 trans/sec
Throughput: 2.38 MB/sec
Concurrency: 96.77
Successful transactions: 28895
Failed transactions: 0
Longest transaction: 1.26
Shortest transaction: 0.05
Load went to about 2 (Load average: 1.65 0.80 0.36 after the test), with one CPU constantly busy and the other at about 50%; memory usage was low (~800M).
ab:
# ab -c 100 -n 1000 https://blog.torproject.org/
[...]
Server Software: ATS/8.0.2
Server Hostname: blog.torproject.org
Server Port: 443
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,4096,256
Server Temp Key: X25519 253 bits
TLS Server Name: blog.torproject.org
Document Path: /
Document Length: 53320 bytes
Concurrency Level: 100
Time taken for tests: 4.010 seconds
Complete requests: 1000
Failed requests: 0
Total transferred: 54421000 bytes
HTML transferred: 53320000 bytes
Requests per second: 249.37 [#/sec] (mean)
Time per request: 401.013 [ms] (mean)
Time per request: 4.010 [ms] (mean, across all concurrent requests)
Transfer rate: 13252.82 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 23 254 150.0 303 549
Processing: 14 119 89.3 122 361
Waiting: 5 105 89.7 105 356
Total: 37 373 214.9 464 738
Percentage of the requests served within a certain time (ms)
50% 464
66% 515
75% 549
80% 566
90% 600
95% 633
98% 659
99% 675
100% 738 (longest request)
Bombardier results are much better and almost max out the gigabit connection:
anarcat@cache-02:~$ ./go/bin/bombardier --duration=2m --latencies https://blog.torproject.org/ -c 100
Bombarding https://blog.torproject.org:443/ for 2m0s using 100 connection(s)
[=========================================================================] 2m0s
Done!
Statistics Avg Stdev Max
Reqs/sec 2049.82 533.46 7083.03
Latency 49.75ms 20.82ms 837.07ms
Latency Distribution
50% 48.53ms
75% 57.98ms
90% 69.05ms
95% 78.44ms
99% 128.34ms
HTTP codes:
1xx - 0, 2xx - 241187, 3xx - 0, 4xx - 0, 5xx - 0
others - 0
Throughput: 104.67MB/s
It might be because it supports doing HTTP/2 requests: indeed, the throughput drops to 14MB/s when we use the --http1 flag, along with rates closer to ab's:
anarcat@cache-02:~$ ./go/bin/bombardier --duration=2m --latencies https://blog.torproject.org/ --http1 -c 100
Bombarding https://blog.torproject.org:443/ for 2m0s using 100 connection(s)
[=========================================================================] 2m0s
Done!
Statistics Avg Stdev Max
Reqs/sec 1322.21 253.18 1911.21
Latency 78.40ms 18.65ms 688.60ms
Latency Distribution
50% 75.53ms
75% 88.52ms
90% 101.30ms
95% 110.68ms
99% 132.89ms
HTTP codes:
1xx - 0, 2xx - 153114, 3xx - 0, 4xx - 0, 5xx - 0
others - 0
Throughput: 14.22MB/s
Inter-server communication is good, according to iperf3:
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.04 sec 1.00 GBytes 859 Mbits/sec receiver
So we see the roundtrip does add significant overhead to ab and siege. It's possible this is due to the nature of the virtual server, which is much less powerful than the server. This seems to be confirmed by bombardier's success, since it's possibly better designed than the other two to maximize resources on the client side.
Nginx
Summary of online reviews
Pros:
- provides a full webserver stack, which means much more flexibility and the possibility of converging on a single solution across the infrastructure
- very popular
- load balancing (but no active check in free version)
- can serve static content
- HTTP/2
- HTTPS
Cons:
- provides a full webserver stack (!), which means a larger attack surface
- no ESI or ICP?
- does not cache out of the box, requires config which might imply lesser performance
- opencore model with paid features, especially "active health checks", "Cache Purging API" (although there are hackish ways to clear the cache and a module), and "session persistence based on cookies"
- most plugins are statically compiled in different "flavors", although it's possible to have dynamic modules
Used by Cloudflare, Dropbox, MaxCDN and Netflix.
First impressions
Pros:
- "approved" Puppet module
- single file configuration
- config easy to understand and fairly straightforward
- just frigging works
- easy to serve static content in case of problems
- can be leveraged for other applications
- performance comparable or better than ATS
Cons:
- default caching module uses MD5 as a hashing algorithm
- configuration refers to magic variables that are documented all over the place (e.g. what is $proxy_host vs $host?)
- documentation mixes content from the commercial version, which makes it difficult to tell what is actually possible
- reload may crash the server (instead of not reloading) on config errors
- no shiny dashboard like ATS
- manual cache sizing?
- detailed cache stats are only in the "plus" version
Configuration
We pick the "light" Debian package. The modules that would be interesting in the other flavors are "cache purge" (from extras) and "geoip" (from full):
apt install nginx-light
Then drop this config file in /etc/nginx/sites-available and symlink it into sites-enabled:
server_names_hash_bucket_size 64;
proxy_cache_path /var/cache/nginx/ levels=1:2 keys_zone=blog:10m;
server {
    listen 80;
    listen [::]:80;
    listen 443 ssl;
    listen [::]:443 ssl;
    ssl_certificate /etc/ssl/torproject/certs/blog.torproject.org.crt-chained;
    ssl_certificate_key /etc/ssl/private/blog.torproject.org.key;
    server_name blog.torproject.org;
    proxy_cache blog;
    location / {
        proxy_pass https://live-tor-blog-8.pantheonsite.io;
        proxy_set_header Host $host;
        # cache 304 responses
        proxy_cache_revalidate on;
        # add cookie to cache key
        #proxy_cache_key "$host$request_uri$cookie_user";
        # not sure what the cookie name is
        proxy_cache_key $scheme$proxy_host$request_uri;
        # allow serving stale content on error, timeout, or refresh
        proxy_cache_use_stale error timeout updating;
        # allow only the first request through to the backend
        proxy_cache_lock on;
        # expose the cache status as a response header
        add_header X-Cache-Status $upstream_cache_status;
    }
}
... and reload nginx.
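Since a broken config can take nginx down on reload (see the cons above), it is safer to validate the configuration first (standard Debian tooling):

```shell
# Test the configuration, and only reload if it parses cleanly.
nginx -t && systemctl reload nginx
```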
I tested that logged in users bypass the cache and things generally work well.
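The `X-Cache-Status` header added in the config above makes such spot checks easy from the outside. A sketch, assuming the blog hostname from the config; the values are nginx's standard `$upstream_cache_status` values:

```shell
# First request should show MISS (or EXPIRED), the second HIT.
curl -sI https://blog.torproject.org/ | grep -i '^x-cache-status'
curl -sI https://blog.torproject.org/ | grep -i '^x-cache-status'
```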
A key problem with Nginx is getting decent statistics out. The upstream nginx exporter supports only (basically) hits per second through the stub status module, a very limited module shipped with core Nginx. The commercial version, Nginx Plus, supports a more extensive API which includes the hit rate, but that's not an option for us.
There are a few ways to work around this problem:
- create our own metrics using the Nginx Lua Prometheus module: this can have performance impacts and involves a custom configuration
- write and parse log files, which is the way the munin plugin works - this could possibly be fed directly into mtail to avoid storing logs on disk but still get the data (include `$upstream_cache_status` in the logs)
- use a third-party module like vts or sts and the exporter to expose those metrics - the vts module doesn't seem to be very well maintained (no release since 2018) and it's unclear if it will work for our use case
Here's an example of how to do the mtail hack. First tell nginx to write to syslog, to act as a buffer, so that parsing doesn't slow processing, excerpt from the nginx.conf snippet:
# Log response times so that we can compute latency histograms
# (using mtail). Works around the lack of Prometheus
# instrumentation in NGINX.
log_format extended '$server_name:$server_port '
'$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent" '
'$upstream_addr $upstream_response_time $request_time';
access_log syslog:server=unix:/dev/log,facility=local3,tag=nginx_access extended;
(We would also need to add `$upstream_cache_status` to that format.)
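The amended format might look like this (a sketch; appending the field at the end is an arbitrary choice):

```
log_format extended '$server_name:$server_port '
                    '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent '
                    '"$http_referer" "$http_user_agent" '
                    '$upstream_addr $upstream_response_time $request_time '
                    '$upstream_cache_status';
```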
Then count the different stats using mtail, excerpt from the mtail config snippet:
# Define the exported metrics.
counter nginx_http_request_total
counter nginx_http_requests by host, vhost, method, code, backend
counter nginx_http_bytes by host, vhost, method, code, backend
counter nginx_http_requests_ms by le, host, vhost, method, code, backend
/(?P<hostname>[-0-9A-Za-z._:]+) nginx_access: (?P<vhost>[-0-9A-Za-z._:]+) (?P<remote_addr>[0-9a-f\.:]+) - - \[[^\]]+\] "(?P<request_method>[A-Z]+) (?P<request_uri>\S+) (?P<http_version>HTTP\/[0-9\.]+)" (?P<status>\d{3}) ((?P<response_size>\d+)|-) "[^"]*" "[^"]*" (?P<upstream_addr>[-0-9A-Za-z._:]+) ((?P<ups_resp_seconds>\d+\.\d+)|-) (?P<request_seconds>\d+)\.(?P<request_milliseconds>\d+)/ {
nginx_http_request_total++
# [...]
}
We'd also need to check the cache status in that parser.
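Counting by cache status could look something like this (a sketch, assuming `$upstream_cache_status` is appended to the end of the log format; the counter name and pattern are ours, not from any deployed config):

```
counter nginx_http_cache_status by cache_status

# Matches a trailing HIT/MISS/EXPIRED/BYPASS/... token at the end of the line.
/ (?P<cache_status>[A-Z_]+)$/ {
  nginx_http_cache_status[$cache_status]++
}
```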
A variation of the mtail hack was adopted in our design.
Benchmarks
ab:
root@cache-02:~# ab -c 100 -n 1000 https://blog.torproject.org/
[...]
Server Software: nginx/1.14.2
Server Hostname: blog.torproject.org
Server Port: 443
SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,4096,256
Server Temp Key: X25519 253 bits
TLS Server Name: blog.torproject.org
Document Path: /
Document Length: 53313 bytes
Concurrency Level: 100
Time taken for tests: 3.083 seconds
Complete requests: 1000
Failed requests: 0
Total transferred: 54458000 bytes
HTML transferred: 53313000 bytes
Requests per second: 324.31 [#/sec] (mean)
Time per request: 308.349 [ms] (mean)
Time per request: 3.083 [ms] (mean, across all concurrent requests)
Transfer rate: 17247.25 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 30 255 78.0 262 458
Processing: 18 35 19.2 28 119
Waiting: 7 19 7.4 18 58
Total: 81 290 88.3 291 569
Percentage of the requests served within a certain time (ms)
50% 291
66% 298
75% 303
80% 306
90% 321
95% 533
98% 561
99% 562
100% 569 (longest request)
About 50% faster than ATS.
Siege:
Transactions: 32246 hits
Availability: 100.00 %
Elapsed time: 119.57 secs
Data transferred: 1639.49 MB
Response time: 0.37 secs
Transaction rate: 269.68 trans/sec
Throughput: 13.71 MB/sec
Concurrency: 99.60
Successful transactions: 32246
Failed transactions: 0
Longest transaction: 1.65
Shortest transaction: 0.23
Almost an order of magnitude faster than ATS. Update: that's for the throughput. The transaction rate is actually similar, which implies the page size might have changed between benchmarks.
Bombardier:
anarcat@cache-02:~$ ./go/bin/bombardier --duration=2m --latencies https://blog.torproject.org/ -c 100
Bombarding https://blog.torproject.org:443/ for 2m0s using 100 connection(s)
[=========================================================================] 2m0s
Done!
Statistics Avg Stdev Max
Reqs/sec 2116.74 506.01 5495.77
Latency 48.42ms 34.25ms 2.15s
Latency Distribution
50% 37.19ms
75% 50.44ms
90% 89.58ms
95% 109.59ms
99% 169.69ms
HTTP codes:
1xx - 0, 2xx - 247827, 3xx - 0, 4xx - 0, 5xx - 0
others - 0
Throughput: 107.43MB/s
Almost maxes out the gigabit connection as well, but is only marginally faster (~3%?) than ATS.
It does not reach the theoretical gigabit maximum, which is apparently around 118MB/s without jumbo frames (and 123MB/s with).
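Those figures check out from first principles: with 1500-byte frames, each frame carries 1460 bytes of TCP payload but occupies 1538 bytes on the wire once the Ethernet header, FCS, preamble and inter-frame gap are counted. A quick back-of-the-envelope check:

```shell
# Goodput of gigabit Ethernet over TCP/IPv4, in MB/s.
# payload = MTU - 20 (IPv4 header) - 20 (TCP header)
# on-wire = MTU + 38 (Ethernet header, FCS, preamble, inter-frame gap)
awk 'BEGIN {
    for (mtu = 1500; mtu <= 9000; mtu += 7500) {
        payload = mtu - 40; wire = mtu + 38
        printf "MTU %d: %d MB/s\n", mtu, int(1e9/8 * payload/wire / 1e6)
    }
}'
```

This prints `MTU 1500: 118 MB/s` and `MTU 9000: 123 MB/s`, matching the figures above.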
Varnish
Pros:
- specifically built for caching
- very flexible
- grace mode can keep objects even after TTL expired (when backends go down)
- third most popular, after Cloudflare and ATS
Cons:
- no HTTPS support on frontend or backend in the free version, would require stunnel hacks
- configuration is compiled and a bit weird
- static content needs to be generated in the config file, or served by a sidecar
- no HTTP/2 support
Used by Fastly.
Fastly itself
We could just put Fastly in front of all this and shove the costs on there.
Pros:
- easy
- possibly free
Cons:
- might go over our quotas during large campaigns
- sending more of our visitors to Fastly, non-anonymously
Sources
Benchmarks:
- Bizety: Nginx vs Varnish vs Apache Traffic Server - High Level Comparison - "Each proxy server has strengths and weakness"
- ScaleScale: Nginx vs Varnish: which one is better? - nginx + tmpfs good alternative to varnish
- garron.me: Nginx + Varnish compared to Nginx - equivalent
- Uptime Made Easy: Nginx or Varnish Which is Faster? - equivalent
- kpayne.me: Apache Traffic Server as a Reverse Proxy - "According to blitz.io, Varnish and Traffic Server benchmark results are close. According to ab, Traffic Server is twice as fast as Varnish"
- University of Oslo: Performance Evaluation of the Apache Traffic Server and Varnish Reverse Proxies - "Varnish seems the more promising reverse proxy server"
- Loggly: Benchmarking 5 Popular Load Balancers: Nginx, HAProxy, Envoy, Traefik, and ALB
- SpinupWP: Page Caching: Varnish Vs Nginx FastCGI Cache 2018 Update - "Nginx FastCGI Cache is the clear winner when it comes to outright performance. It’s not only able to handle more requests per second, but also serve each request 55ms quicker on average."
Tutorials and documentation:
- Apache.org: Why Apache Traffic Server - upstream docs
- czerasz.com: Nginx Caching Tutorial - You Can Run Faster - tutorial
- Igor Cicimov: Apache Traffic Server as Caching Reverse Proxy - tutorial, "Apache TS presents a stable, fast and scalable caching proxy platform"
- Datanyze.com: Web Accelerators Market Share Report