From cc8e1368fb5f239988fc1d7a260ea8610512793c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Antoine=20Beaupr=C3=A9?= <anarcat@debian.org> Date: Mon, 11 Nov 2019 14:53:30 -0500 Subject: [PATCH] move nginx and ats install notes and benchmarks in the discussion section --- tsa/howto/cache.mdwn | 976 +++++++++++++++++++++---------------------- 1 file changed, 486 insertions(+), 490 deletions(-) diff --git a/tsa/howto/cache.mdwn b/tsa/howto/cache.mdwn index 06b40032..44b9ea58 100644 --- a/tsa/howto/cache.mdwn +++ b/tsa/howto/cache.mdwn @@ -46,229 +46,295 @@ To be clarified. TBD. -# Design +# Discussion -## Nginx +A discussion of the design of the new service, mostly. -picking the "light" debian package. The modules that would be -interesting in others would be "cache purge" (from extras) and "geoip" -(from full): +## Overview - apt install nginx-light +The original goal of this project is to create a pair of caching +servers in front of the blog to reduce the bandwidth costs we're being +charged there. -Then drop this config file in `/etc/nginx/sites-available` and symlink -into `sites-enabled`: +## Goals - server_names_hash_bucket_size 64; - proxy_cache_path /var/cache/nginx/ levels=1:2 keys_zone=blog:10m; +### Must have - server { - listen 80; - listen [::]:80; - listen 443 ssl; - listen [::]:443 ssl; - ssl_certificate /etc/ssl/torproject/certs/blog.torproject.org.crt-chained; - ssl_certificate_key /etc/ssl/private/blog.torproject.org.key; + * reduce the traffic on the blog, hosted at a costly provider (#32090) + * HTTPS support in the frontend and backend + * deployment through Puppet + * anonymized logs + * hit rate stats - server_name blog.torproject.org; - proxy_cache blog; +### Nice to have - location / { - proxy_pass https://live-tor-blog-8.pantheonsite.io; - proxy_set_header Host $host; + * provide a frontend for our existing mirror infrastructure, a + home-made CDN for TBB and other releases + * no on-disk logs + * cute dashboard or grafana integration + * well-maintained upstream Puppet module - # cache 304 - proxy_cache_revalidate on; +### Approvals required - # add cookie to cache key - #proxy_cache_key "$host$request_uri$cookie_user"; - # not sure what the cookie name is - proxy_cache_key $scheme$proxy_host$request_uri; + * approved and requested by vegas - # allow serving stale content on error, timeout, or refresh - proxy_cache_use_stale error timeout updating; - # allow only first request through backend - proxy_cache_lock on; +## Non-Goals - # add header - add_header X-Cache-Status $upstream_cache_status; - } - } + * global CDN for users outside of TPO + * geoDNS -... and reload nginx. +## Cost -I tested that logged in users bypass the cache and things generally -work well. +Somewhere between 11EUR and 100EUR/mth for bandwidth and hardware. -A key problem with Nginx is getting decent statistics out. The -[upstream nginx exporter](https://github.com/nginxinc/nginx-prometheus-exporter) supports only (basically) hits per second -through the [stub status module](http://nginx.org/en/docs/http/ngx_http_stub_status_module.html) a very limited module shipped with -core Nginx. The commercial version, Nginx Plus, supports a [more -extensive API](https://nginx.org/en/docs/http/ngx_http_api_module.html#api) which includes the hit rate, but that's not an -option for us. +We're getting apparently around 2.2M "page views" per month at +Pantheon. 
+That is about 1 hit per second and 12 terabytes per month,
+36Mbit/s on average:
 
-There are two solutions to work around this problem:
+    $ qalc
+    > 2 200 000 ∕ (30d) to hertz
 
- * create our own metrics using the [Nginx Lua Prometheus module](https://github.com/knyar/nginx-lua-prometheus):
-   this can have performance impacts and involves a custom
-   configuration
- * write and parse log files, that's the way the [munin plugin](https://github.com/munin-monitoring/contrib/blob/master/plugins/nginx/nginx-cache-hit-rate)
-   works - this could possibly be fed *directly* into [mtail](https://github.com/google/mtail) to
-   avoid storing logs on disk but still get the date (include
-   [`$upstream_cache_status`](http://nginx.org/en/docs/http/ngx_http_upstream_module.html#var_upstream_cache_status) in the logs)
- * use a third-party module like [vts](https://github.com/vozlt/nginx-module-vts) or [sts](https://github.com/vozlt/nginx-module-sts) and the
-   [exporter](https://github.com/hnlq715/nginx-vts-exporter) to expose those metrics - the vts module doesn't seem
-   to be very well maintained (no release since 2018) and it's unclear
-   if this will work for our use case
+      2200000 / (30 * day) = approx. 0.84876543 Hz
 
+    > 2 200 000 * 5Mibyte
 
-Here's an example of how to do the mtail hack. First tell nginx to
-write to syslog, to act as a buffer, so that parsing doesn't slow
-processing, excerpt from the [nginx.conf snippet](https://git.autistici.org/ai3/float/blob/master/roles/nginx/templates/config/nginx.conf#L34):
+      2200000 * (5 * mebibyte) = 11.534336 terabytes
 
-    # Log response times so that we can compute latency histograms
-    # (using mtail). Works around the lack of Prometheus
-    # instrumentation in NGINX.
-    log_format extended '$server_name:$server_port '
-                        '$remote_addr - $remote_user [$time_local] '
-                        '"$request" $status $body_bytes_sent '
-                        '"$http_referer" "$http_user_agent" '
-                        '$upstream_addr $upstream_response_time $request_time';
+    > 2 200 000 * 5Mibyte/(30d) to megabit / s
 
-    access_log syslog:server=unix:/dev/log,facility=local3,tag=nginx_access extended;
+      (2200000 * (5 * mebibyte)) / (30 * day) = approx. 35.599802 megabits / s
 
-(We would also need to add `$upstream_cache_status` in that format.)
+Hetzner charges 1EUR/TB/month over our 1TB quota, so bandwidth would
+cost 11EUR/month on average. If costs become prohibitive, we could
+switch to a Hetzner VM which includes 20TB of traffic per month at costs
+ranging from 3EUR/mth to 30EUR/mth depending on the VPS size (between
+1 vCPU, 2GB ram, 20GB SSD and 8vCPU, 32GB ram and 240GB SSD).
 
-Then count the different stats using mtail, excerpt from the [mtail
-config snippet](https://git.autistici.org/ai3/float/blob/master/roles/base/files/mtail/nginx.mtail):
+Dedicated servers start at 34EUR/mth (`EX42`, 64GB ram 2x4TB HDD) for
+unlimited gigabit.
 
-    # Define the exported metrics.
-    counter nginx_http_request_total
-    counter nginx_http_requests by host, vhost, method, code, backend
-    counter nginx_http_bytes by host, vhost, method, code, backend
-    counter nginx_http_requests_ms by le, host, vhost, method, code, backend
- /(?P<hostname>[-0-9A-Za-z._:]+) nginx_access: (?P<vhost>[-0-9A-Za-z._:]+) (?P<remote_addr>[0-9a-f\.:]+) - - \[[^\]]+\] "(?P<request_method>[A-Z]+) (?P<request_uri>\S+) (?P<http_version>HTTP\/[0-9\.]+)" (?P<status>\d{3}) ((?P<response_size>\d+)|-) "[^"]*" "[^"]*" (?P<upstream_addr>[-0-9A-Za-z._:]+) ((?P<ups_resp_seconds>\d+\.\d+)|-) (?P<request_seconds>\d+)\.(?P<request_milliseconds>\d+)/ { +## Proposed Solution - nginx_http_request_total++ +Nginx will be deployed on two servers. ATS was found to be somewhat +difficult to configure and debug, while Nginx has a more "regular" +configuration file format. Furthermore, performance was equivalent or +better in Nginx. -We'd also need to check the cache statuf in that parser. +Finally, there is the possibility of converging all HTTP services +towards Nginx if desired, which would reduce the number of moving +parts in the infrastructure. -Update: cache status now written to on-disk, anonymised, log files and -can be parsed with lnav. see ticket #32239 for details, hit ratio -between 70 and 80% based on preliminary results. +## Launch checklist -References: +See [#32239](https://trac.torproject.org/projects/tor/ticket/32239). - * [NGINX Alphabetical index of variables](https://nginx.org/en/docs/varindex.html) - * [NGINX Module ngx_http_proxy_module](https://nginx.org/en/docs/http/ngx_http_proxy_module.html) - * [NGINX Content Caching](https://docs.nginx.com/nginx/admin-guide/content-cache/content-caching/) - * [NGINX Reverse Proxy](https://docs.nginx.com/nginx/admin-guide/web-server/reverse-proxy/) - * [perusio@github.com: Nginx configuration for running Drupal](https://github.com/perusio/drupal-with-nginx) - - interesting [snippet](https://github.com/perusio/drupal-with-nginx/blob/D7/apps/drupal/map_cache.conf) for cookies handling, not required - * [NGINX: Maximizing Drupal 8 Performance with NGINX, Part 2: Caching and Load Balancing](https://www.nginx.com/blog/maximizing-drupal-8-performance-nginx-part-ii-caching-load-balancing/) +## Benchmarking procedures -### Benchmarks +Will require a test VM (or two?) to hit the caches. -ab: +### Common procedure - root@cache-02:~# ab -c 100 -n 1000 https://blog.torproject.org/ - [...] - Server Software: nginx/1.14.2 - Server Hostname: blog.torproject.org - Server Port: 443 - SSL/TLS Protocol: TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,4096,256 - Server Temp Key: X25519 253 bits - TLS Server Name: blog.torproject.org + 1. punch a hole in the firewall to allow cache2 to access cache1 - Document Path: / - Document Length: 53313 bytes + iptables -I INPUT -s 78.47.61.104 -j ACCEPT + ip6tables -I INPUT -s 2a01:4f8:c010:25ff::1 -j ACCEPT - Concurrency Level: 100 - Time taken for tests: 3.083 seconds - Complete requests: 1000 - Failed requests: 0 - Total transferred: 54458000 bytes - HTML transferred: 53313000 bytes - Requests per second: 324.31 [#/sec] (mean) - Time per request: 308.349 [ms] (mean) - Time per request: 3.083 [ms] (mean, across all concurrent requests) - Transfer rate: 17247.25 [Kbytes/sec] received + 2. point the blog to cache1 on cache2 in `/etc/hosts`: - Connection Times (ms) - min mean[+/-sd] median max - Connect: 30 255 78.0 262 458 - Processing: 18 35 19.2 28 119 - Waiting: 7 19 7.4 18 58 - Total: 81 290 88.3 291 569 + 116.202.120.172 blog.torproject.org + 2a01:4f8:fff0:4f:266:37ff:fe26:d6e1 blog.torproject.org - Percentage of the requests served within a certain time (ms) - 50% 291 - 66% 298 - 75% 303 - 80% 306 - 90% 321 - 95% 533 - 98% 561 - 99% 562 - 100% 569 (longest request) + 3. 
disable Puppet: + + puppet agent --disable 'benchmarking requires /etc/hosts override' -About 50% faster than ATS. + 4. launch the benchmark -Siege: +### Siege - Transactions: 32246 hits - Availability: 100.00 % - Elapsed time: 119.57 secs - Data transferred: 1639.49 MB - Response time: 0.37 secs - Transaction rate: 269.68 trans/sec - Throughput: 13.71 MB/sec - Concurrency: 99.60 - Successful transactions: 32246 - Failed transactions: 0 - Longest transaction: 1.65 - Shortest transaction: 0.23 +Siege configuration sample: -Almost an order of magnitude faster than ATS. +``` +verbose = false +fullurl = true +concurrent = 100 +time = 2M +url = http://www.example.com/ +delay = 1 +internet = false +benchmark = true +``` -Bombardier: +Might require this, which might work only with varnish: - anarcat@cache-02:~$ ./go/bin/bombardier --duration=2m --latencies https://blog.torproject.org/ -c 100 - Bombarding https://blog.torproject.org:443/ for 2m0s using 100 connection(s) - [=========================================================================] 2m0s - Done! - Statistics Avg Stdev Max - Reqs/sec 2116.74 506.01 5495.77 - Latency 48.42ms 34.25ms 2.15s - Latency Distribution - 50% 37.19ms - 75% 50.44ms - 90% 89.58ms - 95% 109.59ms - 99% 169.69ms - HTTP codes: - 1xx - 0, 2xx - 247827, 3xx - 0, 4xx - 0, 5xx - 0 - others - 0 - Throughput: 107.43MB/s +``` +proxy-host = 209.44.112.101 +proxy-port = 80 +``` -Almost maxes out the gigabit connexion as well, but only marginally -faster (~3%?) than ATS. +Alternative is to hack `/etc/hosts`. -Does not max theoritical gigabit maximal performance, which [is -apparently](http://rickardnobel.se/actual-throughput-on-gigabit-ethernet/) at around 118MB/s without jumbo frames (and 123MB/s -with). +### apachebench -## ATS +Classic commandline: - apt install trafficserver + ab2 -n 1000 -c 100 -X cache01.torproject.org https://example.com/ -Default Debian config seems sane when compared to the [Cicimov -tutorial][cicimov]. On thing we will need to change is the [default listening -port][], which is by default: +`-X` also doesn't work with ATS, hacked `/etc/hosts`. -[default listening port]: https://docs.trafficserver.apache.org/en/8.0.x/admin-guide/files/records.config.en.html#proxy.config.http.server_ports +### bombardier - CONFIG proxy.config.http.server_ports STRING 8080 8080:ipv6 +Unfortunately, the [bombardier package in Debian](https://tracker.debian.org/pkg/bombardier) is *not* the HTTP +benchmarking tool but a commandline game. It's still possible to +install it in Debian with: + + export GOPATH=$HOME/go + apt install golang + go get -v github.com/codesenberg/bombardier + +Then running the benchmark is as simple as: + + ./go/bin/bombardier --duration=2m --latencies https://blog.torproject.org/ + +Baseline benchmark, from cache02: + + anarcat@cache-02:~$ ./go/bin/bombardier --duration=2m --latencies https://blog.torproject.org/ -c 100 + Bombarding https://blog.torproject.org:443/ for 2m0s using 100 connection(s) + [================================================================================================================================================================] 2m0s + Done! 
+    Statistics        Avg      Stdev        Max
+      Reqs/sec      2796.01     716.69    6891.48
+      Latency       35.96ms    22.59ms      1.02s
+      Latency Distribution
+         50%    33.07ms
+         75%    40.06ms
+         90%    47.91ms
+         95%    54.66ms
+         99%    75.69ms
+      HTTP codes:
+        1xx - 0, 2xx - 333646, 3xx - 0, 4xx - 0, 5xx - 0
+        others - 0
+      Throughput:   144.79MB/s
+
+This is strangely much higher, in terms of throughput, and faster, in
+terms of latency, than testing against our own servers. Different
+avenues were explored to explain that disparity with our servers:
+
+ * jumbo frames? nope, both connections see packets larger than 1500
+   bytes
+ * protocol differences? nope, both go over IPv6 and (probably) HTTP/2
+   (at least not over UDP)
+ * different link speeds
+
+The last theory is currently the only one standing. Indeed, 144.79MB/s
+should not be possible on regular gigabit ethernet (GigE), as it is
+actually *more* than 1000Mbit/s (1158.32Mbit/s). Sometimes the above
+benchmark even gives 152MB/s (1222Mbit/s), way beyond what a regular
+GigE link should be able to provide.
+
+### Other tools
+
+Siege has trouble going above ~100 concurrent clients because of its
+design (and ulimit) limitations. Its interactive features are also
+limited; here's a set of interesting alternatives:
+
+ * [bombardier](https://github.com/codesenberg/bombardier) - golang, HTTP/2, better performance than siege in
+   my (2017) tests, not in Debian
+ * [boom](https://github.com/tarekziade/boom) - python rewrite of apachebench, supports duration,
+   HTTP/2, not in Debian, unsearchable name
+ * [go-wrk](https://github.com/adjust/go-wrk/) - golang rewrite of wrk with HTTPS, had performance
+   issues in my first tests (2017), [no duration target](https://github.com/adjust/go-wrk/issues/2), not in
+   Debian
+ * [hey](https://github.com/rakyll/hey) - golang rewrite of apachebench, similar to boom, not in
+   Debian ([ITP #943596](https://bugs.debian.org/943596)), unsearchable name
+ * [Jmeter](https://jmeter.apache.org/) - interactive behavior, can replay recorded sessions
+   from browsers
+ * [Locust](https://locust.io/) - distributed, can model login and interactive
+   behavior, not in Debian
+ * [Tsung](http://tsung.erlang-projects.org/1/01/about/) - multi-protocol, distributed, erlang
+ * [wrk](https://github.com/wg/wrk/) - multithreaded, epoll, Lua scriptable, no HTTPS, only in
+   Debian unstable
+
+## Alternatives considered
+
+Four alternatives were seriously considered:
+
+ * Apache Traffic Server
+ * Nginx proxying + caching
+ * Varnish + stunnel
+ * Fastly
+
+Other alternatives were not:
+
+ * [Apache HTTPD caching](https://httpd.apache.org/docs/2.4/caching.html) - performance expected to be sub-par
+ * [Envoy][] - [not designed for caching](https://github.com/envoyproxy/envoy/issues/868), [external cache support
+   planned in 2019](https://blog.getambassador.io/envoy-proxy-in-2019-security-caching-wasm-http-3-and-more-e5ba82da0197?gi=82c1a78157b8)
+ * [HAproxy](https://www.haproxy.com/) - [not designed to cache large objects](https://www.haproxy.com/documentation/aloha/9-5/traffic-management/lb-layer7/caching-small-objects/)
+ * [Ledge](https://github.com/ledgetech/ledge) - caching extension to Nginx with ESI, Redis, and cache
+   purge support, not packaged in Debian
+ * [Nuster](https://github.com/jiangwenyuan/nuster) - new project, not packaged in Debian (based on
+   HAproxy), performance [comparable with nginx and varnish](https://github.com/jiangwenyuan/nuster/wiki/Web-cache-server-performance-benchmark:-nuster-vs-nginx-vs-varnish-vs-squid#results)
+   according to upstream, although with
impressive improvements
+ * [Polipo](https://en.wikipedia.org/wiki/Polipo) - not designed for production use
+ * [Squid](http://www.squid-cache.org/) - not designed as a reverse proxy
+ * [Traefik](https://traefik.io/) - [not designed for caching](https://github.com/containous/traefik/issues/878)
+
+[Envoy]: https://www.envoyproxy.io/
+
+### Apache Traffic Server
+
+#### Summary of online reviews
+
+Pros:
+
+ * HTTPS
+ * HTTP/2
+ * industry leader (behind cloudflare)
+ * out of the box clustering support
+
+Cons:
+
+ * load balancing is an experimental plugin (at least in 2016)
+ * no static file serving? or slower?
+ * no commercial support
+
+Used by Yahoo, Apple and Comcast.
+
+#### First impressions
+
+Pros:
+
+ * [Puppet module available](https://forge.puppet.com/brainsware/trafficserver)
+ * no query logging by default (good?)
+ * good documentation, but a bit lacking in tutorials
+ * nice little dashboard shipped by default (`traffic_top`) although
+   it could be more useful (doesn't seem to show hit ratio clearly)
+
+Cons:
+
+ * configuration spread out over many different configuration files
+ * complex and arcane configuration language (e.g. try to guess what
+   this actually does: `CONFIG proxy.config.http.server_ports STRING
+   8080:ipv6:tr-full 443:ssl
+   ip-in=192.168.17.1:80:ip-out=[fc01:10:10:1::1]:ip-out=10.10.10.1`)
+ * configuration syntax varies across config files and plugins
+ * <del>couldn't decouple backend hostname and passed `Host`
+   header</del> bad random tutorial found on the internet
+ * couldn't figure out how to make HTTP/2 work
+ * no prometheus exporters
+
+
+#### Configuration
+
+    apt install trafficserver
+
+Default Debian config seems sane when compared to the [Cicimov
+tutorial][cicimov]. One thing we will need to change is the [default listening
+port][], which is by default:
+
+[default listening port]: https://docs.trafficserver.apache.org/en/8.0.x/admin-guide/files/records.config.en.html#proxy.config.http.server_ports
+
+    CONFIG proxy.config.http.server_ports STRING 8080 8080:ipv6
 
 We want something more like this:
 
@@ -312,9 +378,9 @@ And finally curl is able to talk to the proxy:
 
     curl --proxy-cacert /etc/ssl/torproject-auto/servercerts/ca.crt --proxy https://cache01.torproject.org/ https://blog.torproject.org
 
-### Troubleshooting
+#### Troubleshooting
 
-#### Proxy fails to hit backend:
+##### Proxy fails to hit backend:
 
     curl: (56) Received HTTP code 404 from proxy after CONNECT
 
@@ -362,15 +428,15 @@ the `Foo` header appears in the request.
 
 The solution to this is the
 `proxy.config.url_remap.pristine_host_hdr` documented above.
 
-#### HTTP/2 support missing
+##### HTTP/2 support missing
 
 Next hurdle: no HTTP/2 support, even when using `proto=http2;http`
 (falls back on `HTTP/1.1`) and `proto=http2` only (fails with
 `WARNING: Unregistered protocol type 0`).
 
-### Benchmarks
+#### Benchmarks
 
-#### Same host tests
+##### Same host tests
 
 With `blog.tpo` in `/etc/hosts`, because `proxy-host` doesn't work,
 and running on the same host as the proxy (!), cold cache:
 
@@ -483,7 +549,7 @@ ab:
      99%    172
     100%    196 (longest request)
 
-#### Separate host
+##### Separate host
 
 Those tests were performed from one cache server to the other, to
 avoid the benchmarking tool fighting for resources with the server.
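 
 A sanity check worth doing before any hot-cache benchmark, whatever
 the proxy: confirm the second request is actually served from the
 cache, otherwise the numbers measure the backend instead. A
 hypothetical check against the Nginx setup in this document, which
 adds an `X-Cache-Status` header:
 
     curl -sI https://blog.torproject.org/ | grep -i x-cache-status
     curl -sI https://blog.torproject.org/ | grep -i x-cache-status
 
 The first run should print `MISS` (or `EXPIRED`), the second `HIT`;
 `BYPASS` would mean a cookie or header is defeating the cache.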
@@ -581,321 +647,44 @@ connexion: 75% 57.98ms 90% 69.05ms 95% 78.44ms - 99% 128.34ms - HTTP codes: - 1xx - 0, 2xx - 241187, 3xx - 0, 4xx - 0, 5xx - 0 - others - 0 - Throughput: 104.67MB/s - -It might be because it supports doing HTTP/2 requests and, indeed, the -`Throughput` drops down to `14MB/s` when we use the `--http1` flag, -along with rates closer to ab: - - anarcat@cache-02:~$ ./go/bin/bombardier --duration=2m --latencies https://blog.torproject.org/ --http1 -c 100 - Bombarding https://blog.torproject.org:443/ for 2m0s using 100 connection(s) - [=========================================================================] 2m0s - Done! - Statistics Avg Stdev Max - Reqs/sec 1322.21 253.18 1911.21 - Latency 78.40ms 18.65ms 688.60ms - Latency Distribution - 50% 75.53ms - 75% 88.52ms - 90% 101.30ms - 95% 110.68ms - 99% 132.89ms - HTTP codes: - 1xx - 0, 2xx - 153114, 3xx - 0, 4xx - 0, 5xx - 0 - others - 0 - Throughput: 14.22MB/s - -Inter-server communication is good, according to `iperf3`: - - [ ID] Interval Transfer Bitrate - [ 5] 0.00-10.04 sec 1.00 GBytes 859 Mbits/sec receiver - -So we see the roundtrip does add significant overhead to ab and -siege. It's possible this is due to the nature of the virtual server, -much less powerful than the server. This seems to be confirmed by -`bombardieer`'s success, since it's possibly better designed than the -other two to maximize resources on the client side. - -# Discussion - -A discussion of the design of the new service, mostly. - -## Overview - -The original goal of this project is to create a pair of caching -servers in front of the blog to reduce the bandwidth costs we're being -charged there. - -## Goals - -### Must have - - * reduce the traffic on the blog, hosted at a costly provider (#32090) - * HTTPS support in the frontend and backend - * deployment through Puppet - * anonymized logs - * hit rate stats - -### Nice to have - - * provide a frontend for our existing mirror infrastructure, a - home-made CDN for TBB and other releases - * no on-disk logs - * cute dashboard or grafana integration - * well-maintained upstream Puppet module - -### Approvals required - - * approved and requested by vegas - -## Non-Goals - - * global CDN for users outside of TPO - * geoDNS - -## Proposed Solution - -Nginx will be deployed on two servers. ATS was found to be somewhat -difficult to configure and debug, while Nginx has a more "regular" -configuration file format. Furthermore, performance was equivalent or -better in Nginx. - -Finally, there is the possibility of converging all HTTP services -towards Nginx if desired, which would reduce the number of moving -parts in the infrastructure. - -## Launch checklist - -See [#32239](https://trac.torproject.org/projects/tor/ticket/32239). - -## Benchmarking procedures - -Will require a test VM (or two?) to hit the caches. - -### Common procedure - - 1. punch a hole in the firewall to allow cache2 to access cache1 - - iptables -I INPUT -s 78.47.61.104 -j ACCEPT - ip6tables -I INPUT -s 2a01:4f8:c010:25ff::1 -j ACCEPT - - 2. point the blog to cache1 on cache2 in `/etc/hosts`: - - 116.202.120.172 blog.torproject.org - 2a01:4f8:fff0:4f:266:37ff:fe26:d6e1 blog.torproject.org - - 3. disable Puppet: - - puppet agent --disable 'benchmarking requires /etc/hosts override' - - 4. 
launch the benchmark - -### Siege - -Siege configuration sample: - -``` -verbose = false -fullurl = true -concurrent = 100 -time = 2M -url = http://www.example.com/ -delay = 1 -internet = false -benchmark = true -``` - -Might require this, which might work only with varnish: - -``` -proxy-host = 209.44.112.101 -proxy-port = 80 -``` - -Alternative is to hack `/etc/hosts`. - -### apachebench - -Classic commandline: - - ab2 -n 1000 -c 100 -X cache01.torproject.org https://example.com/ - -`-X` also doesn't work with ATS, hacked `/etc/hosts`. - -### bombardier - -Unfortunately, the [bombardier package in Debian](https://tracker.debian.org/pkg/bombardier) is *not* the HTTP -benchmarking tool but a commandline game. It's still possible to -install it in Debian with: - - export GOPATH=$HOME/go - apt install golang - go get -v github.com/codesenberg/bombardier - -Then running the benchmark is as simple as: - - ./go/bin/bombardier --duration=2m --latencies https://blog.torproject.org/ - -Baseline benchmark, from cache02: - - anarcat@cache-02:~$ ./go/bin/bombardier --duration=2m --latencies https://blog.torproject.org/ -c 100 - Bombarding https://blog.torproject.org:443/ for 2m0s using 100 connection(s) - [================================================================================================================================================================] 2m0s - Done! - Statistics Avg Stdev Max - Reqs/sec 2796.01 716.69 6891.48 - Latency 35.96ms 22.59ms 1.02s - Latency Distribution - 50% 33.07ms - 75% 40.06ms - 90% 47.91ms - 95% 54.66ms - 99% 75.69ms - HTTP codes: - 1xx - 0, 2xx - 333646, 3xx - 0, 4xx - 0, 5xx - 0 - others - 0 - Throughput: 144.79MB/s - -This is strangely much higher, in terms of throughput, and faster, in -terms of latency, than testing against our own servers. Different -avenues were explored to explain that disparity with our servers: - - * jumbo frames? nope, both connexions see packets larger than 1500 - bytes - * protocol differences? nope, both go over IPv6 and (probably) HTTP/2 - (at least not over UDP) - * different link speeds - -The last theory is currently the only one standing. Indeed, 144.79MB/s -should not be possible on regular gigabit ethernet (GigE), as it is -actually *more* than 1000Mbit/s (1158.32Mbit/s). Sometimes the above -benchmark even gives 152MB/s (1222Mbit/s), way beyond what a regular -GigE link should be able to provide. - -### Other tools - -Siege has trouble going above ~100 concurrent clients because of its -design (and ulimit) limitations. 
Its interactive features are also -limited, here's a set of interesting alternatives: - - * [bombardier](https://github.com/codesenberg/bombardier) - golang, HTTP/2, better performance than siege in - my (2017) tests, not in debian - * [boom](https://github.com/tarekziade/boom) - python rewrite of apachebench, supports duration, - HTTP/2, not in debian, unsearchable name - * [go-wrk](https://github.com/adjust/go-wrk/) - golang rewrite of wrk with HTTPS, had performance - issues in my first tests (2017), [no duration target](https://github.com/adjust/go-wrk/issues/2), not in - Debian - * [hey](https://github.com/rakyll/hey) - golang rewrite of apachebench, similar to boom, not in - debian ([ITP #943596](https://bugs.debian.org/943596)), unsearchable name - * [Jmeter](https://jmeter.apache.org/) - interactive behavior, can replay recorded sessions - from browsers - * [Locust](https://locust.io/) - distributed, can model login and interactive - behavior, not in Debian - * [Tsung](http://tsung.erlang-projects.org/1/01/about/) - multi-protocol, distributed, erlang - * [wrk](https://github.com/wg/wrk/) - multithreaded, epoll, Lua scriptable, no HTTPS, only in - Debian unstable - -## Cost - -Somewhere between 11EUR and 100EUR/mth for bandwidth and hardware. - -We're getting apparently around 2.2M "page views" per month at -Pantheon. That is about 1 hit per second and 12 terabyte per month, -36Mbit/s on average: - - $ qalc - > 2 200 000 ∕ (30d) to hertz - - 2200000 / (30 * day) = approx. 0.84876543 Hz - - > 2 200 000 * 5Mibyte - - 2200000 * (5 * mebibyte) = 11.534336 terabytes - - > 2 200 000 * 5Mibyte/(30d) to megabit / s - - (2200000 * (5 * mebibyte)) / (30 * day) = approx. 35.599802 megabits / s - -Hetzner charges 1EUR/TB/month over our 1TB quota, so bandwidth would -cost 11EUR/month on average. If costs become prohibitive, we could -switch to a Hetzner VM which includ 20TB of traffic per month at costs -ranging from 3EUR/mth to 30EUR/mth depending on the VPS size (between -1 vCPU, 2GB ram, 20GB SSD and 8vCPU, 32GB ram and 240GB SSD). - -Dedicated servers start at 34EUR/mth (`EX42`, 64GB ram 2x4TB HDD) for -unlimited gigabit. 
- -## Alternatives considered - -Four alternatives were seriously considered: - - * Apache Traffic Server - * Nginx proxying + caching - * Varnish + stunnel - * Fastly - -Other alternatives were not: - - * [Apache HTTPD caching](https://httpd.apache.org/docs/2.4/caching.html) - performance expected to be sub-par - * [Envoy][] - [not designed for caching](https://github.com/envoyproxy/envoy/issues/868), [external cache support - planned in 2019](https://blog.getambassador.io/envoy-proxy-in-2019-security-caching-wasm-http-3-and-more-e5ba82da0197?gi=82c1a78157b8) - * [HAproxy](https://www.haproxy.com/) - [not designed to cache large objects](https://www.haproxy.com/documentation/aloha/9-5/traffic-management/lb-layer7/caching-small-objects/) - * [Ledge](https://github.com/ledgetech/ledge) - caching extension to Nginx with ESI, Redis, and cache - purge support, not packaged in Debian - * [Nuster](https://github.com/jiangwenyuan/nuster) - new project, not packaged in Debian (based on - HAproxy), performance [comparable with nginx and varnish](https://github.com/jiangwenyuan/nuster/wiki/Web-cache-server-performance-benchmark:-nuster-vs-nginx-vs-varnish-vs-squid#results) - according to upstream, although impressive improvements - * [Polipo](https://en.wikipedia.org/wiki/Polipo) - not designed for production use - * [Squid](http://www.squid-cache.org/) - not designed as a reverse proxy - * [Traefik](https://traefik.io/) - [not designed for caching](https://github.com/containous/traefik/issues/878) - -[Envoy]: https://www.envoyproxy.io/ - -### Apache Traffic Server - -#### Summary of online reviews - -Pros: - - * HTTPS - * HTTP/2 - * industry leader (behind cloudflare) - * out of the box clustering support - -Cons: - - * load balancing is an experimental plugin (at least in 2016) - * no static file serving? or slower? - * no commercial support - -Used by Yahoo, Apple and Comcast. + 99% 128.34ms + HTTP codes: + 1xx - 0, 2xx - 241187, 3xx - 0, 4xx - 0, 5xx - 0 + others - 0 + Throughput: 104.67MB/s -#### First impressions +It might be because it supports doing HTTP/2 requests and, indeed, the +`Throughput` drops down to `14MB/s` when we use the `--http1` flag, +along with rates closer to ab: -Pros: + anarcat@cache-02:~$ ./go/bin/bombardier --duration=2m --latencies https://blog.torproject.org/ --http1 -c 100 + Bombarding https://blog.torproject.org:443/ for 2m0s using 100 connection(s) + [=========================================================================] 2m0s + Done! + Statistics Avg Stdev Max + Reqs/sec 1322.21 253.18 1911.21 + Latency 78.40ms 18.65ms 688.60ms + Latency Distribution + 50% 75.53ms + 75% 88.52ms + 90% 101.30ms + 95% 110.68ms + 99% 132.89ms + HTTP codes: + 1xx - 0, 2xx - 153114, 3xx - 0, 4xx - 0, 5xx - 0 + others - 0 + Throughput: 14.22MB/s - * [Puppet module available](https://forge.puppet.com/brainsware/trafficserver) - * no query logging by default (good?) - * good documentation, but a bit lacking in tutorials - * nice little dashboard shipped by default (`traffic_top`) although - it could be more useful (doesn't seem to show hit ratio clearly) +Inter-server communication is good, according to `iperf3`: -Cons: + [ ID] Interval Transfer Bitrate + [ 5] 0.00-10.04 sec 1.00 GBytes 859 Mbits/sec receiver - * configuration spread out over many different configuration file - * complex and arcane configuration language (e.g. 
try to guess what
-   this actually does:: `CONFIG proxy.config.http.server_ports STRING
-   8080:ipv6:tr-full 443:ssl
-   ip-in=192.168.17.1:80:ip-out=[fc01:10:10:1::1]:ip-out=10.10.10.1`)
- * configuration syntax varies across config files and plugins
- * <del>couldn't decouple backend hostname and passed `Host`
-   header</del> bad random tutorial found on the internet
- * couldn't figure out how to make HTTP/2 work
- * no prometheus exporters
+      99% 128.34ms
+  HTTP codes:
+    1xx - 0, 2xx - 241187, 3xx - 0, 4xx - 0, 5xx - 0
+    others - 0
+  Throughput: 104.67MB/s
+
+It might be because it supports doing HTTP/2 requests and, indeed, the
+`Throughput` drops down to `14MB/s` when we use the `--http1` flag,
+along with rates closer to ab:
+
+    anarcat@cache-02:~$ ./go/bin/bombardier --duration=2m --latencies https://blog.torproject.org/ --http1 -c 100
+    Bombarding https://blog.torproject.org:443/ for 2m0s using 100 connection(s)
+    [=========================================================================] 2m0s
+    Done!
+    Statistics        Avg      Stdev        Max
+      Reqs/sec      1322.21     253.18    1911.21
+      Latency       78.40ms    18.65ms   688.60ms
+      Latency Distribution
+         50%    75.53ms
+         75%    88.52ms
+         90%   101.30ms
+         95%   110.68ms
+         99%   132.89ms
+      HTTP codes:
+        1xx - 0, 2xx - 153114, 3xx - 0, 4xx - 0, 5xx - 0
+        others - 0
+      Throughput:    14.22MB/s
+
+Inter-server communication is good, according to `iperf3`:
+
+    [ ID] Interval           Transfer     Bitrate
+    [  5]   0.00-10.04  sec  1.00 GBytes   859 Mbits/sec   receiver
+
+So we see the roundtrip does add significant overhead to ab and
+siege. It's possible this is due to the nature of the virtual server,
+much less powerful than the server. This seems to be confirmed by
+`bombardier`'s success, since it's possibly better designed than the
+other two to maximize resources on the client side.
 
 ### Nginx
 
@@ -952,6 +741,213 @@ Cons:
  * [detailed cache stats][] are only in the "plus" version
 
 [detailed cache stats]: https://docs.nginx.com/nginx/admin-guide/monitoring/live-activity-monitoring/
+
+#### Configuration
+
+picking the "light" Debian package. The modules that would be
+interesting in the others would be "cache purge" (from extras) and "geoip"
+(from full):
+
+    apt install nginx-light
+
+Then drop this config file in `/etc/nginx/sites-available` and symlink
+into `sites-enabled`:
+
+    server_names_hash_bucket_size 64;
+    proxy_cache_path /var/cache/nginx/ levels=1:2 keys_zone=blog:10m;
+
+    server {
+        listen 80;
+        listen [::]:80;
+        listen 443 ssl;
+        listen [::]:443 ssl;
+        ssl_certificate /etc/ssl/torproject/certs/blog.torproject.org.crt-chained;
+        ssl_certificate_key /etc/ssl/private/blog.torproject.org.key;
+
+        server_name blog.torproject.org;
+        proxy_cache blog;
+
+        location / {
+            proxy_pass https://live-tor-blog-8.pantheonsite.io;
+            proxy_set_header Host $host;
+
+            # cache 304
+            proxy_cache_revalidate on;
+
+            # add cookie to cache key
+            #proxy_cache_key "$host$request_uri$cookie_user";
+            # not sure what the cookie name is
+            proxy_cache_key $scheme$proxy_host$request_uri;
+
+            # allow serving stale content on error, timeout, or refresh
+            proxy_cache_use_stale error timeout updating;
+            # allow only first request through backend
+            proxy_cache_lock on;
+
+            # add header
+            add_header X-Cache-Status $upstream_cache_status;
+        }
+    }
+
+... and reload nginx.
+
+I tested that logged-in users bypass the cache and things generally
+work well.
+
+A key problem with Nginx is getting decent statistics out. The
+[upstream nginx exporter](https://github.com/nginxinc/nginx-prometheus-exporter) supports only (basically) hits per second
+through the [stub status module](http://nginx.org/en/docs/http/ngx_http_stub_status_module.html), a very limited module shipped with
+core Nginx. The commercial version, Nginx Plus, supports a [more
+extensive API](https://nginx.org/en/docs/http/ngx_http_api_module.html#api) which includes the hit rate, but that's not an
+option for us.
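+
+To illustrate how little that module gives us, everything it can
+expose comes from a block like this (a sketch, not something we
+deploy):
+
+    location = /stub_status {
+        stub_status;
+        allow 127.0.0.1;
+        deny all;
+    }
+
+and its entire output is a handful of connection counters:
+
+    Active connections: 2
+    server accepts handled requests
+     16630948 16630948 31070465
+    Reading: 0 Writing: 1 Waiting: 1
+
+There is no hit/miss information in there at all, so a hit rate
+cannot be derived from it.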
+
+There are two solutions to work around this problem:
+
+ * create our own metrics using the [Nginx Lua Prometheus module](https://github.com/knyar/nginx-lua-prometheus):
+   this can have performance impacts and involves a custom
+   configuration
+ * write and parse log files, that's the way the [munin plugin](https://github.com/munin-monitoring/contrib/blob/master/plugins/nginx/nginx-cache-hit-rate)
+   works - this could possibly be fed *directly* into [mtail](https://github.com/google/mtail) to
+   avoid storing logs on disk but still get the data (include
+   [`$upstream_cache_status`](http://nginx.org/en/docs/http/ngx_http_upstream_module.html#var_upstream_cache_status) in the logs)
+ * use a third-party module like [vts](https://github.com/vozlt/nginx-module-vts) or [sts](https://github.com/vozlt/nginx-module-sts) and the
+   [exporter](https://github.com/hnlq715/nginx-vts-exporter) to expose those metrics - the vts module doesn't seem
+   to be very well maintained (no release since 2018) and it's unclear
+   if this will work for our use case
+
+Here's an example of how to do the mtail hack. First tell nginx to
+write to syslog, to act as a buffer, so that parsing doesn't slow
+processing, excerpt from the [nginx.conf snippet](https://git.autistici.org/ai3/float/blob/master/roles/nginx/templates/config/nginx.conf#L34):
+
+    # Log response times so that we can compute latency histograms
+    # (using mtail). Works around the lack of Prometheus
+    # instrumentation in NGINX.
+    log_format extended '$server_name:$server_port '
+                        '$remote_addr - $remote_user [$time_local] '
+                        '"$request" $status $body_bytes_sent '
+                        '"$http_referer" "$http_user_agent" '
+                        '$upstream_addr $upstream_response_time $request_time';
+
+    access_log syslog:server=unix:/dev/log,facility=local3,tag=nginx_access extended;
+
+(We would also need to add `$upstream_cache_status` in that format.)
+
+Then count the different stats using mtail, excerpt from the [mtail
+config snippet](https://git.autistici.org/ai3/float/blob/master/roles/base/files/mtail/nginx.mtail):
+
+    # Define the exported metrics.
+    counter nginx_http_request_total
+    counter nginx_http_requests by host, vhost, method, code, backend
+    counter nginx_http_bytes by host, vhost, method, code, backend
+    counter nginx_http_requests_ms by le, host, vhost, method, code, backend
+
+    /(?P<hostname>[-0-9A-Za-z._:]+) nginx_access: (?P<vhost>[-0-9A-Za-z._:]+) (?P<remote_addr>[0-9a-f\.:]+) - - \[[^\]]+\] "(?P<request_method>[A-Z]+) (?P<request_uri>\S+) (?P<http_version>HTTP\/[0-9\.]+)" (?P<status>\d{3}) ((?P<response_size>\d+)|-) "[^"]*" "[^"]*" (?P<upstream_addr>[-0-9A-Za-z._:]+) ((?P<ups_resp_seconds>\d+\.\d+)|-) (?P<request_seconds>\d+)\.(?P<request_milliseconds>\d+)/ {
+
+      nginx_http_request_total++
+
+We'd also need to check the cache status in that parser.
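+
+An untested sketch of what that could look like: append the variable
+to the log format, then key a counter on the captured value (names
+below are illustrative, not from the deployed configuration):
+
+    log_format extended '$server_name:$server_port '
+                        '...'
+                        '$upstream_addr $upstream_response_time $request_time $upstream_cache_status';
+
+and, on the mtail side, extend the regex above with one more capture
+group at the end:
+
+    counter nginx_http_requests_cache by cache_status
+
+    /... (?P<cache_status>[A-Z_-]+)$/ {
+      nginx_http_requests_cache[$cache_status]++
+    }
+
+The hit ratio is then the `HIT` count over the sum across all
+`cache_status` labels, something Prometheus can compute at query
+time.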
+
+References:
+
+ * [NGINX Alphabetical index of variables](https://nginx.org/en/docs/varindex.html)
+ * [NGINX Module ngx_http_proxy_module](https://nginx.org/en/docs/http/ngx_http_proxy_module.html)
+ * [NGINX Content Caching](https://docs.nginx.com/nginx/admin-guide/content-cache/content-caching/)
+ * [NGINX Reverse Proxy](https://docs.nginx.com/nginx/admin-guide/web-server/reverse-proxy/)
+ * [perusio@github.com: Nginx configuration for running Drupal](https://github.com/perusio/drupal-with-nginx) -
+   interesting [snippet](https://github.com/perusio/drupal-with-nginx/blob/D7/apps/drupal/map_cache.conf) for cookies handling, not required
+ * [NGINX: Maximizing Drupal 8 Performance with NGINX, Part 2: Caching and Load Balancing](https://www.nginx.com/blog/maximizing-drupal-8-performance-nginx-part-ii-caching-load-balancing/)
+
+#### Benchmarks
+
+ab:
+
+    root@cache-02:~# ab -c 100 -n 1000 https://blog.torproject.org/
+    [...]
+    Server Software:        nginx/1.14.2
+    Server Hostname:        blog.torproject.org
+    Server Port:            443
+    SSL/TLS Protocol:       TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,4096,256
+    Server Temp Key:        X25519 253 bits
+    TLS Server Name:        blog.torproject.org
+
+    Document Path:          /
+    Document Length:        53313 bytes
+
+    Concurrency Level:      100
+    Time taken for tests:   3.083 seconds
+    Complete requests:      1000
+    Failed requests:        0
+    Total transferred:      54458000 bytes
+    HTML transferred:       53313000 bytes
+    Requests per second:    324.31 [#/sec] (mean)
+    Time per request:       308.349 [ms] (mean)
+    Time per request:       3.083 [ms] (mean, across all concurrent requests)
+    Transfer rate:          17247.25 [Kbytes/sec] received
+
+    Connection Times (ms)
+                  min  mean[+/-sd] median   max
+    Connect:       30  255  78.0    262     458
+    Processing:    18   35  19.2     28     119
+    Waiting:        7   19   7.4     18      58
+    Total:         81  290  88.3    291     569
+
+    Percentage of the requests served within a certain time (ms)
+      50%    291
+      66%    298
+      75%    303
+      80%    306
+      90%    321
+      95%    533
+      98%    561
+      99%    562
+     100%    569 (longest request)
+
+About 50% faster than ATS.
+
+Siege:
+
+    Transactions:                  32246 hits
+    Availability:                 100.00 %
+    Elapsed time:                 119.57 secs
+    Data transferred:            1639.49 MB
+    Response time:                  0.37 secs
+    Transaction rate:             269.68 trans/sec
+    Throughput:                    13.71 MB/sec
+    Concurrency:                   99.60
+    Successful transactions:       32246
+    Failed transactions:               0
+    Longest transaction:            1.65
+    Shortest transaction:           0.23
+
+Almost an order of magnitude faster than ATS.
+
+Bombardier:
+
+    anarcat@cache-02:~$ ./go/bin/bombardier --duration=2m --latencies https://blog.torproject.org/ -c 100
+    Bombarding https://blog.torproject.org:443/ for 2m0s using 100 connection(s)
+    [=========================================================================] 2m0s
+    Done!
+    Statistics        Avg      Stdev        Max
+      Reqs/sec      2116.74     506.01    5495.77
+      Latency       48.42ms    34.25ms      2.15s
+      Latency Distribution
+         50%    37.19ms
+         75%    50.44ms
+         90%    89.58ms
+         95%   109.59ms
+         99%   169.69ms
+      HTTP codes:
+        1xx - 0, 2xx - 247827, 3xx - 0, 4xx - 0, 5xx - 0
+        others - 0
+      Throughput:   107.43MB/s
+
+Almost maxes out the gigabit connection as well, but only marginally
+faster (~3%?) than ATS.
+
+Does not max out the theoretical gigabit performance, which [is
+apparently](http://rickardnobel.se/actual-throughput-on-gigabit-ethernet/) at around 118MB/s without jumbo frames (and 123MB/s
+with).
+
 ### Varnish
 
 Pros:
--
GitLab