AESNI not in use with OpenSSL 1.0.1 on tor 0.2.3.14-alpha
The 0.2.3.14-alpha changelog (https://gitweb.torproject.org/tor.git/blob/tor-0.2.3.14-alpha:/ChangeLog) states that AESNI will be used. This does not seem to be the case:
# uname -a
FreeBSD metaverse.dfri.se 9.0-RELEASE FreeBSD 9.0-RELEASE #0: Tue Jan 3 07:46:30 UTC 2012 root@farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
# sysctl -a |egrep 'hw.machine|hw.model'
hw.machine: amd64
hw.model: Intel(R) Xeon(R) CPU E31230 @ 3.20GHz
hw.machine_arch: amd64
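To rule out the hardware side, one can check that the CPU itself advertises AESNI. A minimal sketch, assuming the boot messages are kept in /var/run/dmesg.boot (the usual FreeBSD location) and that CPU features are listed on a "Features2=" line; the helper name has_aesni is made up for illustration:

```shell
#!/bin/sh
# has_aesni [FILE]: succeed if a CPU "Features2=" line in FILE lists AESNI.
# FILE defaults to /var/run/dmesg.boot (assumed location of the boot log).
has_aesni() {
    grep 'Features2=' "${1:-/var/run/dmesg.boot}" 2>/dev/null | grep -q 'AESNI'
}

if has_aesni; then
    echo "CPU reports AESNI"
else
    echo "AESNI not reported (or boot log unavailable)"
fi
```

If this reports AESNI, the remaining question is whether OpenSSL and tor actually use it.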
OpenSSL build:
./config -shared --prefix=/usr/local/testbuild/
libevent-2.0.18-stable:
CFLAGS=-I/usr/local/testbuild/include LDFLAGS=-L/usr/local/testbuild/lib ./configure --prefix=/usr/local/testbuild && make && make install
tor-0.2.3.14-alpha:
./configure --with-openssl-dir=/usr/local/testbuild/lib --disable-asciidoc --enable-gcc-warnings-advisory --enable-gcc-hardening --enable-linker-hardening --with-libevent-dir=/usr/local/testbuild/lib --prefix=/usr/local/testbuild
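It may also help to benchmark the OpenSSL build on its own, independent of tor. As far as I understand, OpenSSL 1.0.1 reaches its AESNI code through the EVP interface rather than the low-level AES routines, so a large gap between the two runs below would suggest the library can use AESNI even if tor does not benefit. This is only a sketch: the function name compare_aes_speed is hypothetical, and the binary path in the usage note is an assumption derived from the --prefix used above.

```shell
#!/bin/sh
# compare_aes_speed [OPENSSL_BIN]
# Runs OpenSSL's built-in AES-128-CBC benchmark twice: once through the
# low-level AES routines and once through the EVP interface.
compare_aes_speed() {
    bin="${1:-openssl}"
    if ! command -v "$bin" >/dev/null 2>&1; then
        echo "openssl binary not found: $bin" >&2
        return 1
    fi
    "$bin" version
    echo "== low-level AES =="
    "$bin" speed aes-128-cbc
    echo "== EVP interface =="
    "$bin" speed -evp aes-128-cbc
}
```

Usage, assuming the test build installed its binary under the prefix: compare_aes_speed /usr/local/testbuild/bin/openssl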
Results from bench:
OpenSSL 0.9.8:
##### dmap
nbits=65536
digestmap_set: 40.31 ns per element
digestmap_get: 30.15 ns per element
digestset_add: 9.94 ns per element
digestset_isin: 5.56 ns per element.
Hits == 32866304
False positive rate on digestset: 0.23%
##### aes
1 bytes: 13.08 nsec per byte
2 bytes: 9.55 nsec per byte
4 bytes: 7.89 nsec per byte
8 bytes: 7.04 nsec per byte
16 bytes: 6.67 nsec per byte
32 bytes: 6.48 nsec per byte
64 bytes: 6.38 nsec per byte
128 bytes: 6.40 nsec per byte
256 bytes: 6.35 nsec per byte
512 bytes: 6.32 nsec per byte
1024 bytes: 6.31 nsec per byte
2048 bytes: 6.30 nsec per byte
4096 bytes: 6.30 nsec per byte
8192 bytes: 6.30 nsec per byte
##### cell_aes
509 bytes, misaligned by 0: 6.12 nsec per byte
509 bytes, misaligned by 1: 6.12 nsec per byte
509 bytes, misaligned by 2: 6.12 nsec per byte
509 bytes, misaligned by 3: 6.12 nsec per byte
509 bytes, misaligned by 4: 6.12 nsec per byte
509 bytes, misaligned by 5: 6.12 nsec per byte
509 bytes, misaligned by 6: 6.12 nsec per byte
509 bytes, misaligned by 7: 6.12 nsec per byte
509 bytes, misaligned by 8: 6.13 nsec per byte
509 bytes, misaligned by 9: 6.12 nsec per byte
509 bytes, misaligned by 10: 6.12 nsec per byte
509 bytes, misaligned by 11: 6.13 nsec per byte
509 bytes, misaligned by 12: 6.12 nsec per byte
509 bytes, misaligned by 13: 6.12 nsec per byte
509 bytes, misaligned by 14: 6.12 nsec per byte
509 bytes, misaligned by 15: 6.12 nsec per byte
##### cell_ops
Inbound cells: 3126.88 ns per cell. (6.14 ns per byte of payload)
Outbound cells: 3131.38 ns per cell. (6.15 ns per byte of payload)
OpenSSL 1.0.1:
##### dmap
nbits=65536
digestmap_set: 151.35 ns per element
digestmap_get: 123.08 ns per element
digestset_add: 40.74 ns per element
digestset_isin: 29.20 ns per element.
Hits == 32825344
False positive rate on digestset: 0.21%
##### aes
1 bytes: 36.85 nsec per byte
2 bytes: 24.55 nsec per byte
4 bytes: 17.58 nsec per byte
8 bytes: 14.48 nsec per byte
16 bytes: 11.47 nsec per byte
32 bytes: 10.53 nsec per byte
64 bytes: 10.05 nsec per byte
128 bytes: 3.21 nsec per byte
256 bytes: 2.65 nsec per byte
512 bytes: 2.36 nsec per byte
1024 bytes: 2.23 nsec per byte
2048 bytes: 2.16 nsec per byte
4096 bytes: 2.12 nsec per byte
8192 bytes: 2.10 nsec per byte
##### cell_aes
509 bytes, misaligned by 0: 2.74 nsec per byte
509 bytes, misaligned by 1: 2.74 nsec per byte
509 bytes, misaligned by 2: 2.74 nsec per byte
509 bytes, misaligned by 3: 2.74 nsec per byte
509 bytes, misaligned by 4: 2.74 nsec per byte
509 bytes, misaligned by 5: 2.74 nsec per byte
509 bytes, misaligned by 6: 2.74 nsec per byte
509 bytes, misaligned by 7: 2.74 nsec per byte
509 bytes, misaligned by 8: 2.74 nsec per byte
509 bytes, misaligned by 9: 2.74 nsec per byte
509 bytes, misaligned by 10: 2.74 nsec per byte
509 bytes, misaligned by 11: 2.74 nsec per byte
509 bytes, misaligned by 12: 2.74 nsec per byte
509 bytes, misaligned by 13: 2.74 nsec per byte
509 bytes, misaligned by 14: 2.74 nsec per byte
509 bytes, misaligned by 15: 2.74 nsec per byte
##### cell_ops
Inbound cells: 1414.43 ns per cell. (2.78 ns per byte of payload)
Outbound cells: 1518.10 ns per cell. (2.98 ns per byte of payload)
This is nowhere near the dramatic performance improvement seen in legacy/trac#5406.
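To put numbers on that, the ratios between the 0.9.8 and 1.0.1 results above can be computed directly. A small sketch using awk; the helper name speedup is made up for illustration, and the inputs are copied from the bench output in this ticket:

```shell
#!/bin/sh
# speedup OLD NEW: print OLD/NEW to two decimals (values > 1 mean NEW is faster).
speedup() {
    LC_ALL=C awk -v old="$1" -v new="$2" 'BEGIN { printf "%.2f\n", old / new }'
}

echo "cell_aes (509 bytes): $(speedup 6.12 2.74)x"
echo "inbound cells:        $(speedup 3126.88 1414.43)x"
echo "aes, 8192 bytes:      $(speedup 6.30 2.10)x"
echo "aes, 16 bytes:        $(speedup 6.67 11.47)x"
```

So the cell path is a little over 2x faster with 1.0.1, large buffers about 3x, and 16-byte blocks are actually slower than with 0.9.8.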
For comparison, here are benchmarks from a machine that does not have AESNI, with tor benched against OpenSSL 0.9.8 and 1.0.1:
OpenSSL 0.9.8:
##### dmap
nbits=65536
digestmap_set: 40.31 ns per element
digestmap_get: 30.15 ns per element
digestset_add: 9.94 ns per element
digestset_isin: 5.56 ns per element.
Hits == 32866304
False positive rate on digestset: 0.23%
##### aes
1 bytes: 13.08 nsec per byte
2 bytes: 9.55 nsec per byte
4 bytes: 7.89 nsec per byte
8 bytes: 7.04 nsec per byte
16 bytes: 6.67 nsec per byte
32 bytes: 6.48 nsec per byte
64 bytes: 6.38 nsec per byte
128 bytes: 6.40 nsec per byte
256 bytes: 6.35 nsec per byte
512 bytes: 6.32 nsec per byte
1024 bytes: 6.31 nsec per byte
2048 bytes: 6.30 nsec per byte
4096 bytes: 6.30 nsec per byte
8192 bytes: 6.30 nsec per byte
##### cell_aes
509 bytes, misaligned by 0: 6.12 nsec per byte
509 bytes, misaligned by 1: 6.12 nsec per byte
509 bytes, misaligned by 2: 6.12 nsec per byte
509 bytes, misaligned by 3: 6.12 nsec per byte
509 bytes, misaligned by 4: 6.12 nsec per byte
509 bytes, misaligned by 5: 6.12 nsec per byte
509 bytes, misaligned by 6: 6.12 nsec per byte
509 bytes, misaligned by 7: 6.12 nsec per byte
509 bytes, misaligned by 8: 6.13 nsec per byte
509 bytes, misaligned by 9: 6.12 nsec per byte
509 bytes, misaligned by 10: 6.12 nsec per byte
509 bytes, misaligned by 11: 6.13 nsec per byte
509 bytes, misaligned by 12: 6.12 nsec per byte
509 bytes, misaligned by 13: 6.12 nsec per byte
509 bytes, misaligned by 14: 6.12 nsec per byte
509 bytes, misaligned by 15: 6.12 nsec per byte
##### cell_ops
Inbound cells: 3126.88 ns per cell. (6.14 ns per byte of payload)
Outbound cells: 3131.38 ns per cell. (6.15 ns per byte of payload)
OpenSSL 1.0.1:
##### dmap
nbits=65536
digestmap_set: 151.35 ns per element
digestmap_get: 123.08 ns per element
digestset_add: 40.74 ns per element
digestset_isin: 29.20 ns per element.
Hits == 32825344
False positive rate on digestset: 0.21%
##### aes
1 bytes: 36.85 nsec per byte
2 bytes: 24.55 nsec per byte
4 bytes: 17.58 nsec per byte
8 bytes: 14.48 nsec per byte
16 bytes: 11.47 nsec per byte
32 bytes: 10.53 nsec per byte
64 bytes: 10.05 nsec per byte
128 bytes: 3.21 nsec per byte
256 bytes: 2.65 nsec per byte
512 bytes: 2.36 nsec per byte
1024 bytes: 2.23 nsec per byte
2048 bytes: 2.16 nsec per byte
4096 bytes: 2.12 nsec per byte
8192 bytes: 2.10 nsec per byte
##### cell_aes
509 bytes, misaligned by 0: 2.74 nsec per byte
509 bytes, misaligned by 1: 2.74 nsec per byte
509 bytes, misaligned by 2: 2.74 nsec per byte
509 bytes, misaligned by 3: 2.74 nsec per byte
509 bytes, misaligned by 4: 2.74 nsec per byte
509 bytes, misaligned by 5: 2.74 nsec per byte
509 bytes, misaligned by 6: 2.74 nsec per byte
509 bytes, misaligned by 7: 2.74 nsec per byte
509 bytes, misaligned by 8: 2.74 nsec per byte
509 bytes, misaligned by 9: 2.74 nsec per byte
509 bytes, misaligned by 10: 2.74 nsec per byte
509 bytes, misaligned by 11: 2.74 nsec per byte
509 bytes, misaligned by 12: 2.74 nsec per byte
509 bytes, misaligned by 13: 2.74 nsec per byte
509 bytes, misaligned by 14: 2.74 nsec per byte
509 bytes, misaligned by 15: 2.74 nsec per byte
##### cell_ops
Inbound cells: 1414.43 ns per cell. (2.78 ns per byte of payload)
Outbound cells: 1518.10 ns per cell. (2.98 ns per byte of payload)