Investigate possible delay introduced by TLS Time To First Byte (TTTFB)
Tor instances talk to each other using fixed-size TLS records: the observed TLS Record Layer "Application Data" length grows in fixed steps with the number of Tor cells delivered, visible in Wireshark via the filter tls.record.length.
Cells ⇒ TLS Application Data Length:
- 1 Tor cell: 531 bytes
- 2 Tor cells: 1045 bytes
- 3 Tor cells: 1559 bytes
- 4 Tor cells: 2073 bytes
- 5 Tor cells: 2587 bytes
- 6 Tor cells: 3101 bytes
- 7 Tor cells: 3615 bytes
- 8 Tor cells: 4065 bytes
Therefore, the length is based on a fixed starting value of 17 bytes plus the number of cells multiplied by 514 bytes:

length = 17 bytes + (C * N)

N = number of cells
C = number of bytes a cell occupies on the wire in its longest form, CELL_MAX_NETWORK_SIZE = 514 bytes
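To make this concrete, here is a minimal sketch that reproduces the table above (the constants are taken from the observations; the 8-cell case is where it breaks down, as discussed next):

```c
#include <stdio.h>

/* Observed constants: 514 bytes per cell on the wire
 * (CELL_MAX_NETWORK_SIZE) plus a fixed 17-byte prefix per
 * TLS application-data record. */
#define CELL_WIRE_BYTES 514
#define RECORD_PREFIX   17

/* Expected TLS application-data length for n complete cells. */
static unsigned record_len(unsigned n_cells)
{
    return RECORD_PREFIX + CELL_WIRE_BYTES * n_cells;
}

int main(void)
{
    /* Prints 531, 1045, ..., 3615 as observed -- but 4129 for
     * 8 cells, not the 4065 actually seen on the wire. */
    for (unsigned n = 1; n <= 8; n++)
        printf("%u cells -> %u bytes\n", n, record_len(n));
    return 0;
}
```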
This seemed plausible to me so far, but hold on: how do we get to 4065 here? Shouldn't it be 4129 for 8 complete Tor cells within a single TLS data record?
Yes: with increased buffers (buffers.c), Tor would package 8 cells into a single TLS record of length 4129.
The current behavior is therefore that if a client has more than 7 cells of data to send, or conversely receives more than 7 cells of data, every 8th cell is fragmented by the buffer limitation across at least 2 TLS records.
Tor buffer size = buf->default_chunk_size = 4096;
4096 - 31 (TLS overhead?) = 4065 bytes limitation. After the 17-byte prefix, that leaves 4048 bytes for cell data: 7 full cells (3598 bytes) plus the first 450 bytes of the 8th.
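A small sketch of the resulting split under that 4065-byte limit (the bookkeeping here is illustrative, not Tor's actual buffer code):

```c
#include <stdio.h>

#define CELL_WIRE_BYTES 514
#define RECORD_PREFIX   17
#define RECORD_LIMIT    4065  /* 4096-byte chunk minus 31 bytes TLS overhead */

int main(void)
{
    /* Cell payload capacity per record after the 17-byte prefix. */
    unsigned capacity = RECORD_LIMIT - RECORD_PREFIX;  /* 4048 bytes */
    unsigned sent = 0;  /* cell bytes already shipped in earlier records */

    for (int rec = 1; rec <= 2; rec++) {
        unsigned offset = sent % CELL_WIRE_BYTES;  /* into the first cell */
        unsigned first  = sent / CELL_WIRE_BYTES + 1;
        sent += capacity;
        unsigned last   = (sent + CELL_WIRE_BYTES - 1) / CELL_WIRE_BYTES;
        printf("record %d: starts %u bytes into cell %u, ends inside cell %u\n",
               rec, offset, first, last);
    }
    /* record 1 carries cells 1..7 plus 450 bytes of cell 8; cell 8
     * only becomes readable once record 2 has arrived in full. */
    return 0;
}
```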
Transport Layer Security
TLSv1.2 Record Layer: Application Data Protocol: Application Data
Content Type: Application Data (23)
Version: TLS 1.2 (0x0303)
Length: 4065
Encrypted Application Data: …
TLS can only decrypt the application data and pass it on to the application once it has received a complete record, which is 4065 bytes in this case. Accordingly, the fragmentation seems particularly susceptible to latency fluctuations: if we assume that we are sending and receiving with full buffers, at least every 8th cell will be fragmented by TLS, and to merge this single 8th cell from 2 consecutive TLS records we need to receive the full 2 × 4065 bytes.
Therefore, we are only able to read the data of the 8th cell once the data of the following 7 cells has been completely transmitted. Worse still, assuming a TCP MTU of 1280, the minimum IPv6 MTU as used by quite a few relays today, a single TLS record in its currently largest permitted form is fragmented across several TCP segments. If a single one of those TCP segments has to be retransmitted, the delivery of the entire TLS record is delayed by a whole round-trip time, and all other cells contained in it suffer this artificial delay before they can be read, even though they have already arrived and are buffered. A smaller record size would help match the record size to the segments that TCP is sending.
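As a back-of-the-envelope check of that segmentation, assuming plain 40-byte IPv6 and 20-byte TCP headers (no options or extension headers) and the standard 5-byte TLSv1.2 record header:

```c
#include <stdio.h>

int main(void)
{
    unsigned mtu = 1280;           /* minimum IPv6 MTU */
    unsigned mss = mtu - 40 - 20;  /* minus IPv6 + TCP headers = 1220 */
    unsigned record = 5 + 4065;    /* TLS record header + application data */

    /* Segments needed for one full-size record (ceiling division). */
    unsigned segments = (record + mss - 1) / mss;
    printf("%u-byte record -> %u TCP segments at MSS %u\n",
           record, segments, mss);
    /* 4070 bytes -> 4 segments; losing any one of them stalls the
     * whole record, and every cell in it, for a retransmit RTT. */
    return 0;
}
```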
Wouldn't we at least want to prevent this fragmentation, which is conditioned by TLS and the known, fixed cell size, and accordingly choose a buffer value aligned to cell boundaries that can hold a whole number of cells?
Fitting 8 cells into a single TLS record requires increasing the buffer size by 64 bytes, from 4096 to 4160:
(514 * 8) + 17 + 31 = 4160
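The same arithmetic as a small helper, generalizing the 4160-byte case (the 31-byte overhead term is carried over from the estimate above):

```c
#include <stdio.h>

#define CELL_WIRE_BYTES 514
#define RECORD_PREFIX   17
#define TLS_OVERHEAD    31  /* estimated from the 4096/4065 gap above */

/* Buffer chunk size needed so n whole cells fit in one TLS record. */
static unsigned chunk_for_cells(unsigned n)
{
    return CELL_WIRE_BYTES * n + RECORD_PREFIX + TLS_OVERHEAD;
}

int main(void)
{
    printf("8 cells need a %u-byte chunk\n", chunk_for_cells(8)); /* 4160 */
    return 0;
}
```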
Resulting:
Transport Layer Security
TLSv1.2 Record Layer: Application Data Protocol: Application Data
Content Type: Application Data (23)
Version: TLS 1.2 (0x0303)
Length: 4129
Encrypted Application Data: …
KIST currently uses 4160 bytes (CELL_MAX_NETWORK_SIZE * 8) in its decision of how much the outbuf should write, but the comment on that decision reads "only" 4096 bytes!?
#40008 (closed)
Another imaginable optimization could be to move away from fixed-size limits to dynamic record sizes, based on the maximum number of bytes that can be transferred in a single TCP segment. http://eweiibe6tdjsdprb4px6rqrzzcsi22m4koia44kc5pcjr7nec2rlxyad.onion/tpo/core/tor/-/issues/40006#note_2679578
buffer2cells-1076bytes-single-tcp-segment.patch
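As a sketch of how such a single-segment value could be derived (assuming the same 1220-byte MSS as above; the 2-cell, 1076-byte result happens to match the attached patch):

```c
#include <stdio.h>

#define CELL_WIRE_BYTES 514
#define RECORD_PREFIX   17
#define TLS_OVERHEAD    31  /* estimated from the 4096/4065 gap above */
#define TLS_REC_HDR     5   /* standard TLSv1.2 record header */

int main(void)
{
    unsigned mss = 1280 - 40 - 20;  /* 1220: min IPv6 MTU minus headers */

    /* Whole cells whose record (header + prefix + cells) still fits
     * into a single TCP segment. */
    unsigned cells = (mss - TLS_REC_HDR - RECORD_PREFIX) / CELL_WIRE_BYTES;
    unsigned chunk = CELL_WIRE_BYTES * cells + RECORD_PREFIX + TLS_OVERHEAD;

    printf("%u cells per segment -> %u-byte buffer chunk\n", cells, chunk);
    /* 2 cells -> 1076 bytes, in line with the attached patch. */
    return 0;
}
```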
A disadvantage could be slightly higher TLS overhead due to the decreased record size, since the fixed per-record framing is amortized over fewer cell bytes. Current heartbeat logs seem to show around 3-4% TLS overhead on average.
4 KB doesn't seem like a bad choice at first glance. The decision was made because it is believed to be capable of absorbing and transmitting 8 cells in one go. Many implementations use similar values: 4 KB on 32-bit and 8 KB on 64-bit platforms. The commonly used nginx web server also used a 4 KB TLS record size in now-obsolete versions, but defaults to 16 KB (16384 bytes) today.
We not only have mobile clients with high latency and asymmetric line bandwidth, but also long paths across nodes, often spanning continents: a naturally high-RTT environment, usually with highly congested links on top, and head-of-line (HOL) blocking on the multiplexed single TCP connections.
I have attached patches with changed buffers as a PoC for testing. Disclaimer: I am in no way an expert on TLS or TCP/IP buffers. My information comes from my own conclusions drawn from reading source code, Trac ticket decisions, and some related papers such as the KIST paper, as well as from observing private traffic of my own clients and my own server via Wireshark and performing RTT and throughput benchmarks.