Reduce KCP bottlenecks for Snowflake
I did some diving into the KCP code and played around with the parameters because of the throughput issues we're having with #25723. Even without multiplexing I was able to drastically increase the throughput of snowflake. It looks like the bottleneck was preventing us from even using the full capacity of a single snowflake, let alone allowing for improvements by splitting across several.
KCP sends data in segments, and will only send a maximum number of segments before the other side acknowledges them. The default window size is 32 segments. Also by default, there is a 1:1 mapping between segments and calls to Send on the KCP connection.
I tried doubling the window size from 32 segments to 64 and got pretty much a doubling of bandwidth from a single snowflake in my local test environment. We need to set this at both the client and the server for it to have any effect, because the effective window size is the minimum of the sender's send window and the receiver's advertised receive window.
Stream mode removes the 1:1 mapping between calls to Send and segments. A call to Send will append data to the previous segment, up to the maximum segment size, so small writes fill segments instead of each taking up a window slot of its own. I was able to more than double throughput by setting this. This has to be set on the sender's side, so assuming most of our bandwidth needs are download speeds, it should be set at the server.
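A toy model may help show why this matters when the window is counted in segments. The sketch below is not the KCP implementation: the `queue` type, `countSegments` helper, the 1400-byte maximum segment size, and the 100-byte write size are all hypothetical and exist only to compare how many segments the same writes consume with and without stream mode.

```go
package main

import "fmt"

// queue is a toy send queue; each element is one segment.
type queue struct {
	mss      int  // maximum segment size in bytes (assumed value)
	stream   bool // true: append to the previous segment when there is room
	segments [][]byte
}

// Send queues data. Without stream mode every call starts a new
// segment; with stream mode it first tops up the last segment.
func (q *queue) Send(data []byte) {
	if q.stream && len(q.segments) > 0 {
		last := len(q.segments) - 1
		room := q.mss - len(q.segments[last])
		if room > 0 {
			n := room
			if len(data) < n {
				n = len(data)
			}
			q.segments[last] = append(q.segments[last], data[:n]...)
			data = data[n:]
		}
	}
	// Whatever is left goes into new segments of at most mss bytes.
	for len(data) > 0 {
		n := q.mss
		if len(data) < n {
			n = len(data)
		}
		seg := make([]byte, n, q.mss)
		copy(seg, data[:n])
		q.segments = append(q.segments, seg)
		data = data[n:]
	}
}

// countSegments performs `writes` calls to Send of `size` bytes each
// and reports how many segments end up queued.
func countSegments(stream bool, mss, writes, size int) int {
	q := &queue{mss: mss, stream: stream}
	buf := make([]byte, size)
	for i := 0; i < writes; i++ {
		q.Send(buf)
	}
	return len(q.segments)
}

func main() {
	const mss, writes, size = 1400, 100, 100 // illustrative values only
	fmt.Println("segments without stream mode:", countSegments(false, mss, writes, size))
	fmt.Println("segments with stream mode:   ", countSegments(true, mss, writes, size))
}
```

With a fixed window of segments, fewer and fuller segments mean more bytes in flight per round trip, which is consistent with the throughput gain described above.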