
Increase kcp and smux window and buffer sizes

Closed Cecylia Bocovich requested to merge cohosh/snowflake:max_buffers into main

This should increase the maximum amount of in-flight data and hopefully improve the performance of Snowflake, especially for clients geographically distant from proxies and the server.

See https://lists.torproject.org/pipermail/anti-censorship-team/2021-July/000185.html for a justification of this change.
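
For orientation, the knobs involved look roughly like the sketch below, which assumes the xtaci/kcp-go and xtaci/smux APIs that Snowflake's turbotunnel code builds on. The function name and the exact values here are illustrative, not copied from this branch.

    package sketch

    import (
        "net"

        "github.com/xtaci/kcp-go/v5"
        "github.com/xtaci/smux"
    )

    // dialSession sketches opening a KCP/smux session with enlarged window
    // and buffer sizes. The values are placeholders, not the ones in this
    // merge request.
    func dialSession(pconn net.PacketConn, addr net.Addr) (*smux.Session, error) {
        // No encryption and no forward error correction at the KCP layer.
        kcpConn, err := kcp.NewConn2(addr, nil, 0, 0, pconn)
        if err != nil {
            return nil, err
        }
        // SetWindowSize is in units of packets, not bytes (see the discussion below).
        kcpConn.SetWindowSize(1024, 1024)

        smuxConfig := smux.DefaultConfig()
        smuxConfig.Version = 2 // per-stream flow control is a smux protocol v2 feature
        // MaxReceiveBuffer bounds the bytes buffered for the whole session;
        // MaxStreamBuffer bounds the bytes buffered per stream, which in turn
        // limits how much data can be in flight on a single stream.
        smuxConfig.MaxReceiveBuffer = 16 * 1024 * 1024
        smuxConfig.MaxStreamBuffer = 1 * 1024 * 1024
        return smux.Client(kcpConn, smuxConfig)
    }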

Merge request reports

Approved by David Fifield

Closed by Cecylia Bocovich on Aug 10, 2021 (7:59pm UTC)

Merge details

  • The changes were not merged into main.

Activity

  • David Fifield requested review from @dcf

  • It also occurs to me that the kcp.MaxWindowSize has to be at least as big as the MaxStreamBuffer size to notice any improvements; otherwise, that will be the limiting factor on the amount of in-flight data. Right now this is set to 64 KB for both the client and the server.

    I believe UDPSession.SetWindowSize is in units of packets, not bytes. The current values of 65535 should be plenty. Does it still work with the current values? The kcptun defaults are 128 and 512.

    In the project I posted about (an AMP cache tunnel), I'm using SetWindowSize(1024, 1024). Actually, I have measurements testing that change separately from increasing the smux buffers:

    direct  thru cache  condition (cumulative)
     329.0       260.5  baseline
     477.4       416.9  SetWindowSize(32, 32) → SetWindowSize(1024, 1024)
    1388.3      1405.8  MaxStreamBuffer = 64*1024 → 1*1024*1024
                        MaxReceiveBuffer = 4*1024*1024 → 16*1024*1024 (server only)
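
    For a rough sense of scale, here is the bandwidth-delay arithmetic behind the packets-versus-bytes distinction, using assumed numbers (roughly 1400-byte KCP segments, a 300 ms round trip, a 10 Mbit/s target) that are illustrative rather than measured:

        package sketch

        import "fmt"

        // windowCheck is a back-of-the-envelope calculation of why a window of
        // 65535 *packets* is not the bottleneck, while a 64 KB *byte* limit like
        // MaxStreamBuffer is. All inputs are assumptions for illustration.
        func windowCheck() {
            const (
                segmentBytes = 1400.0   // assumed KCP segment size
                rttSeconds   = 0.3      // assumed round-trip time
                targetRate   = 10e6 / 8 // 10 Mbit/s, in bytes per second
            )
            bdp := targetRate * rttSeconds // bytes that must be in flight
            packets := bdp / segmentBytes  // window needed, in packets
            fmt.Printf("need ~%.0f bytes in flight (~%.0f packets)\n", bdp, packets)
            // With these assumptions: ~375000 bytes and ~268 packets. 268 packets
            // is far below a 65535-packet window, but 375000 bytes is well above
            // a 64*1024-byte MaxStreamBuffer, so the byte limit binds first.
        }
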
  • David Fifield added 45m of time spent

  • Ah, I think you're right. Hmm, I wonder why I wasn't getting any noticeable difference in download speed by changing the MaxStreamBuffer then (because, if I understand correctly, changing it at the client should be enough to affect downloads as opposed to uploads). I'll take a closer look at this again.

  • In experimenting with increasing the buffer sizes in dnstt, I found that another parameter that matters for tuning is the turbotunnel queueSize, which is the size of internal buffering for QueuePacketConn, currently 32. If it's too small on the sending side, packets get dropped inside QueuePacketConn and have to be retransmitted later. You can check whether drops are happening with the following patch. For me, there were drops in sending on the server side.

    diff --git a/common/turbotunnel/queuepacketconn.go b/common/turbotunnel/queuepacketconn.go
    index 14a9833..ede3433 100644
    --- a/common/turbotunnel/queuepacketconn.go
    +++ b/common/turbotunnel/queuepacketconn.go
    @@ -1,6 +1,7 @@
     package turbotunnel
     
     import (
    +	"log"
     	"net"
     	"sync"
     	"sync/atomic"
    @@ -57,6 +58,7 @@ func (c *QueuePacketConn) QueueIncoming(p []byte, addr net.Addr) {
     	select {
     	case c.recvQueue <- taggedPacket{buf, addr}:
     	default:
    +		log.Printf("drop incoming %d", len(buf))
     		// Drop the incoming packet if the receive queue is full.
     	}
     }
    @@ -99,6 +101,7 @@ func (c *QueuePacketConn) WriteTo(p []byte, addr net.Addr) (int, error) {
     		return len(buf), nil
     	default:
     		// Drop the outgoing packet if the send queue is full.
    +		log.Printf("drop outgoing %d", len(buf))
     		return len(buf), nil
     	}
     }

    In some quick tests, I started to get faster results in dnstt when I tried

    • queueSize = 2048 (was 32)
    • SetWindowSize(1024, 1024) (was 32)
    • MaxStreamBuffer = 4 * 1024 * 1024 (was 65536)

    It was only a little slower with SetWindowSize(128, 128). I imagine there could be drawbacks to an overly large queueSize or SetWindowSize—when a packet needs to be retransmitted it goes to the end of the queue. For reference, the settings I was using when I first noticed the potential of increasing MaxStreamBuffer in the AMP cache tunnel were:

    • queueSize = 256
    • SetWindowSize(1024, 1024)
    • MaxStreamBuffer = 1 * 1024 * 1024
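
    As a point of reference for the queueSize bullet above: queueSize is a package-level constant that sets the capacity of QueuePacketConn's buffered channels, so raising it is a one-line change. The abbreviated sketch below is illustrative; only QueuePacketConn, recvQueue, taggedPacket, and the old value of 32 come from the code quoted earlier, and the field names and helper are placeholders.

        package turbotunnel

        import "net"

        // queueSize is the capacity of QueuePacketConn's internal buffered
        // channels. When a queue is full, packets are silently dropped and left
        // for KCP to retransmit, which is what the logging patch above makes
        // visible.
        const queueSize = 2048 // previously 32

        // taggedPacket pairs a packet with its peer address (field names here
        // are placeholders; the real struct is initialized positionally above).
        type taggedPacket struct {
            Buf  []byte
            Addr net.Addr
        }

        // QueuePacketConn (abbreviated) buffers packets in channels of capacity
        // queueSize.
        type QueuePacketConn struct {
            recvQueue chan taggedPacket
            // ... other fields omitted ...
        }

        // newRecvQueue shows where queueSize takes effect (hypothetical helper).
        func newRecvQueue() chan taggedPacket {
            return make(chan taggedPacket, queueSize)
        }
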
  • Cecylia Bocovich added 7 commits

    • 1470b25b...e3d376ca - 6 commits from branch tpo/anti-censorship/pluggable-transports:main
    • b5b5334f - Increase smux and QueuePacketConn buffer sizes

  • I spent some time on Friday and yesterday trying out the new Shadow release to see if it would work with Go yet. Turns out there's still some work that needs to be done to support Go binaries, but I'm hopeful that this will be useful for doing more in-depth experiments with Snowflake.

    In the nearer term, I don't see an issue with increasing the queueSize and MaxStreamBuffer as suggested here at the server and trying it out. I just pushed an update to the merge request.

  • David Fifield approved this merge request

  • Okay, let's go ahead with this.

    With a large queueSize, let's watch out for bufferbloat effects. What may happen is that when a connection is interrupted, the buffer fills up with packets that cannot be sent yet, possibly including retransmissions of packets already in the buffer. All these old stale packets will need to be flushed out of the buffer before anything new can be sent. I was thinking of instituting some kind of random early detection drops into turbotunnel.QueuePacketConn to mitigate these kinds of effects.
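
    To make the random early detection idea concrete, here is a sketch of the kind of drop policy that could sit in front of QueuePacketConn's queues. It is only an illustration of the idea, not code from this repository; the helper name, the half-full threshold, and the linear drop curve are all arbitrary choices.

        package turbotunnel

        import "math/rand"

        // redEnqueue is a hypothetical helper illustrating random-early-detection
        // style drops: once the queue is more than half full, drop a packet with
        // a probability that rises linearly to 1 at full occupancy, so a stalled
        // connection sheds stale packets early instead of flushing a long backlog
        // of retransmissions after it recovers.
        func redEnqueue[T any](queue chan T, p T) bool {
            occupancy := float64(len(queue)) / float64(cap(queue))
            if occupancy > 0.5 && rand.Float64() < (occupancy-0.5)*2 {
                return false // early random drop
            }
            select {
            case queue <- p:
                return true
            default:
                return false // queue full: tail drop, as QueuePacketConn does today
            }
        }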

  • I just merged and deployed this. Thanks for pointing out the potential bufferbloat issue. If I understand this correctly, two ways of monitoring this would be:

    • in packet captures for clients that have interrupted connections, we'll see a lot of duplicate retransmissions following the outage and a longer than usual recovery time
    • increased memory usage at the server

    These sound difficult to monitor actively; we might just need to run a few intentional experiments to determine whether this is an issue.

    The random early detection idea is interesting; thanks for the link!

    Edited by Cecylia Bocovich
  • Cecylia Bocovich mentioned in issue #40026

  • mentioned in issue #40086 (closed)

  • mentioned in issue #40179 (closed)
