Snowflake issues
https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues
2021-05-28T16:55:08Z

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40042
Client Transport.Dial fails to release resources on error
2021-05-28T16:55:08Z
David Fifield <dcf@torproject.org>

The client [`lib.Handler`](https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/blob/eff73c3016ec259918e117665833df04f1755e80/client/lib/snowflake.go#L99) function does not close `pconn` and `sess` in the event that `sess.OpenStream()` returns an error. This is a minor resource leak.
The `pconn.Close()` and `sess.Close()` statements should be moved up and placed in a `defer`.
This is issue UCB-02-001 from the 2021 security audit of Turbo Tunnel by Cure53. Quoted below:
> ### UCB-02-001 WP1: Memory leak in Handler() routine of Snowflake client lib (Low)
>
> During a review of the Snowflake client library, the discovery was made that the
*Handler()* function - responsible for establishing a WebRTC connection to the remote
peer - does not correctly close the connection and established smux session in the
eventuality that a stream cannot be opened. This could result in a memory leak on the
Snowflake client side, as well as a resource leak on the server side of the connection.
>
> **Affected file:**
> *snowflake/client/lib/snowflake.go*
>
> **Affected code:**
> ```
> func Handler(socks net.Conn, tongue Tongue) error {
>     [...]
>     // Create a new smux session
>     log.Printf("---- Handler: starting a new session ---")
>     pconn, sess, err := newSession(snowflakes)
>     if err != nil {
>         return err
>     }
>     // On the smux session we overlay a stream.
>     stream, err := sess.OpenStream()
>     if err != nil {
>         return err
>     }
>     [...]
> }
> ```
>
> It is recommended to close all open connections using [*defer*](https://tour.golang.org/flowcontrol/12) in order to properly
alleviate all allocated resources when the function returns.
----
!31 restructured things and this issue now applies to [`Transport.Dial`](https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/blob/ef4d0a1da56e15327173923fa14a28d9ca40789c/client/lib/snowflake.go#L76). If `newSession` fails, then `snowflakes` leaks; and if `sess.OpenStream` fails, then `snowflakes`, `pconn`, and `sess` leak. There's also an error at the [call site](https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/blob/ef4d0a1da56e15327173923fa14a28d9ca40789c/client/snowflake.go#L73): any error is logged, but then the code goes on to call `copyLoop` with the `nil` `Conn` that was returned.

David Fifield <dcf@torproject.org>

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40041
update the torrc file in snowflake/client
2021-06-24T18:44:06Z
toralf

Otherwise, on a system with an already configured and running Tor client, it fails like this:
```
tfoerste@t44 ~/devel/go/src/snowflake/client $ tor -f torrc
Apr 19 18:55:47.450 [notice] Tor 0.4.7.0-alpha-dev (git-e7c407d927c80a94) running on Linux with Libevent 2.1.11-stable, OpenSSL 1.1.1k, Zlib 1.2.11, Liblzma 5.2.5, Libzstd 1.4.9 and Glibc 2.32 as libc.
Apr 19 18:55:47.450 [notice] Tor can't help you if you use it wrong! Learn how to be safe at https://www.torproject.org/download/download#warning
Apr 19 18:55:47.450 [notice] This version is not a stable Tor release. Expect more bugs than usual.
Apr 19 18:55:47.450 [notice] Read configuration file "/home/tfoerste/devel/go/src/snowflake/client/torrc".
Apr 19 18:55:47.451 [warn] Path for DataDirectory (datadir) is relative and will resolve to /home/tfoerste/devel/go/src/snowflake/client/datadir. Is this what you wanted?
Apr 19 18:55:47.453 [notice] Opening Socks listener on 127.0.0.1:9050
Apr 19 18:55:47.453 [warn] Could not bind to 127.0.0.1:9050: Address already in use. Is Tor already running?
Apr 19 18:55:47.453 [warn] Failed to parse/validate config: Failed to bind one of the listener ports.
Apr 19 18:55:47.453 [err] Reading config failed--see warnings above.
```

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40040
Prevent more than one snowflake in the same network
2021-04-12T20:09:45Z
cypherpunks

"[You should not](https://mastodon.social/@torproject/105816673233457564) run more than one snowflake in the same network."
Can that be prevented programmatically? Operators who run a snowflake in a browser don't always know what else is running on the network.
Asked by: https://blog.torproject.org/comment/291286#comment-291286

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40039
probetest is spinning with 100% CPU
2021-10-28T16:35:01Z
David Fifield <dcf@torproject.org>

I just now (2021-04-05 16:22:49) noticed that probetest (#40013) on the broker is using 100% CPU:
```
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1486 root 20 0 765744 239348 9832 S 98.3 5.9 91550:09 probetest
```
Judging by the CPU time of 91550 minutes and 9 seconds, it has been like this for about (91550 * 60 + 9) / 3600. / 24 = 63 days.

meskio <meskio@torproject.org>

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40038
Upgrade bridge to OpenSSL 1.1.1k for CVE-2021-3449
2021-03-26T13:31:03Z
David Fifield <dcf@torproject.org>

https://lists.torproject.org/pipermail/tor-relays/2021-March/019442.html
> There is a new version of OpenSSL out today, with a security advisory
that affects Tor. The vulnerability is CVE-2021-3449, as described on
https://www.openssl.org/news/secadv/20210325.txt . It affects OpenSSL
versions 1.1.1 through 1.1.1j. OpenSSL 1.1.1k is the first version
with a fix.
>
> I haven't tested this bug, but I believe that it would allow an
adversary to remotely crash Tor relays and authorities. It won't have
any effect on Tor clients.
>
> I suggest that everybody should upgrade to the latest OpenSSL when it
becomes available on their platform.

Cecylia Bocovich

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40037
Update version of pion webrtc to fix CVE-2021-28681
2021-04-01T13:21:09Z
Cecylia Bocovich

This was patched in v3.0.15

Cecylia Bocovich

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40035
Snowflake doesn't close OR connections
2021-03-19T02:10:31Z
Cecylia Bocovich

I was staring at the snowflake logs after #40033 happened, and noticed a lot more `io: read/write on closed pipe` style errors than I was expecting. Looking at our proxy loop:
```
// Copy from one stream to another.
func proxy(local *net.TCPConn, conn net.Conn) {
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		if _, err := io.Copy(conn, local); err != nil {
			log.Printf("error copying ORPort to WebSocket %v", err)
		}
		if err := local.CloseRead(); err != nil {
			log.Printf("error closing read after copying ORPort to WebSocket %v", err)
		}
		conn.Close()
		wg.Done()
	}()
	go func() {
		if _, err := io.Copy(local, conn); err != nil {
			log.Printf("error copying WebSocket to ORPort %v", err)
		}
		if err := local.CloseWrite(); err != nil {
			log.Printf("error closing write after copying WebSocket to ORPort %v", err)
		}
		conn.Close()
		wg.Done()
	}()
	wg.Wait()
}
```
If the client closes the connection, the bottom io.Copy will terminate and cause the other one to generate the error. If the OR connection closes: vice versa. However, when the client closes the connection, the bottom loop doesn't terminate because the bottom loop is reading from a KCP stream, not the WebSocket connection. So neither of these loops terminate until the OR connection times out (~20 minutes in a local test).
I'm not actually sure if this is a bug or if it could cause performance problems. It means we're keeping goroutines and open connections around for longer than they need to be, but as far as I can tell we're not running out of memory on the bridge. At the very least it means that `local.CloseRead` and `local.CloseWrite` aren't doing anything here and will always generate errors.

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40034
Upgrade tor on Snowflake bridge for TROVE-2021-001 and TROVE-2021-002 (2021-03-16)
2021-03-16T22:03:05Z
David Fifield <dcf@torproject.org>

[Upcoming releases next week to fix denial-of-service bugs in Tor](https://lists.torproject.org/pipermail/tor-talk/2021-March/045711.html)
> Early next week -- around Tuesday -- we plan to put out new Tor
releases to fix a pair of denial-of-service issues that we have found.
> We are tracking these issues as "High" and "Medium" severity
respectively under our security policy at
https://gitlab.torproject.org/tpo/core/team/-/wikis/NetworkTeam/SecurityPolicy
> * We are tracking these issues as TROVE-2021-001 and TROVE-2021-002
at https://gitlab.torproject.org/tpo/core/team/-/wikis/NetworkTeam/TROVE
> * All currently supported Tor versions are affected.
>
> The impact of these issues is that a remote attacker participating in
the directory protocol can cause a denial of service attack against
Tor instances. Once the new versions are released, we will recommend
that all relays and authorities should upgrade. The impact is worst
for directory authorities: we have already distributed patches to the
authority operators and encouraged them to upgrade.
>
> To the best of our knowledge these vulnerabilities are not being
exploited in the wild.
>
> We'll be releasing more information about these issues after the fixes
are available.

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40033
Fix goroutine leak in snowflake server
2021-06-18T18:10:19Z
Cecylia Bocovich

Got a monit alert this morning that the snowflake bridge is down. I tried resetting my webextension and got a `Could not connect to the bridge.` error.

Cecylia Bocovich

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40032
Document go1.13+ requirement
2021-03-08T16:05:22Z
Jacobo Nájera

Hi,
I am trying to install snowflake standalone, but I have an issue with the module `crypto/ed25519`.
**Steps to reproduce**
```
git clone https://git.torproject.org/pluggable-transports/snowflake.git
cd snowflake
cd proxy
go get
```
**Result**
`build git.torproject.org/pluggable-transports/snowflake.git/proxy: cannot find module for path crypto/ed25519`
**Setup**
- Debian 10
- go version go1.11.6 linux/amd64

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40031
Improve instructions on how to set up snowflake standalone proxy and add to Tor community portal
2022-12-04T16:44:53Z
Lhayam

Snowflake standalone proxy is such an effortless and easy way to contribute that I think it deserves more straightforward and more prominently placed documentation. Unlike the web- and browser extension, it also covers volunteers who are running headless, or are just hesitant to run WebRTC in their browsers.
--- edit ---
Adding a checklist here so we can keep track of the work that should be done:
- [x] Update https://snowflake.torproject.org to point to community documentation
- [ ] The building from source instructions are not clear
- [ ] include instructions for go1.11
- [x] Update the README for `/proxy` in the git repository to include build instructions and point to community pages for running the proxy
- [ ] Some info for how to keep the proxy up to date (this actually should be discussed and completed for #32677)
- [ ] Some instructions on checking Snowflake logs
- [ ] How to check that it is working
- [ ] How to see how much bandwidth it is using
--- end edit ---
Couple of points/ideas:
The existing instructions on how to set up the GOPATH will leave many users with a more or less broken system, since the majority of distros don't use `.bash_profile`, but `.profile`. After creating `.bash_profile`, `.profile` will [not be read any more](https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html#Bash-Startup-Files), so all that is configured after copying and pasting the commands is the GOPATH, which is bad.
On every semi-updated installation, setting the GOPATH is not even necessary: beginning with the release of go-1.8 (2017), it defaults to $HOME/go when no other path is specified. Since a lot of users will run go for the first time when setting up snowflake proxy, checking whether $GOPATH exists and, if it doesn't, exporting the GOPATH _temporarily_ for the shell might still be a good idea (?).
Use variables, like `mkdir -p "$GOPATH/src"`, to make the guide more universal.
How about offering a simple setup.sh (maybe that's another ticket though)?
Everybody who has some understanding of their system will know how to auto-start snowflake at boot; there's no reason to alienate less technically inclined users with torproject's specific runit configuration.
Tell the user where to run the command to start the proxy. At least on go versions < 1.16 (Q1 2021; Debian stable is on 1.11 for reference), neither `$GOPATH/bin` nor `$GOPATH/src/snowflake/proxy/proxy` is in PATH by default, so just executing `nohup ./proxy &` will blatantly fail.
Tell the user how to save logs, since lack of feedback whether snowflake is doing as intended or not might discourage users.
Provide instructions on how to update. Personally I have a start-up script running which checks for updates every time snowflake is started, but even a manual approach will do.

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40029
Android CI is broken
2021-03-25T20:15:58Z
Cecylia Bocovich

This is mostly a reminder to myself to fix the android CI when a new version of Go comes out.
It looks like [there's a fix merged](https://github.com/golang/go/issues/42655) but it won't be available until Go 1.15.7.

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40028
Make *.freehaven.net domains be CNAMEs for *.torproject.net, not *.bamsoftware.com
2022-10-26T17:36:46Z
David Fifield <dcf@torproject.org>

In #31250 we changed snowflake.bamsoftware.com and snowflake-broker.bamsoftware.com to snowflake.freehaven.net and snowflake-broker.freehaven.net in the browser extension, in order to avoid malware warnings from bamsoftware.com. But the freehaven.net domains are CNAMEs for the corresponding bamsoftware.com domains, which apparently still triggers some malware detection systems:
* [Is bamsoftware.com related to the TOR project in any official way?](https://www.reddit.com/r/TOR/comments/kj7git/is_bamsoftwarecom_related_to_the_tor_project_in/)
> I recently installed [Snowflake on Firefox](https://addons.mozilla.org/en-US/firefox/addon/torproject-snowflake/). Whenever I turn it on, Malwarebytes blocks snowflake.bamsoftware.com and snowflake-broker.bamsoftware.com because of a Trojan.
* @arma reports that someone he knows has also seen antivirus warnings when running the browser extension.
We diagnosed the problem in the [2021-01-07 anti-censorship team meeting](http://meetbot.debian.net/tor-meeting/2021/tor-meeting.2021-01-07-15.58.log.html#l-15):
> 16:09:15 <phw> malwarebytes may be looking at dns reqs without considering the semantics of a cname, in which case it always sees the bamsoftware domain<br/>
> 16:10:49 <phw> i just tested with wireshark: i see bamsoftware.com in my dns responses when i turn snowflake on
The solution we arrived at is to make the freehaven.net domains be CNAMEs for the corresponding torproject.net domains, which are plain A records and do not refer to bamsoftware.com. To be specific, we need to change this:
```
snowflake-broker IN CNAME snowflake-broker.bamsoftware.com.
snowflake IN CNAME snowflake.bamsoftware.com.
```
to this:
```
snowflake-broker IN CNAME snowflake-broker.torproject.net.
snowflake IN CNAME snowflake.torproject.net.
```

Roger Dingledine

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40027
Update webrtc version to v3.0.0
2021-01-12T15:39:19Z
Cecylia Bocovich

A new major version of pion/webrtc came out.
Here are the release notes: https://github.com/pion/webrtc/wiki/Release-WebRTC@v3.0.0
We should try an update and see how it affects throughput.

Cecylia Bocovich

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40026
Tune reliable protocol parameters for Snowflake
2024-02-27T18:43:45Z
Cecylia Bocovich

I did some diving into the KCP code and played around with the parameters because of the throughput issues we're having with #25723. Even without multiplexing I was able to drastically increase the throughput of snowflake. It looks like the bottleneck was preventing us from even using the full capacity of a single snowflake, let alone allowing for improvements by splitting across several.
KCP sends data in segments, and will only send a maximum number of segments before the other side acknowledges them. The default window size is 32 segments. Also by default, there is a 1:1 mapping between segments and a call to `Send` on the KCP connection.
#### SetWindowSize
I tried doubling the window size from 32 segments to 64 and got pretty much a doubling of bandwidth from a single snowflake in my local test environment. We need to set this at both the client and server for it to have any effect, because the effective window size is the minimum of the sender's send window and the receiver's advertised receive window.
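
The reason both sides need the change can be written out as a one-line calculation. This is a sketch of the arithmetic only, not of the KCP library; `effectiveWindow` is a hypothetical helper name introduced for illustration.

```go
package main

import "fmt"

// effectiveWindow returns the number of segments that can be in flight:
// the smaller of the sender's send window and the window advertised by
// the receiver. (Hypothetical helper, for illustration only.)
func effectiveWindow(sendWnd, advertisedRecvWnd int) int {
	if sendWnd < advertisedRecvWnd {
		return sendWnd
	}
	return advertisedRecvWnd
}

func main() {
	fmt.Println(effectiveWindow(64, 32)) // only the sender raised its window
	fmt.Println(effectiveWindow(32, 64)) // only the receiver raised its window
	fmt.Println(effectiveWindow(64, 64)) // both sides raised their windows
}
```

Raising the window on one side alone leaves the effective window at the old value of 32; only the last case reaches 64.
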
#### SetStreamMode
Stream mode causes the sender to fragment segments. A call to `Send` will append data to the previous segment, up to the maximum segment size. I was able to more than double throughput by setting this. This has to be set on the sender's side, so assuming most of our bandwidth needs are download speeds it should be set at the server.

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40025
Fetch multiple snowflakes from the broker at once
2022-11-07T20:34:35Z
Cecylia Bocovich

To reduce the number of connections we make to the broker and speed up the collection of snowflakes so that we can split traffic across all of them for #25723, we should be able to request multiple snowflakes from the broker at once.

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40024
Expose broker metrics for Prometheus?
2022-07-28T11:15:45Z
Philipp Winter <phw@torproject.org>

We are [already exposing metrics](https://snowflake-broker.torproject.net/metrics) but we don't have convenient tooling that turns these metrics into charts that are easy to explore and update automatically.
We could solve this problem by exposing these metrics in a format that our Prometheus instance can scrape. I recently did that for bridgestrap, over at tpo/anti-censorship/bridgestrap#4. It's not a lot of work but the downside is pulling in yet another semi-complex dependency.
If we were to implement this, here's how it would work:
* The broker serves a new page, e.g. /prometheus-metrics
* We can use Prometheus's Go client library to deal with metrics.
* Throughout the code, we can update metrics like this:
```go
metrics.NumUnrestrictedNatProxies.Inc()
```
Does the above sound sensible? If so, I can implement a prototype.

Cecylia Bocovich

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40022
Redefine restricted NAT designation to include only symmetric NATs
2020-11-20T16:09:53Z
Cecylia Bocovich

See https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/wikis/NAT-matching for an overview of NAT types and compatibility.
Right now we classify a client NAT as "restricted" if it is symmetrically NAT'd (i.e., has an (AO,X) or (AP,X) NAT mapping and filtering behaviour) and also if it has an aggressive filter (AI, AP).
As mentioned as a footnote on the NAT matching wiki page, most (AI,AP) clients will actually work with most (> 80% of) restricted proxies. It's also the case, judging by recent snowflake CollecTor stats, that the majority of clients are (AI,AP). Right now these clients are drawing from the **very small** unrestricted proxies bucket and depleting this resource. I'd like to reclassify these clients as unrestricted so they can pull from the restricted proxy pool. Getting a failed proxy ~20% of the time shouldn't be an issue, especially with the future plans to multiplex in #25723.
This also means that "restricted" is defined differently for clients vs. proxies. (AI,AP) is a restricted proxy, but an unrestricted client.
To do this, I'd like to make the following changes:
- [ ] Have standalone proxies do the probe test instead of using the RFC 5780 method to determine NAT behaviour
- [ ] Redefine (AI,AP) to be an unrestricted NAT type

Cecylia Bocovich

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40021
Break from infinite client-broker poll after call to snowflakes.End()
2020-11-23T17:12:51Z
Cecylia Bocovich

This bug was originally brought to my attention in https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40018#note_2715140 but it's a more general issue than just for snowflake on mobile devices.
The following for loop in [client/lib/webrtc.go](https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/blob/61beb9d996527cd8cb9e4ca650f8cbf24df1503e/client/lib/webrtc.go#L224) will poll infinitely until it receives an available snowflake:
```
// exchangeSDP sends the local SDP offer to the Broker, awaits the SDP answer,
// and returns the answer.
func exchangeSDP(broker *BrokerChannel, offer *webrtc.SessionDescription) *webrtc.SessionDescription {
	// Keep trying the same offer until a valid answer arrives.
	for {
		// Send offer to broker (blocks).
		answer, err := broker.Negotiate(offer)
		if err == nil {
			return answer
		}
		log.Printf("BrokerChannel Error: %s", err)
		log.Printf("Failed to retrieve answer. Retrying in %v", ReconnectTimeout)
		<-time.After(ReconnectTimeout)
	}
}
```
Because of ongoing work on matching up clients with compatible proxies in snowflake#40013, we currently have almost no "unrestricted" proxies available, meaning snowflake clients behind restricted NATs have to poll for a very long time before getting a snowflake. Normally the potentially infinite polling isn't a problem as long as there are snowflakes available, but if the client makes a call to `snowflakes.End()` we *should* stop polling regardless of whether or not snowflakes are available.

https://gitlab.torproject.org/tpo/anti-censorship/pluggable-transports/snowflake/-/issues/40020
Snowflake server died and had to be restarted
2021-05-20T19:46:07Z
Cecylia Bocovich

I checked the logs, and the only clue I have to this mystery is from /var/log/syslog:
```
Nov 8 00:53:15 snowflake kernel: [19782731.564416] tor invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Nov 8 00:53:15 snowflake kernel: [19782731.564494] CPU: 0 PID: 10912 Comm: tor Not tainted 5.4.20 #8
Nov 8 00:53:15 snowflake kernel: [19782731.564536] Call Trace:
Nov 8 00:53:15 snowflake kernel: [19782731.564571] dump_stack+0x66/0x8b
Nov 8 00:53:15 snowflake kernel: [19782731.564603] dump_header+0x42/0x200
Nov 8 00:53:15 snowflake kernel: [19782731.564635] oom_kill_process+0xe9/0x110
Nov 8 00:53:15 snowflake kernel: [19782731.564668] out_of_memory+0xfa/0x4a0
Nov 8 00:53:15 snowflake kernel: [19782731.564701] __alloc_pages_slowpath+0x91c/0xca0
Nov 8 00:53:15 snowflake kernel: [19782731.564740] __alloc_pages_nodemask+0x249/0x280
Nov 8 00:53:15 snowflake kernel: [19782731.564777] pagecache_get_page+0xbb/0x220
Nov 8 00:53:15 snowflake kernel: [19782731.564810] filemap_fault+0x56e/0x860
Nov 8 00:53:15 snowflake kernel: [19782731.564842] ? page_add_file_rmap+0x135/0x170
Nov 8 00:53:15 snowflake kernel: [19782731.564892] ? alloc_set_pte+0x17c/0x5e0
Nov 8 00:53:15 snowflake kernel: [19782731.564929] ? xas_load+0x9/0x80
Nov 8 00:53:15 snowflake kernel: [19782731.564974] ? xas_find+0x192/0x1d0
Nov 8 00:53:15 snowflake kernel: [19782731.565005] ? filemap_map_pages+0x16f/0x360
Nov 8 00:53:15 snowflake kernel: [19782731.565062] __xfs_filemap_fault.constprop.24+0x37/0xc0 [xfs]
Nov 8 00:53:15 snowflake kernel: [19782731.565104] __do_fault+0x4a/0x8e
Nov 8 00:53:15 snowflake kernel: [19782731.565135] __handle_mm_fault+0xbc2/0x1110
Nov 8 00:53:15 snowflake kernel: [19782731.565169] handle_mm_fault+0xe7/0x1d0
Nov 8 00:53:15 snowflake kernel: [19782731.566305] __do_page_fault+0x1f5/0x4c0
Nov 8 00:53:15 snowflake kernel: [19782731.567444] page_fault+0x34/0x40
Nov 8 00:53:15 snowflake kernel: [19782731.568571] RIP: 0033:0x55b72cdd7bd0
Nov 8 00:53:15 snowflake kernel: [19782731.569707] Code: Bad RIP value.
Nov 8 00:53:15 snowflake kernel: [19782731.570796] RSP: 002b:00007ffc9a8864e8 EFLAGS: 00010246
Nov 8 00:53:15 snowflake kernel: [19782731.571875] RAX: 0000000000000004 RBX: 000055b7343027c0 RCX: 0000000000000000
Nov 8 00:53:15 snowflake kernel: [19782731.573130] RDX: 0000000000000001 RSI: 000000005fa7417a RDI: 000055b7343027c0
Nov 8 00:53:15 snowflake kernel: [19782731.574265] RBP: 0000000000000000 R08: 000000005fa7417a R09: 000000005fa805c8
Nov 8 00:53:15 snowflake kernel: [19782731.575437] R10: 0000000000000006 R11: 0000000000000000 R12: 0000000000000218
Nov 8 00:53:15 snowflake kernel: [19782731.576585] R13: 000000005fa7417a R14: 0000000000000000 R15: 000055b7343027c0
Nov 8 00:53:15 snowflake kernel: [19782731.577918] Mem-Info:
Nov 8 00:53:15 snowflake kernel: [19782731.579136] active_anon:366133 inactive_anon:16451 isolated_anon:0
Nov 8 00:53:15 snowflake kernel: [19782731.579136] active_file:16 inactive_file:37 isolated_file:0
Nov 8 00:53:15 snowflake kernel: [19782731.579136] unevictable:0 dirty:0 writeback:0 unstable:0
Nov 8 00:53:15 snowflake kernel: [19782731.579136] slab_reclaimable:1604 slab_unreclaimable:10856
Nov 8 00:53:15 snowflake kernel: [19782731.579136] mapped:485 shmem:18773 pagetables:1112 bounce:0
Nov 8 00:53:15 snowflake kernel: [19782731.579136] free:13180 free_pcp:534 free_cma:0
```
This machine should only run proxies and the bridge; it's possible there's a memory leak somewhere.