Upgrade memory capacity of snowflake bridge
The snowflake server went offline again yesterday. The system log shows that the OOM killer was responsible again:
Jun 12 11:12:42 snowflake kernel: [6779302.156054] proxy-go invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Jun 12 11:12:42 snowflake kernel: [6779302.332733] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/system-tor.slice/tor@default.service,task=snowflake-serve,pid=7007,uid=106
Jun 12 11:12:42 snowflake kernel: [6779302.338121] Out of memory: Killed process 7007 (snowflake-serve) total-vm:1279524kB, anon-rss:249376kB, file-rss:0kB, shmem-rss:0kB, UID:106 pgtables:652kB oom_score_adj:0
Jun 12 11:12:42 snowflake kernel: [6779302.350275] oom_reaper: reaped process 7007 (snowflake-serve), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Here is the memory in use by the server and proxies when the OOM killer was triggered:
Jun 12 11:12:42 snowflake kernel: [6779302.268957] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Jun 12 11:12:42 snowflake kernel: [6779302.315899] [ 10513] 108 10513 621 21 45056 0 0 timeout
Jun 12 11:12:42 snowflake kernel: [6779302.317993] [ 10515] 108 10515 175983 3057 196608 0 0 proxy-go
Jun 12 11:12:42 snowflake kernel: [6779302.319553] [ 25211] 108 25211 621 21 45056 0 0 timeout
Jun 12 11:12:42 snowflake kernel: [6779302.321025] [ 25212] 108 25212 157550 3036 176128 0 0 proxy-go
Jun 12 11:12:42 snowflake kernel: [6779302.322516] [ 7006] 106 7006 131267 57233 946176 0 0 tor
Jun 12 11:12:42 snowflake kernel: [6779302.324228] [ 7007] 106 7007 319881 62344 667648 0 0 snowflake-serve
Jun 12 11:12:42 snowflake kernel: [6779302.327736] [ 8724] 108 8724 621 23 49152 0 0 timeout
Jun 12 11:12:42 snowflake kernel: [6779302.329365] [ 8725] 108 8725 175983 2115 180224 0 0 proxy-go
It doesn't look to me like the snowflake server is using a huge amount of memory here: its anon-rss when it was killed was 249376 kB, or about 244 MiB. @dcf, can we easily increase the memory for this machine? I might also bring down one of the proxy-go instances just to see if that gives us some more room.
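For anyone reading the task dump: the rss column is in pages, not kB. Below is a rough sketch that totals resident memory per process name from the rows quoted above. It assumes 4 KiB pages, which is consistent with the log (pid 7007 shows rss=62344 pages, and 62344 * 4 = 249376, the anon-rss in the "Killed process" line). The names `PAGE_KB`, `TASK_DUMP`, `ROW_RE` and `rss_kb_by_name` are made up for this illustration, and it only covers the rows pasted in this issue, not the full dump.

```python
import re
from collections import defaultdict

# Assumption: 4 KiB pages. This matches the log: pid 7007 has rss=62344 pages,
# and 62344 * 4 = 249376, the anon-rss (kB) in the "Killed process" line.
PAGE_KB = 4

# Rows copied from the task dump in this issue, minus the syslog prefix.
TASK_DUMP = """\
[ 10513] 108 10513 621 21 45056 0 0 timeout
[ 10515] 108 10515 175983 3057 196608 0 0 proxy-go
[ 25211] 108 25211 621 21 45056 0 0 timeout
[ 25212] 108 25212 157550 3036 176128 0 0 proxy-go
[ 7006] 106 7006 131267 57233 946176 0 0 tor
[ 7007] 106 7007 319881 62344 667648 0 0 snowflake-serve
[ 8724] 108 8724 621 23 49152 0 0 timeout
[ 8725] 108 8725 175983 2115 180224 0 0 proxy-go
"""

# Columns: [pid] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
# Captures: pid, rss (in pages), name.
ROW_RE = re.compile(
    r"\[\s*(\d+)\]\s+\d+\s+\d+\s+\d+\s+(\d+)\s+\d+\s+\d+\s+-?\d+\s+(\S+)"
)


def rss_kb_by_name(dump):
    """Sum resident memory in kB per process name."""
    totals = defaultdict(int)
    for line in dump.splitlines():
        m = ROW_RE.search(line)
        if m:
            totals[m.group(3)] += int(m.group(2)) * PAGE_KB
    return dict(totals)


totals = rss_kb_by_name(TASK_DUMP)
# Largest consumers first.
for name, kb in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{name:16s} {kb:8d} kB  ({kb / 1024:.1f} MiB)")
print(f"{'total':16s} {sum(totals.values()):8d} kB")
```

On these rows, tor and snowflake-serve come out at roughly the same size (about 224 MiB and 244 MiB resident), while the three proxy-go instances together are around 32 MiB.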
Edited by David Fifield