Frequent OOM kills of tor process
We are currently expierencing frequent Out of memory kills of or tor processes on two boxes running ubuntu 14.04. ``` root@tor1:~# lsb_release -a No LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 14.04.5 LTS Release: 14.04 Codename: trusty ``` We experience this on following versions: 0.3.0.5-rc-1~trusty+1 and 0.2.9.10-1~trusty+1 I changed to the rc after the normal 2.9.10-1 Version keept having this problems. The 0.3.0.5-rc has the same problems. From syslog ``` May 14 10:21:42 tor1 kernel: [123517.565688] UDP: bad checksum. From 82.247.214.125:51413 to 176.10.104.240:21446 ulen 38 May 14 11:02:47 tor1 kernel: [125982.993268] tor invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0 May 14 11:02:47 tor1 kernel: [125982.993270] tor cpuset=/ mems_allowed=0 May 14 11:02:47 tor1 kernel: [125982.993272] CPU: 2 PID: 33180 Comm: tor Not tainted 3.13.0-117-generic !#164-Ubuntu May 14 11:02:47 tor1 kernel: [125982.993273] Hardware name: HP ProLiant ML310e Gen8 v2, BIOS P78 03/28/2014 May 14 11:02:47 tor1 kernel: [125982.993274]  0000000000000000 ffff8800a8679990 ffffffff8172d1c9 ffff88013c868000 May 14 11:02:47 tor1 kernel: [125982.993276]  0000000000000000 ffff8800a8679a18 ffffffff81727768 ffffffff8106a926 May 14 11:02:47 tor1 kernel: [125982.993278]  ffff8800a86799f0 ffffffff810cb27c 0000000001320122 0000000000000000 May 14 11:02:47 tor1 kernel: [125982.993279] Call Trace: May 14 11:02:47 tor1 kernel: [125982.993285]  [<ffffffff8172d1c9>] dump_stack+0x64/0x82 May 14 11:02:47 tor1 kernel: [125982.993286]  [<ffffffff81727768>] dump_header+0x7f/0x1f1 May 14 11:02:47 tor1 kernel: [125982.993289]  [<ffffffff8106a926>] ? put_online_cpus+0x56/0x80 May 14 11:02:47 tor1 kernel: [125982.993291]  [<ffffffff810cb27c>] ? rcu_oom_notify+0xcc/0xf0 May 14 11:02:47 tor1 kernel: [125982.993294]  [<ffffffff81156d81>] oom_kill_process+0x201/0x360 May 14 11:02:47 tor1 kernel: [125982.993297]  [<ffffffff812dde35>] ? security_capable_noaudit+0x15/0x20 May 14 11:02:47 tor1 kernel: [125982.993299]  [<ffffffff81157511>] out_of_memory+0x471/0x4b0 May 14 11:02:47 tor1 kernel: [125982.993301]  [<ffffffff8115d82c>] !__alloc_pages_nodemask+0xa6c/0xb90 May 14 11:02:47 tor1 kernel: [125982.993304]  [<ffffffff8119e25a>] alloc_pages_vma+0x9a/0x140 May 14 11:02:47 tor1 kernel: [125982.993307]  [<ffffffff811908eb>] read_swap_cache_async+0xeb/0x160 May 14 11:02:47 tor1 kernel: [125982.993308]  [<ffffffff811909f8>] swapin_readahead+0x98/0xe0 May 14 11:02:47 tor1 kernel: [125982.993320]  [<ffffffff8117e90b>] handle_mm_fault+0xa7b/0xf10 May 14 11:02:47 tor1 kernel: [125982.993322]  [<ffffffff81739344>] !__do_page_fault+0x184/0x560 May 14 11:02:47 tor1 kernel: [125982.993325]  [<ffffffff810a3af5>] ? set_next_entity+0x95/0xb0 May 14 11:02:47 tor1 kernel: [125982.993328]  [<ffffffff810135db>] ? !__switch_to+0x16b/0x4f0 May 14 11:02:47 tor1 kernel: [125982.993329]  [<ffffffff8173973a>] do_page_fault+0x1a/0x70 May 14 11:02:47 tor1 kernel: [125982.993331]  [<ffffffff81735a68>] page_fault+0x28/0x30 May 14 11:02:47 tor1 kernel: [125982.993350] Mem-Info: May 14 11:02:47 tor1 kernel: [125982.993351] Node 0 DMA per-cpu: May 14 11:02:47 tor1 kernel: [125982.993352] CPU    0: hi:    0, btch:   1 usd:   0 May 14 11:02:47 tor1 kernel: [125982.993353] CPU    1: hi:    0, btch:   1 usd:   0 May 14 11:02:47 tor1 kernel: [125982.993353] CPU    2: hi:    0, btch:   1 usd:   0 May 14 11:02:47 tor1 kernel: [125982.993354] CPU    3: hi:    0, btch:   1 usd:   0 May 14 11:02:47 tor1 kernel: [125982.993355] Node 0 DMA32 per-cpu: May 14 11:02:47 tor1 kernel: [125982.993356] CPU    0: hi:  186, btch:  31 usd:  25 May 14 11:02:47 tor1 kernel: [125982.993356] CPU    1: hi:  186, btch:  31 usd:   2 May 14 11:02:47 tor1 kernel: [125982.993357] CPU    2: hi:  186, btch:  31 usd:   0 May 14 11:02:47 tor1 kernel: [125982.993358] CPU    3: hi:  186, btch:  31 usd:  26 May 14 11:02:47 tor1 kernel: [125982.993358] Node 0 Normal per-cpu: May 14 11:02:47 tor1 kernel: [125982.993359] CPU    0: hi:  186, btch:  31 usd:   0 May 14 11:02:47 tor1 kernel: [125982.993360] CPU    1: hi:  186, btch:  31 usd:   0 May 14 11:02:47 tor1 kernel: [125982.993360] CPU    2: hi:  186, btch:  31 usd:   0 May 14 11:02:47 tor1 kernel: [125982.993361] CPU    3: hi:  186, btch:  31 usd:   0 May 14 11:02:47 tor1 kernel: [125982.993363] active_!anon:0 inactive_!anon:0 isolated_!anon:0 May 14 11:02:47 tor1 kernel: [125982.993363]  active_!file:141 inactive_!file:204 isolated_!file:0 May 14 11:02:47 tor1 kernel: [125982.993363]  !unevictable:0 !dirty:0 !writeback:0 !unstable:0 May 14 11:02:47 tor1 kernel: [125982.993363]  !free:20517 slab_!reclaimable:7176 slab_!unreclaimable:41948 May 14 11:02:47 tor1 kernel: [125982.993363]  !mapped:80 !shmem:0 !pagetables:692 !bounce:0 May 14 11:02:47 tor1 kernel: [125982.993363]  free_!cma:0 May 14 11:02:47 tor1 kernel: [125982.993365] Node 0 DMA !free:15344kB !min:268kB !low:332kB !high:400kB active_!anon:0kB inactive_!anon:0kB active_!file:4kB inactive_!file:28kB !unevictable:0kB isolated(anon):0kB isolated(file):0kB !present:15976kB !managed:15892kB !mlocked:0kB !dirty:0kB !writeback:8kB !mapped:36kB !shmem:0kB slab_!reclaimable:0kB slab_!unreclaimable:156kB kernel_!stack:0kB !pagetables:4kB !unstable:0kB !bounce:0kB free_!cma:0kB writeback_!tmp:0kB pages_!scanned:18 all_unreclaimable? no May 14 11:02:47 tor1 kernel: [125982.993368] lowmem_reserve[]: 0 2688 3771 3771 May 14 11:02:47 tor1 kernel: [125982.993370] Node 0 DMA32 !free:51080kB !min:46532kB !low:58164kB !high:69796kB active_!anon:0kB inactive_!anon:0kB active_!file:516kB inactive_!file:788kB !unevictable:0kB isolated(anon):0kB isolated(file):0kB !present:2832272kB !managed:2753204kB !mlocked:0kB !dirty:0kB !writeback:0kB !mapped:284kB !shmem:0kB slab_!reclaimable:15612kB slab_!unreclaimable:83432kB kernel_!stack:312kB !pagetables:2064kB !unstable:0kB !bounce:0kB free_!cma:0kB writeback_!tmp:0kB pages_!scanned:17 all_unreclaimable? no May 14 11:02:47 tor1 kernel: [125982.993372] lowmem_reserve[]: 0 0 1082 1082 May 14 11:02:47 tor1 kernel: [125982.993373] Node 0 Normal !free:15644kB !min:18732kB !low:23412kB !high:28096kB active_!anon:0kB inactive_!anon:0kB active_!file:44kB inactive_!file:0kB !unevictable:0kB isolated(anon):0kB isolated(file):0kB !present:1179644kB !managed:1108484kB !mlocked:0kB !dirty:0kB !writeback:0kB !mapped:0kB !shmem:0kB slab_!reclaimable:13092kB slab_!unreclaimable:84204kB kernel_!stack:2072kB !pagetables:700kB !unstable:0kB !bounce:0kB free_!cma:0kB writeback_!tmp:0kB pages_!scanned:203 all_unreclaimable? yes May 14 11:02:47 tor1 kernel: [125982.993376] lowmem_reserve[]: 0 0 0 0 May 14 11:02:47 tor1 kernel: [125982.993377] Node 0 DMA: 3*4kB (M) 3*8kB (UM) 2*16kB (UM) 0*32kB 1*64kB (U) 1*128kB (M) 1*256kB (M) 1*512kB (M) 2*1024kB (UM) 2*2048kB (MR) 2*4096kB (M) = 15364kB May 14 11:02:47 tor1 kernel: [125982.993384] Node 0 DMA32: 303*4kB (UE) 662*8kB (UEM) 418*16kB (UEM) 952*32kB (UEM) 51*64kB (UM) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB (R) = 51148kB May 14 11:02:47 tor1 kernel: [125982.993394] Node 0 Normal: 375*4kB (UEM) 330*8kB (UEM) 649*16kB (UEM) 3*32kB (M) 2*64kB (R) 1*128kB (R) 1*256kB (R) 1*512kB (R) 0*1024kB 0*2048kB 0*4096kB = 15644kB May 14 11:02:47 tor1 kernel: [125982.993401] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB May 14 11:02:47 tor1 kernel: [125982.993401] 385 total pagecache pages May 14 11:02:47 tor1 kernel: [125982.993402] 20 pages in swap cache May 14 11:02:47 tor1 kernel: [125982.993403] Swap cache stats: add 2432647, delete 2432627, find 700250/1142133 May 14 11:02:47 tor1 kernel: [125982.993404] Free swap  = 3627940kB May 14 11:02:47 tor1 kernel: [125982.993404] Total swap = 4026364kB May 14 11:02:47 tor1 kernel: [125982.993405] 1006973 pages RAM May 14 11:02:47 tor1 kernel: [125982.993405] 0 pages HighMem/MovableOnly May 14 11:02:47 tor1 kernel: [125982.993406] 17790 pages reserved May 14 11:02:47 tor1 kernel: [125982.993406] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name May 14 11:02:47 tor1 kernel: [125982.993422] ![32898]     0 32898     5073        0      13      251             0 upstart-udev-br May 14 11:02:47 tor1 kernel: [125982.993423] ![32903]     0 32903    12900        1      27      228         -1000 systemd-udevd May 14 11:02:47 tor1 kernel: [125982.993425] ![33049]     0 33049     3821        0      11       48             0 upstart-file-br May 14 11:02:47 tor1 kernel: [125982.993426] ![33092]   102 33092     9813        1      23       95             0 dbus-daemon May 14 11:02:47 tor1 kernel: [125982.993436] ![33096]   101 33096    63962        0      28      722             0 rsyslogd May 14 11:02:47 tor1 kernel: [125982.993437] ![33119]     0 33119    10864        1      26       88             0 systemd-logind May 14 11:02:47 tor1 kernel: [125982.993439] ![33206]     0 33206     3817        0      12       55             0 upstart-socket- May 14 11:02:47 tor1 kernel: [125982.993440] ![33053]     0 33053     3699        1      12       41             0 getty May 14 11:02:47 tor1 kernel: [125982.993441] ![33056]     0 33056     3699        1      12       41             0 getty May 14 11:02:47 tor1 kernel: [125982.993442] ![33061]     0 33061     3699        1      12       39             0 getty May 14 11:02:47 tor1 kernel: [125982.993443] ![33062]     0 33062     3699        1      11       40             0 getty May 14 11:02:47 tor1 kernel: [125982.993444] ![33064]     0 33064     3699        1      12       39             0 getty May 14 11:02:47 tor1 kernel: [125982.993446] ![33079]     0 33079     1094        0       8       35             0 acpid May 14 11:02:47 tor1 kernel: [125982.993447] ![33089]     0 33089    15347        1      33      171         -1000 sshd May 14 11:02:47 tor1 kernel: [125982.993448] ![33094]     0 33094     5915        0      17       81             0 cron May 14 11:02:47 tor1 kernel: [125982.993449] ![33095]     0 33095     4786        0      12       48             0 atd May 14 11:02:47 tor1 kernel: [125982.993450] ![33155]     0 33155     4801       34      13       73             0 irqbalance May 14 11:02:47 tor1 kernel: [125982.993457] ![33180]   106 33180   111003        0     152    50520             0 tor May 14 11:02:47 tor1 kernel: [125982.993458] ![33317]   108 33317    20352        0      41     9936             0 unbound May 14 11:02:47 tor1 kernel: [125982.993459] ![33369]   105 33369     1871       12       9       33             0 vnstatd May 14 11:02:47 tor1 kernel: [125982.993460] ![32890]     0 32890     3699        1      11       40             0 getty May 14 11:02:47 tor1 kernel: [125982.993461] ![32991]  1002 32991     2867        0      10       84             0 memlog.sh May 14 11:02:47 tor1 kernel: [125982.993462] ![33057]   106 33057    97196        0     128    36538             0 tor May 14 11:02:47 tor1 kernel: [125982.993464] ![57323]  1002 57323     2867        0      10       91             0 memlog.sh May 14 11:02:47 tor1 kernel: [125982.993465] ![57325]  1002 57325     2218       78      10       32             0 grep May 14 11:02:47 tor1 kernel: [125982.993466] ![57326]  1002 57326     2746       44      11       46             0 sed May 14 11:02:47 tor1 kernel: [125982.993467] Out of memory: Kill process 33180 (tor) score 25 or sacrifice child May 14 11:02:47 tor1 kernel: [125982.993478] Killed process 33180 (tor) !total-vm:444012kB, !anon-rss:0kB, !file-rss:0kB ``` notices.log does not show anything around the kill time. First record ist the restart of the process at 20:32 ``` May 14 06:54:03.000 [notice] Tor 0.3.0.5-rc (git-61f68662421142d2) opening new log file. May 14 07:13:01.000 [warn] eventdns: All nameservers have failed May 14 07:13:01.000 [notice] eventdns: Nameserver 127.0.0.1:53 is back up May 14 07:47:36.000 [warn] eventdns: All nameservers have failed May 14 07:47:36.000 [notice] eventdns: Nameserver 127.0.0.1:53 is back up May 14 08:10:38.000 [warn] eventdns: All nameservers have failed May 14 08:10:38.000 [notice] eventdns: Nameserver 127.0.0.1:53 is back up May 14 08:12:28.000 [warn] eventdns: All nameservers have failed May 14 08:12:28.000 [notice] eventdns: Nameserver 127.0.0.1:53 is back up May 14 08:20:08.000 [warn] eventdns: All nameservers have failed May 14 08:20:09.000 [notice] eventdns: Nameserver 127.0.0.1:53 is back up May 14 20:23:18.000 [notice] Tor 0.3.0.5-rc (git-61f68662421142d2) opening log file. May 14 20:23:18.000 [notice] Parsing GEOIP IPv4 file /usr/share/tor/geoip. May 14 20:23:18.000 [notice] Parsing GEOIP IPv6 file /usr/share/tor/geoip6. May 14 20:23:18.000 [notice] Configured to measure statistics. Look for the *-stats files that will first be written to the data directory in 24 hours from now. ``` We are operation the * [Exit Servers](https://atlas.torproject.org/#search/digiges) since years and they became unstable about a week ago. No changes other the normal software updates where performed. Other than the two tor processes an local unbound resolver and ssh are running. It feels like an memory leak. Since tor is the one getting killed I assume the problem would be the tor software. Any thoughts ? **Trac**: **Username**: DeS
issue