Frequent OOM kills of tor process
We are currently expierencing frequent Out of memory kills of or tor processes on two boxes running ubuntu 14.04.
root@tor1:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04.5 LTS
Release: 14.04
Codename: trusty
We experience this on following versions:
0.3.0.5-rc-1trusty+1 and 0.2.9.10-1trusty+1
I changed to the rc after the normal 2.9.10-1 Version keept having this problems. The 0.3.0.5-rc has the same problems.
From syslog
May 14 10:21:42 tor1 kernel: [123517.565688] UDP: bad checksum. From 82.247.214.125:51413 to 176.10.104.240:21446 ulen 38
May 14 11:02:47 tor1 kernel: [125982.993268] tor invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
May 14 11:02:47 tor1 kernel: [125982.993270] tor cpuset=/ mems_allowed=0
May 14 11:02:47 tor1 kernel: [125982.993272] CPU: 2 PID: 33180 Comm: tor Not tainted 3.13.0-117-generic !#164-Ubuntu
May 14 11:02:47 tor1 kernel: [125982.993273] Hardware name: HP ProLiant ML310e Gen8 v2, BIOS P78 03/28/2014
May 14 11:02:47 tor1 kernel: [125982.993274] 0000000000000000 ffff8800a8679990 ffffffff8172d1c9 ffff88013c868000
May 14 11:02:47 tor1 kernel: [125982.993276] 0000000000000000 ffff8800a8679a18 ffffffff81727768 ffffffff8106a926
May 14 11:02:47 tor1 kernel: [125982.993278] ffff8800a86799f0 ffffffff810cb27c 0000000001320122 0000000000000000
May 14 11:02:47 tor1 kernel: [125982.993279] Call Trace:
May 14 11:02:47 tor1 kernel: [125982.993285] [<ffffffff8172d1c9>] dump_stack+0x64/0x82
May 14 11:02:47 tor1 kernel: [125982.993286] [<ffffffff81727768>] dump_header+0x7f/0x1f1
May 14 11:02:47 tor1 kernel: [125982.993289] [<ffffffff8106a926>] ? put_online_cpus+0x56/0x80
May 14 11:02:47 tor1 kernel: [125982.993291] [<ffffffff810cb27c>] ? rcu_oom_notify+0xcc/0xf0
May 14 11:02:47 tor1 kernel: [125982.993294] [<ffffffff81156d81>] oom_kill_process+0x201/0x360
May 14 11:02:47 tor1 kernel: [125982.993297] [<ffffffff812dde35>] ? security_capable_noaudit+0x15/0x20
May 14 11:02:47 tor1 kernel: [125982.993299] [<ffffffff81157511>] out_of_memory+0x471/0x4b0
May 14 11:02:47 tor1 kernel: [125982.993301] [<ffffffff8115d82c>] !__alloc_pages_nodemask+0xa6c/0xb90
May 14 11:02:47 tor1 kernel: [125982.993304] [<ffffffff8119e25a>] alloc_pages_vma+0x9a/0x140
May 14 11:02:47 tor1 kernel: [125982.993307] [<ffffffff811908eb>] read_swap_cache_async+0xeb/0x160
May 14 11:02:47 tor1 kernel: [125982.993308] [<ffffffff811909f8>] swapin_readahead+0x98/0xe0
May 14 11:02:47 tor1 kernel: [125982.993320] [<ffffffff8117e90b>] handle_mm_fault+0xa7b/0xf10
May 14 11:02:47 tor1 kernel: [125982.993322] [<ffffffff81739344>] !__do_page_fault+0x184/0x560
May 14 11:02:47 tor1 kernel: [125982.993325] [<ffffffff810a3af5>] ? set_next_entity+0x95/0xb0
May 14 11:02:47 tor1 kernel: [125982.993328] [<ffffffff810135db>] ? !__switch_to+0x16b/0x4f0
May 14 11:02:47 tor1 kernel: [125982.993329] [<ffffffff8173973a>] do_page_fault+0x1a/0x70
May 14 11:02:47 tor1 kernel: [125982.993331] [<ffffffff81735a68>] page_fault+0x28/0x30
May 14 11:02:47 tor1 kernel: [125982.993350] Mem-Info:
May 14 11:02:47 tor1 kernel: [125982.993351] Node 0 DMA per-cpu:
May 14 11:02:47 tor1 kernel: [125982.993352] CPU 0: hi: 0, btch: 1 usd: 0
May 14 11:02:47 tor1 kernel: [125982.993353] CPU 1: hi: 0, btch: 1 usd: 0
May 14 11:02:47 tor1 kernel: [125982.993353] CPU 2: hi: 0, btch: 1 usd: 0
May 14 11:02:47 tor1 kernel: [125982.993354] CPU 3: hi: 0, btch: 1 usd: 0
May 14 11:02:47 tor1 kernel: [125982.993355] Node 0 DMA32 per-cpu:
May 14 11:02:47 tor1 kernel: [125982.993356] CPU 0: hi: 186, btch: 31 usd: 25
May 14 11:02:47 tor1 kernel: [125982.993356] CPU 1: hi: 186, btch: 31 usd: 2
May 14 11:02:47 tor1 kernel: [125982.993357] CPU 2: hi: 186, btch: 31 usd: 0
May 14 11:02:47 tor1 kernel: [125982.993358] CPU 3: hi: 186, btch: 31 usd: 26
May 14 11:02:47 tor1 kernel: [125982.993358] Node 0 Normal per-cpu:
May 14 11:02:47 tor1 kernel: [125982.993359] CPU 0: hi: 186, btch: 31 usd: 0
May 14 11:02:47 tor1 kernel: [125982.993360] CPU 1: hi: 186, btch: 31 usd: 0
May 14 11:02:47 tor1 kernel: [125982.993360] CPU 2: hi: 186, btch: 31 usd: 0
May 14 11:02:47 tor1 kernel: [125982.993361] CPU 3: hi: 186, btch: 31 usd: 0
May 14 11:02:47 tor1 kernel: [125982.993363] active_!anon:0 inactive_!anon:0 isolated_!anon:0
May 14 11:02:47 tor1 kernel: [125982.993363] active_!file:141 inactive_!file:204 isolated_!file:0
May 14 11:02:47 tor1 kernel: [125982.993363] !unevictable:0 !dirty:0 !writeback:0 !unstable:0
May 14 11:02:47 tor1 kernel: [125982.993363] !free:20517 slab_!reclaimable:7176 slab_!unreclaimable:41948
May 14 11:02:47 tor1 kernel: [125982.993363] !mapped:80 !shmem:0 !pagetables:692 !bounce:0
May 14 11:02:47 tor1 kernel: [125982.993363] free_!cma:0
May 14 11:02:47 tor1 kernel: [125982.993365] Node 0 DMA !free:15344kB !min:268kB !low:332kB !high:400kB active_!anon:0kB inactive_!anon:0kB active_!file:4kB inactive_!file:28kB !unevictable:0kB isolated(anon):0kB isolated(file):0kB !present:15976kB !managed:15892kB !mlocked:0kB !dirty:0kB !writeback:8kB !mapped:36kB !shmem:0kB slab_!reclaimable:0kB slab_!unreclaimable:156kB kernel_!stack:0kB !pagetables:4kB !unstable:0kB !bounce:0kB free_!cma:0kB writeback_!tmp:0kB pages_!scanned:18 all_unreclaimable? no
May 14 11:02:47 tor1 kernel: [125982.993368] lowmem_reserve[]: 0 2688 3771 3771
May 14 11:02:47 tor1 kernel: [125982.993370] Node 0 DMA32 !free:51080kB !min:46532kB !low:58164kB !high:69796kB active_!anon:0kB inactive_!anon:0kB active_!file:516kB inactive_!file:788kB !unevictable:0kB isolated(anon):0kB isolated(file):0kB !present:2832272kB !managed:2753204kB !mlocked:0kB !dirty:0kB !writeback:0kB !mapped:284kB !shmem:0kB slab_!reclaimable:15612kB slab_!unreclaimable:83432kB kernel_!stack:312kB !pagetables:2064kB !unstable:0kB !bounce:0kB free_!cma:0kB writeback_!tmp:0kB pages_!scanned:17 all_unreclaimable? no
May 14 11:02:47 tor1 kernel: [125982.993372] lowmem_reserve[]: 0 0 1082 1082
May 14 11:02:47 tor1 kernel: [125982.993373] Node 0 Normal !free:15644kB !min:18732kB !low:23412kB !high:28096kB active_!anon:0kB inactive_!anon:0kB active_!file:44kB inactive_!file:0kB !unevictable:0kB isolated(anon):0kB isolated(file):0kB !present:1179644kB !managed:1108484kB !mlocked:0kB !dirty:0kB !writeback:0kB !mapped:0kB !shmem:0kB slab_!reclaimable:13092kB slab_!unreclaimable:84204kB kernel_!stack:2072kB !pagetables:700kB !unstable:0kB !bounce:0kB free_!cma:0kB writeback_!tmp:0kB pages_!scanned:203 all_unreclaimable? yes
May 14 11:02:47 tor1 kernel: [125982.993376] lowmem_reserve[]: 0 0 0 0
May 14 11:02:47 tor1 kernel: [125982.993377] Node 0 DMA: 3*4kB (M) 3*8kB (UM) 2*16kB (UM) 0*32kB 1*64kB (U) 1*128kB (M) 1*256kB (M) 1*512kB (M) 2*1024kB (UM) 2*2048kB (MR) 2*4096kB (M) = 15364kB
May 14 11:02:47 tor1 kernel: [125982.993384] Node 0 DMA32: 303*4kB (UE) 662*8kB (UEM) 418*16kB (UEM) 952*32kB (UEM) 51*64kB (UM) 1*128kB (M) 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB (R) = 51148kB
May 14 11:02:47 tor1 kernel: [125982.993394] Node 0 Normal: 375*4kB (UEM) 330*8kB (UEM) 649*16kB (UEM) 3*32kB (M) 2*64kB (R) 1*128kB (R) 1*256kB (R) 1*512kB (R) 0*1024kB 0*2048kB 0*4096kB = 15644kB
May 14 11:02:47 tor1 kernel: [125982.993401] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
May 14 11:02:47 tor1 kernel: [125982.993401] 385 total pagecache pages
May 14 11:02:47 tor1 kernel: [125982.993402] 20 pages in swap cache
May 14 11:02:47 tor1 kernel: [125982.993403] Swap cache stats: add 2432647, delete 2432627, find 700250/1142133
May 14 11:02:47 tor1 kernel: [125982.993404] Free swap = 3627940kB
May 14 11:02:47 tor1 kernel: [125982.993404] Total swap = 4026364kB
May 14 11:02:47 tor1 kernel: [125982.993405] 1006973 pages RAM
May 14 11:02:47 tor1 kernel: [125982.993405] 0 pages HighMem/MovableOnly
May 14 11:02:47 tor1 kernel: [125982.993406] 17790 pages reserved
May 14 11:02:47 tor1 kernel: [125982.993406] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
May 14 11:02:47 tor1 kernel: [125982.993422] ![32898] 0 32898 5073 0 13 251 0 upstart-udev-br
May 14 11:02:47 tor1 kernel: [125982.993423] ![32903] 0 32903 12900 1 27 228 -1000 systemd-udevd
May 14 11:02:47 tor1 kernel: [125982.993425] ![33049] 0 33049 3821 0 11 48 0 upstart-file-br
May 14 11:02:47 tor1 kernel: [125982.993426] ![33092] 102 33092 9813 1 23 95 0 dbus-daemon
May 14 11:02:47 tor1 kernel: [125982.993436] ![33096] 101 33096 63962 0 28 722 0 rsyslogd
May 14 11:02:47 tor1 kernel: [125982.993437] ![33119] 0 33119 10864 1 26 88 0 systemd-logind
May 14 11:02:47 tor1 kernel: [125982.993439] ![33206] 0 33206 3817 0 12 55 0 upstart-socket-
May 14 11:02:47 tor1 kernel: [125982.993440] ![33053] 0 33053 3699 1 12 41 0 getty
May 14 11:02:47 tor1 kernel: [125982.993441] ![33056] 0 33056 3699 1 12 41 0 getty
May 14 11:02:47 tor1 kernel: [125982.993442] ![33061] 0 33061 3699 1 12 39 0 getty
May 14 11:02:47 tor1 kernel: [125982.993443] ![33062] 0 33062 3699 1 11 40 0 getty
May 14 11:02:47 tor1 kernel: [125982.993444] ![33064] 0 33064 3699 1 12 39 0 getty
May 14 11:02:47 tor1 kernel: [125982.993446] ![33079] 0 33079 1094 0 8 35 0 acpid
May 14 11:02:47 tor1 kernel: [125982.993447] ![33089] 0 33089 15347 1 33 171 -1000 sshd
May 14 11:02:47 tor1 kernel: [125982.993448] ![33094] 0 33094 5915 0 17 81 0 cron
May 14 11:02:47 tor1 kernel: [125982.993449] ![33095] 0 33095 4786 0 12 48 0 atd
May 14 11:02:47 tor1 kernel: [125982.993450] ![33155] 0 33155 4801 34 13 73 0 irqbalance
May 14 11:02:47 tor1 kernel: [125982.993457] ![33180] 106 33180 111003 0 152 50520 0 tor
May 14 11:02:47 tor1 kernel: [125982.993458] ![33317] 108 33317 20352 0 41 9936 0 unbound
May 14 11:02:47 tor1 kernel: [125982.993459] ![33369] 105 33369 1871 12 9 33 0 vnstatd
May 14 11:02:47 tor1 kernel: [125982.993460] ![32890] 0 32890 3699 1 11 40 0 getty
May 14 11:02:47 tor1 kernel: [125982.993461] ![32991] 1002 32991 2867 0 10 84 0 memlog.sh
May 14 11:02:47 tor1 kernel: [125982.993462] ![33057] 106 33057 97196 0 128 36538 0 tor
May 14 11:02:47 tor1 kernel: [125982.993464] ![57323] 1002 57323 2867 0 10 91 0 memlog.sh
May 14 11:02:47 tor1 kernel: [125982.993465] ![57325] 1002 57325 2218 78 10 32 0 grep
May 14 11:02:47 tor1 kernel: [125982.993466] ![57326] 1002 57326 2746 44 11 46 0 sed
May 14 11:02:47 tor1 kernel: [125982.993467] Out of memory: Kill process 33180 (tor) score 25 or sacrifice child
May 14 11:02:47 tor1 kernel: [125982.993478] Killed process 33180 (tor) !total-vm:444012kB, !anon-rss:0kB, !file-rss:0kB
notices.log does not show anything around the kill time. First record ist the restart of the process at 20:32
May 14 06:54:03.000 [notice] Tor 0.3.0.5-rc (git-61f68662421142d2) opening new log file.
May 14 07:13:01.000 [warn] eventdns: All nameservers have failed
May 14 07:13:01.000 [notice] eventdns: Nameserver 127.0.0.1:53 is back up
May 14 07:47:36.000 [warn] eventdns: All nameservers have failed
May 14 07:47:36.000 [notice] eventdns: Nameserver 127.0.0.1:53 is back up
May 14 08:10:38.000 [warn] eventdns: All nameservers have failed
May 14 08:10:38.000 [notice] eventdns: Nameserver 127.0.0.1:53 is back up
May 14 08:12:28.000 [warn] eventdns: All nameservers have failed
May 14 08:12:28.000 [notice] eventdns: Nameserver 127.0.0.1:53 is back up
May 14 08:20:08.000 [warn] eventdns: All nameservers have failed
May 14 08:20:09.000 [notice] eventdns: Nameserver 127.0.0.1:53 is back up
May 14 20:23:18.000 [notice] Tor 0.3.0.5-rc (git-61f68662421142d2) opening log file.
May 14 20:23:18.000 [notice] Parsing GEOIP IPv4 file /usr/share/tor/geoip.
May 14 20:23:18.000 [notice] Parsing GEOIP IPv6 file /usr/share/tor/geoip6.
May 14 20:23:18.000 [notice] Configured to measure statistics. Look for the *-stats files that will first be written to the data directory in 24 hours from now.
We are operation the * Exit Servers since years and they became unstable about a week ago. No changes other the normal software updates where performed.
Other than the two tor processes an local unbound resolver and ssh are running.
It feels like an memory leak. Since tor is the one getting killed I assume the problem would be the tor software. Any thoughts ?
Trac:
Username: DeS