There is an OOM handling bug in Ubuntu kernel version 4.4.0-59, which you appear to be running: link . You can either revert to the older kernel or download the new, fixed kernel that was posted there.
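To confirm whether a machine is actually running the build named above, something along these lines could be used. It is only a minimal sketch: the helper name `is_affected` is made up for illustration, and it assumes that only the 4.4.0-59 build should be flagged (it makes no claim about other builds).

```python
import platform

# Assumption: only the build named in the answer is flagged here;
# this sketch does not know about any other affected or fixed builds.
AFFECTED_BUILD = "4.4.0-59"


def is_affected(release: str) -> bool:
    """Return True if a `uname -r` style string names the 4.4.0-59 build."""
    # Match "4.4.0-59" exactly or followed by a flavour suffix such as
    # "-generic", so that e.g. "4.4.0-590-generic" is not flagged by mistake.
    return release == AFFECTED_BUILD or release.startswith(AFFECTED_BUILD + "-")


if __name__ == "__main__":
    release = platform.release()  # same string that `uname -r` reports on Linux
    if is_affected(release):
        print(f"{release}: this is the build with the reported OOM bug")
    else:
        print(f"{release}: not the 4.4.0-59 build")
```

The release string in the log below (`4.4.0-59-generic`) would be flagged by this check.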
I am having a problem with the OOM killer killing my application even though plenty of swap is free:
Jan 21 06:25:19[166423.248706] Free swap = 3997348kB
Jan 21 06:25:19[166423.248708] Total swap = 4194300kB
I have read "Linux not using swap but OOM killer is triggered", and I have tested that my system can allocate more than 4 GB by running:
stress --vm 1 --vm-bytes 4096M --timeout 10s
So what could be the reason for the OOM?
Jan 21 06:25:19[166423.248287] lxcfs invoked oom-killer: gfp_mask=0x26000c0, order=2, oom_score_adj=0
Jan 21 06:25:19[166423.248295] lxcfs cpuset=/ mems_allowed=0
Jan 21 06:25:19[166423.248334] CPU: 0 PID: 9532 Comm: lxcfs Not tainted 4.4.0-59-generic #80-Ubuntu
Jan 21 06:25:19[166423.248337] Hardware name: DigitalOcean Droplet, BIOS 20161103 11/03/2016
Jan 21 06:25:19[166423.248340] 0000000000000286 0000000061807b60 ffff88001be53af0 ffffffff813f7583
Jan 21 06:25:19[166423.248348] ffff88001be53cc8 ffff88001d247000 ffff88001be53b60 ffffffff8120ad5e
Jan 21 06:25:19[166423.248352] ffffffff81cd2dc7 0000000000000000 ffffffff81e67760 0000000000000206
Jan 21 06:25:19[166423.248356] Call Trace:
Jan 21 06:25:19[166423.248424] [<ffffffff813f7583>] dump_stack+0x63/0x90
Jan 21 06:25:19[166423.248455] [<ffffffff8120ad5e>] dump_header+0x5a/0x1c5
Jan 21 06:25:19[166423.248474] [<ffffffff81192722>] oom_kill_process+0x202/0x3c0
Jan 21 06:25:19[166423.248477] [<ffffffff81192b49>] out_of_memory+0x219/0x460
Jan 21 06:25:19[166423.248487] [<ffffffff81198abd>] __alloc_pages_slowpath.constprop.88+0x8fd/0xa70
Jan 21 06:25:19[166423.248492] [<ffffffff81198eb6>] __alloc_pages_nodemask+0x286/0x2a0
Jan 21 06:25:19[166423.248496] [<ffffffff81198f6b>] alloc_kmem_pages_node+0x4b/0xc0
Jan 21 06:25:19[166423.248517] [<ffffffff8107ea5e>] copy_process+0x1be/0x1b70
Jan 21 06:25:19[166423.248536] [<ffffffff811c1db1>] ? handle_mm_fault+0x1421/0x1820
Jan 21 06:25:19[166423.248540] [<ffffffff810805a0>] _do_fork+0x80/0x360
Jan 21 06:25:19[166423.248544] [<ffffffff81080929>] SyS_clone+0x19/0x20
Jan 21 06:25:19[166423.248575] [<ffffffff818384f2>] entry_SYSCALL_64_fastpath+0x16/0x71
Jan 21 06:25:19[166423.248579] Mem-Info:
Jan 21 06:25:19[166423.248592] active_anon:37699 inactive_anon:38637 isolated_anon:0
Jan 21 06:25:19[166423.248592] active_file:12790 inactive_file:10954 isolated_file:0
Jan 21 06:25:19[166423.248592] unevictable:914 dirty:933 writeback:0 unstable:0
Jan 21 06:25:19[166423.248592] slab_reclaimable:11345 slab_unreclaimable:3766
Jan 21 06:25:19[166423.248592] mapped:15752 shmem:6941 pagetables:1685 bounce:0
Jan 21 06:25:19[166423.248592] free:2974 free_pcp:0 free_cma:0
Jan 21 06:25:19[166423.248601] Node 0 DMA free:2052kB min:88kB low:108kB high:132kB active_anon:3096kB inactive_anon:3844kB active_file:1736kB inactive_file:1496kB unevictable:164kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:164kB dirty:40kB writeback:0kB mapped:3120kB shmem:828kB slab_reclaimable:2300kB slab_unreclaimable:432kB kernel_stack:176kB pagetables:172kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jan 21 06:25:19[166423.248615] lowmem_reserve[]: 0 455 455 455 455
Jan 21 06:25:19[166423.248635] Node 0 DMA32 free:9844kB min:2684kB low:3352kB high:4024kB active_anon:147700kB inactive_anon:150704kB active_file:49424kB inactive_file:42320kB unevictable:3492kB isolated(anon):0kB isolated(file):0kB present:507896kB managed:484228kB mlocked:3492kB dirty:3692kB writeback:0kB mapped:59888kB shmem:26936kB slab_reclaimable:43080kB slab_unreclaimable:14632kB kernel_stack:4048kB pagetables:6568kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jan 21 06:25:19[166423.248646] lowmem_reserve[]: 0 0 0 0 0
Jan 21 06:25:19[166423.248651] Node 0 DMA: 339*4kB (UME) 87*8kB (UME) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2052kB
Jan 21 06:25:19[166423.248668] Node 0 DMA32: 2065*4kB (UMEH) 121*8kB (UMEH) 8*16kB (H) 2*32kB (H) 1*64kB (H) 1*128kB (H) 1*256kB (H) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 9868kB
Jan 21 06:25:19[166423.248695] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Jan 21 06:25:19[166423.248697] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jan 21 06:25:19[166423.248699] 34679 total pagecache pages
Jan 21 06:25:19[166423.248702] 3393 pages in swap cache
Jan 21 06:25:19[166423.248704] Swap cache stats: add 3146727, delete 3143334, find 1829619/2085514
Jan 21 06:25:19[166423.248706] Free swap = 3997348kB
Jan 21 06:25:19[166423.248708] Total swap = 4194300kB
Jan 21 06:25:19[166423.248710] 130972 pages RAM
Jan 21 06:25:19[166423.248711] 0 pages HighMem/MovableOnly
Jan 21 06:25:19[166423.248713] 5938 pages reserved
Jan 21 06:25:19[166423.248715] 0 pages cma reserved
Jan 21 06:25:19[166423.248716] 0 pages hwpoisoned
Jan 21 06:25:19[166423.248718] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
Jan 21 06:25:19[166423.248742] [ 666] 0 666 9072 524 21 3 586 0 systemd-journal
Jan 21 06:25:19[166423.248748] [ 709] 0 709 25742 399 19 3 15 0 lvmetad
Jan 21 06:25:19[166423.248760] [ 750] 0 750 10626 355 22 3 221 -1000 systemd-udevd
Jan 21 06:25:19[166423.248765] [ 823] 100 823 25081 321 19 3 64 0 systemd-timesyn
Jan 21 06:25:19[166423.248770] [ 1646] 0 1646 1306 8 8 3 22 0 iscsid
Jan 21 06:25:19[166423.248775] [ 1647] 0 1647 1431 877 8 3 0 -17 iscsid
Jan 21 06:25:19[166423.248779] [ 1652] 104 1652 64099 413 27 3 199 0 rsyslogd
Jan 21 06:25:19[166423.248783] [ 1653] 107 1653 10726 510 26 4 59 -900 dbus-daemon
Jan 21 06:25:19[166423.248787] [ 1663] 65534 1663 29630 2839 21 6 192 0 do-agent
Jan 21 06:25:19[166423.248792] [ 1673] 0 1673 68622 101 36 3 83 0 accounts-daemon
Jan 21 06:25:19[166423.248796] [ 1676] 0 1676 6511 374 18 3 26 0 atd
Jan 21 06:25:19[166423.248801] [ 1682] 0 1682 1100 290 8 3 31 0 acpid
Jan 21 06:25:19[166423.248805] [ 1687] 0 1687 7137 482 19 3 43 0 systemd-logind
Jan 21 06:25:19[166423.248809] [ 1700] 0 1700 6932 458 18 3 45 0 cron
Jan 21 06:25:19[166423.248813] [ 1707] 0 1707 67816 3034 27 5 218 0 filebeat
Jan 21 06:25:19[166423.248817] [ 1710] 0 1710 159019 151 32 4 180 0 lxcfs
Jan 21 06:25:19[166423.248821] [ 1714] 0 1714 102162 1369 31 5 222 0 topbeat
Jan 21 06:25:19[166423.248825] [ 1717] 0 1717 51580 43 27 6 1559 0 snapd
Jan 21 06:25:19[166423.248835] [ 1723] 0 1723 16380 555 35 3 153 -1000 sshd
Jan 21 06:25:19[166423.248840] [ 1759] 0 1759 9083 474 19 3 112 0 openvpn
Jan 21 06:25:19[166423.248844] [ 1845] 0 1845 69295 449 38 3 57 0 polkitd
Jan 21 06:25:19[166423.248848] [ 1847] 0 1847 3665 305 12 3 38 0 agetty
Jan 21 06:25:19[166423.248852] [ 1851] 0 1851 3619 366 12 3 37 0 agetty
Jan 21 06:25:19[166423.248856] [ 1900] 0 1900 3344 11 11 3 27 0 mdadm
Jan 21 06:25:19[166423.248873] [ 1988] 0 1988 22479 1404 48 3 57 0 apache2
Jan 21 06:25:19[166423.248878] [ 1993] 112 1993 73346 599 69 4 393 -900 postgres
Jan 21 06:25:19[166423.248882] [ 2062] 112 2062 73379 4431 76 4 410 0 postgres
Jan 21 06:25:19[166423.248887] [ 2063] 112 2063 73346 296 60 4 411 0 postgres
Jan 21 06:25:19[166423.248891] [ 2064] 112 2064 73346 1227 61 4 418 0 postgres
Jan 21 06:25:19[166423.248895] [ 2065] 112 2065 73440 884 64 4 411 0 postgres
Jan 21 06:25:19[166423.248899] [ 2066] 112 2066 37126 104 57 3 404 0 postgres
Jan 21 06:25:19[166423.248904] [15573] 113 15573 788794 61244 257 6 35959 0 java
Jan 21 06:25:19[166423.248909] [15617] 112 15617 74234 5035 77 4 865 0 postgres
Jan 21 06:25:19[166423.248913] [17217] 112 17217 74302 5378 78 4 613 0 postgres
Jan 21 06:25:19[166423.248918] [19541] 112 19541 74234 7416 79 4 459 0 postgres
Jan 21 06:25:19[166423.248923] [22045] 0 22045 12235 617 29 3 11 0 cron
Jan 21 06:25:19[166423.248927] [22047] 0 22047 1127 168 8 3 0 0 sh
Jan 21 06:25:19[166423.248931] [22050] 0 22050 1092 363 7 3 0 0 run-parts
Jan 21 06:25:19[166423.248936] [22186] 33 22186 94770 951 75 3 51 0 apache2
Jan 21 06:25:19[166423.249448] [22187] 33 22187 94770 951 75 3 51 0 apache2
Jan 21 06:25:19[166423.249678] [22247] 0 22247 2810 641 11 3 0 0 mlocate
Jan 21 06:25:19[166423.249682] [22252] 0 22252 2564 83 9 3 0 0 flock
Jan 21 06:25:19[166423.249918] [22253] 0 22253 1523 469 8 3 0 0 updatedb.mlocat
Jan 21 06:25:19[166423.249921] Out of memory: Kill process 15573 (java) score 83 or sacrifice child
Jan 21 06:25:19[166423.267860] Killed process 15573 (java) total-vm:3155176kB, anon-rss:238184kB, file-rss:6792kB
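The first line of the log shows that the failing request was `order=2`, i.e. four contiguous 4 kB pages (16 kB), so the per-zone free-list lines are more relevant than the swap totals. As a rough sketch, the free-list lines from the log can be tallied like this; the function names are made up for illustration, and a 4 kB page size is assumed because that is the smallest block the log reports.

```python
import re

# Matches entries such as "2065*4kB" (count * block size in kB).
BLOCK = re.compile(r"(\d+)\*(\d+)kB")


def parse_buddy_line(line: str) -> dict:
    """Return {block_size_kB: free_block_count} for one 'Node N ZONE:' line."""
    return {int(size): int(count) for count, size in BLOCK.findall(line)}


def free_kb_at_or_above(blocks: dict, order: int, page_kb: int = 4) -> int:
    """Sum the free memory held in blocks of at least 2**order pages."""
    min_size_kb = page_kb << order  # order 2 with 4 kB pages -> 16 kB
    return sum(size * count for size, count in blocks.items() if size >= min_size_kb)


# Free-list lines copied from the log above (timestamps dropped).
DMA = ("Node 0 DMA: 339*4kB (UME) 87*8kB (UME) 0*16kB 0*32kB 0*64kB 0*128kB "
       "0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2052kB")
DMA32 = ("Node 0 DMA32: 2065*4kB (UMEH) 121*8kB (UMEH) 8*16kB (H) 2*32kB (H) "
         "1*64kB (H) 1*128kB (H) 1*256kB (H) 0*512kB 0*1024kB 0*2048kB "
         "0*4096kB = 9868kB")

for name, line in (("DMA", DMA), ("DMA32", DMA32)):
    blocks = parse_buddy_line(line)
    total_kb = free_kb_at_or_above(blocks, 0)
    order2_kb = free_kb_at_or_above(blocks, 2)
    print(f"{name}: {total_kb} kB free in total, {order2_kb} kB in blocks of 16 kB or larger")
```

For these two lines the totals come out at 2052 kB and 9868 kB, matching the sums the kernel printed, while only 0 kB and 640 kB respectively sit in blocks large enough for an order-2 request. This is just arithmetic on the log; on its own it does not explain why the kernel chose to kill a process.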