Why was the Linux OOM killer triggered before all swap was used?

I have a problem where my Java process gets killed by the kernel's OOM killer. I'm not sure why this is happening, because according to the syslog I still had free swap space:

Jan 15 08:52:24 xyz-server kernel: Free swap = 3885844kB
Jan 15 08:52:24 xyz-server kernel: Total swap = 4194296kB
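To put those two lines in numbers, here is a minimal sketch that parses them and reports how much swap was actually in use when the OOM killer fired. The function name `parse_swap_kb` is only an illustration; the two log lines are copied from the excerpt above.

```python
import re

# The two swap lines from the OOM report, copied verbatim.
LOG_LINES = [
    "Jan 15 08:52:24 xyz-server kernel: Free swap = 3885844kB",
    "Jan 15 08:52:24 xyz-server kernel: Total swap = 4194296kB",
]


def parse_swap_kb(lines):
    """Return (free_kb, total_kb) parsed from the swap lines of an OOM report."""
    values = {}
    for line in lines:
        match = re.search(r"(Free|Total) swap\s*=\s*(\d+)kB", line)
        if match:
            values[match.group(1)] = int(match.group(2))
    return values["Free"], values["Total"]


free_kb, total_kb = parse_swap_kb(LOG_LINES)
used_kb = total_kb - free_kb
used_pct = 100 * used_kb / total_kb
print(f"swap used: {used_kb} kB ({used_pct:.1f}% of total)")
```

For this report that works out to 308452 kB in use, so well over 90% of the swap space was still free.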

I have set vm.swappiness to 0. My understanding was that this means the kernel will swap only when that lets it avoid an OOM situation, so I thought it would be fine. Was that a bad idea?

I'm running CentOS 6 and have attached the full syslog below:

Jan 15 08:52:24 xyz-server kernel: shibd invoked oom-killer: gfp_mask=0xd0, order=1, oom_adj=0, oom_score_adj=0
Jan 15 08:52:24 xyz-server kernel: shibd cpuset=/ mems_allowed=0
Jan 15 08:52:24 xyz-server kernel: Pid: 18630, comm: shibd Tainted: G W --------------- 2.6.32-358.14.1.el6.x86_64 #1
Jan 15 08:52:24 xyz-server kernel: Call Trace:
Jan 15 08:52:24 xyz-server kernel: [<ffffffff810cb561>] ? cpuset_print_task_mems_allowed+0x91/0xb0
Jan 15 08:52:24 xyz-server kernel: [<ffffffff8111cd80>] ? dump_header+0x90/0x1b0
Jan 15 08:52:24 xyz-server kernel: [<ffffffff8121d2ac>] ? security_real_capable_noaudit+0x3c/0x70
Jan 15 08:52:24 xyz-server kernel: [<ffffffff8111d202>] ? oom_kill_process+0x82/0x2a0
Jan 15 08:52:24 xyz-server kernel: [<ffffffff8111d141>] ? select_bad_process+0xe1/0x120
Jan 15 08:52:24 xyz-server kernel: [<ffffffff8111d640>] ? out_of_memory+0x220/0x3c0
Jan 15 08:52:24 xyz-server kernel: [<ffffffff8112c2ec>] ? __alloc_pages_nodemask+0x8ac/0x8d0
Jan 15 08:52:24 xyz-server kernel: [<ffffffff811609ea>] ? alloc_pages_current+0xaa/0x110
Jan 15 08:52:24 xyz-server kernel: [<ffffffff81129cce>] ? __get_free_pages+0xe/0x50
Jan 15 08:52:24 xyz-server kernel: [<ffffffff8106bf14>] ? copy_process+0xe4/0x1450
Jan 15 08:52:24 xyz-server kernel: [<ffffffff8104759c>] ? __do_page_fault+0x1ec/0x480
Jan 15 08:52:24 xyz-server kernel: [<ffffffff8106d314>] ? do_fork+0x94/0x460
Jan 15 08:52:24 xyz-server kernel: [<ffffffff81009598>] ? sys_clone+0x28/0x30
Jan 15 08:52:24 xyz-server kernel: [<ffffffff8100b393>] ? stub_clone+0x13/0x20
Jan 15 08:52:24 xyz-server kernel: [<ffffffff8100b072>] ? system_call_fastpath+0x16/0x1b
Jan 15 08:52:24 xyz-server kernel: Mem-Info:
Jan 15 08:52:24 xyz-server kernel: Node 0 DMA per-cpu:
Jan 15 08:52:24 xyz-server kernel: CPU 0: hi: 0, btch: 1 usd: 0
Jan 15 08:52:24 xyz-server kernel: CPU 1: hi: 0, btch: 1 usd: 0
Jan 15 08:52:24 xyz-server kernel: CPU 2: hi: 0, btch: 1 usd: 0
Jan 15 08:52:24 xyz-server kernel: CPU 3: hi: 0, btch: 1 usd: 0
Jan 15 08:52:24 xyz-server kernel: Node 0 DMA32 per-cpu:
Jan 15 08:52:24 xyz-server kernel: CPU 0: hi: 186, btch: 31 usd: 0
Jan 15 08:52:24 xyz-server kernel: CPU 1: hi: 186, btch: 31 usd: 0
Jan 15 08:52:24 xyz-server kernel: CPU 2: hi: 186, btch: 31 usd: 0
Jan 15 08:52:24 xyz-server kernel: CPU 3: hi: 186, btch: 31 usd: 0
Jan 15 08:52:24 xyz-server kernel: Node 0 Normal per-cpu:
Jan 15 08:52:24 xyz-server kernel: CPU 0: hi: 186, btch: 31 usd: 0
Jan 15 08:52:24 xyz-server kernel: CPU 1: hi: 186, btch: 31 usd: 0
Jan 15 08:52:24 xyz-server kernel: CPU 2: hi: 186, btch: 31 usd: 0
Jan 15 08:52:24 xyz-server kernel: CPU 3: hi: 186, btch: 31 usd: 0
Jan 15 08:52:24 xyz-server kernel: active_anon:3079090 inactive_anon:392870 isolated_anon:10
Jan 15 08:52:24 xyz-server kernel: active_file:51 inactive_file:131 isolated_file:0
Jan 15 08:52:24 xyz-server kernel: unevictable:0 dirty:0 writeback:2 unstable:0
Jan 15 08:52:24 xyz-server kernel: free:30217 slab_reclaimable:13388 slab_unreclaimable:11090
Jan 15 08:52:24 xyz-server kernel: mapped:81 shmem:142 pagetables:13866 bounce:0
Jan 15 08:52:24 xyz-server kernel: Node 0 DMA free:15528kB min:68kB low:84kB high:100kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15136kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jan 15 08:52:24 xyz-server kernel: lowmem_reserve[]: 0 3000 5020 5020
Jan 15 08:52:24 xyz-server kernel: Node 0 DMA32 free:22312kB min:14224kB low:17780kB high:21336kB active_anon:2192960kB inactive_anon:559136kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3072096kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:1228kB slab_unreclaimable:1940kB kernel_stack:616kB pagetables:716kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jan 15 08:52:24 xyz-server kernel: lowmem_reserve[]: 0 0 2020 2020
Jan 15 08:52:24 xyz-server kernel: Node 0 Normal free:83028kB min:53284kB low:66604kB high:79924kB active_anon:10123400kB inactive_anon:1012344kB active_file:204kB inactive_file:524kB unevictable:0kB isolated(anon):40kB isolated(file):0kB present:11505664kB mlocked:0kB dirty:0kB writeback:8kB mapped:324kB shmem:568kB slab_reclaimable:52324kB slab_unreclaimable:42420kB kernel_stack:3688kB pagetables:54748kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Jan 15 08:52:24 xyz-server kernel: lowmem_reserve[]: 0 0 0 0
Jan 15 08:52:24 xyz-server kernel: Node 0 DMA: 2*4kB 2*8kB 1*16kB 2*32kB 1*64kB 0*128kB 0*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15528kB
Jan 15 08:52:24 xyz-server kernel: Node 0 DMA32: 99*4kB 114*8kB 141*16kB 108*32kB 85*64kB 49*128kB 12*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 22316kB
Jan 15 08:52:24 xyz-server kernel: Node 0 Normal: 14355*4kB 455*8kB 209*16kB 118*32kB 70*64kB 27*128kB 15*256kB 2*512kB 0*1024kB 1*2048kB 0*4096kB = 83028kB
Jan 15 08:52:24 xyz-server kernel: 7840 total pagecache pages
Jan 15 08:52:24 xyz-server kernel: 7519 pages in swap cache
Jan 15 08:52:24 xyz-server kernel: Swap cache stats: add 2995034, delete 2987515, find 314611560/314790727
Jan 15 08:52:24 xyz-server kernel: Free swap = 3885844kB
Jan 15 08:52:24 xyz-server kernel: Total swap = 4194296kB
Jan 15 08:52:24 xyz-server kernel: 3670000 pages RAM
Jan 15 08:52:24 xyz-server kernel: 71979 pages reserved
Jan 15 08:52:24 xyz-server kernel: 21635 pages shared
Jan 15 08:52:24 xyz-server kernel: 3528964 pages non-shared
Jan 15 08:52:24 xyz-server kernel: [ pid ] uid tgid total_vm rss cpu oom_adj oom_score_adj name
Jan 15 08:52:24 xyz-server kernel: [ 442] 0 442 2659 2 0 -17 -1000 udevd
Jan 15 08:52:24 xyz-server kernel: [ 1109] 0 1109 17425 81 3 0 0 vmtoolsd
Jan 15 08:52:24 xyz-server kernel: [ 1253] 0 1253 23299 2 1 -17 -1000 auditd
Jan 15 08:52:24 xyz-server kernel: [ 1269] 0 1269 62464 37 1 0 0 rsyslogd
Jan 15 08:52:24 xyz-server kernel: [ 1287] 32 1287 4743 1 0 0 0 rpcbind
Jan 15 08:52:24 xyz-server kernel: [ 1323] 29 1323 5836 1 0 0 0 rpc.statd
Jan 15 08:52:24 xyz-server kernel: [ 1351] 0 1351 6290 1 1 0 0 rpc.idmapd
Jan 15 08:52:24 xyz-server kernel: [ 1372] 81 1372 5895 1 1 0 0 dbus-daemon
Jan 15 08:52:24 xyz-server kernel: [ 1383] 70 1383 7434 2 3 0 0 avahi-daemon
Jan 15 08:52:24 xyz-server kernel: [ 1384] 70 1384 7434 1 1 0 0 avahi-daemon
Jan 15 08:52:24 xyz-server kernel: [ 1423] 0 1423 113067 1 0 0 0 automount
Jan 15 08:52:24 xyz-server kernel: [ 1443] 0 1443 16563 15 3 -17 -1000 sshd
Jan 15 08:52:24 xyz-server kernel: [ 1564] 0 1564 29312 8 1 0 0 crond
Jan 15 08:52:24 xyz-server kernel: [ 1572] 0 1572 6281 1 1 0 0 oddjobd
Jan 15 08:52:24 xyz-server kernel: [ 1604] 0 1604 1014 1 0 0 0 mingetty
Jan 15 08:52:24 xyz-server kernel: [ 1606] 0 1606 1014 1 0 0 0 mingetty
Jan 15 08:52:24 xyz-server kernel: [ 1608] 0 1608 1014 1 0 0 0 mingetty
Jan 15 08:52:24 xyz-server kernel: [ 1610] 0 1610 1014 1 0 0 0 mingetty
Jan 15 08:52:24 xyz-server kernel: [ 1612] 0 1612 1014 1 0 0 0 mingetty
Jan 15 08:52:24 xyz-server kernel: [ 2942] 0 2942 258414 1 1 0 0 console-kit-dae
Jan 15 08:52:24 xyz-server kernel: [23467] 38 23467 8059 17 1 0 0 ntpd
Jan 15 08:52:24 xyz-server kernel: [27532] 0 27532 2658 2 3 -17 -1000 udevd
Jan 15 08:52:24 xyz-server kernel: [ 2172] 0 2172 65647 620 1 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [ 8027] 0 8027 1014 1 1 0 0 mingetty
Jan 15 08:52:24 xyz-server kernel: [18630] 498 18630 636245 14471 3 0 0 shibd
Jan 15 08:52:24 xyz-server kernel: [18719] 0 18719 49858 30 0 0 0 sssd
Jan 15 08:52:24 xyz-server kernel: [18720] 0 18720 74218 2279 0 0 0 sssd_be
Jan 15 08:52:24 xyz-server kernel: [18721] 0 18721 50428 56 3 0 0 sssd_nss
Jan 15 08:52:24 xyz-server kernel: [18722] 0 18722 48008 5 2 0 0 sssd_pam
Jan 15 08:52:24 xyz-server kernel: [18723] 0 18723 48703 1 3 0 0 sssd_ssh
Jan 15 08:52:24 xyz-server kernel: [18724] 0 18724 47528 4 1 0 0 sssd_sudo
Jan 15 08:52:24 xyz-server kernel: [18725] 0 18725 52553 1 2 0 0 sssd_pac
Jan 15 08:52:24 xyz-server kernel: [18749] 0 18749 15560 1 1 0 0 certmonger
Jan 15 08:52:24 xyz-server kernel: [18849] 0 18849 20820 14 1 0 0 master
Jan 15 08:52:24 xyz-server kernel: [18852] 89 18852 20883 2 0 0 0 qmgr
Jan 15 08:52:24 xyz-server kernel: [23143] 500 23143 4285995 3415938 2 0 0 java
Jan 15 08:52:24 xyz-server kernel: [23198] 0 23198 2658 2 0 -17 -1000 udevd
Jan 15 08:52:24 xyz-server kernel: [18831] 0 18831 8062 61 0 0 0 rotatelogs
Jan 15 08:52:24 xyz-server kernel: [18838] 0 18838 8062 62 0 0 0 rotatelogs
Jan 15 08:52:24 xyz-server kernel: [21841] 48 21841 104759 1396 2 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22089] 48 22089 104759 1382 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22178] 48 22178 104759 1365 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22234] 48 22234 104759 1367 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22241] 48 22241 104759 1359 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22261] 48 22261 104759 1368 2 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22272] 48 22272 104759 1363 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22296] 48 22296 104759 1375 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22336] 48 22336 104759 1364 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22348] 48 22348 104759 1354 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22349] 48 22349 104759 1365 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22356] 48 22356 104759 1361 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22361] 48 22361 104759 1364 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22372] 48 22372 104759 1356 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22375] 48 22375 104759 1352 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22389] 48 22389 104759 1362 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22390] 48 22390 104759 1360 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22397] 48 22397 104759 1357 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22398] 48 22398 104759 1359 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22401] 48 22401 104759 1358 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22430] 89 22430 20840 218 0 0 0 pickup
Jan 15 08:52:24 xyz-server kernel: [22435] 48 22435 104759 1354 1 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22441] 48 22441 104759 1348 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22457] 48 22457 104759 1345 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22461] 48 22461 104759 1337 3 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22464] 48 22464 104713 1307 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22465] 48 22465 104759 1332 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22470] 48 22470 104759 1338 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22471] 48 22471 104759 1337 3 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22472] 48 22472 104759 1347 3 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22473] 48 22473 104713 1308 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22483] 48 22483 104713 1408 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22487] 48 22487 104759 1430 1 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22488] 48 22488 104713 1397 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22490] 48 22490 104759 1472 1 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22496] 48 22496 85768 1404 2 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22497] 48 22497 85768 1404 2 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22498] 48 22498 85768 1404 2 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: [22504] 48 22504 88329 1408 0 0 0 httpd
Jan 15 08:52:24 xyz-server kernel: Out of memory: Kill process 23143 (java) score 748 or sacrifice child
Jan 15 08:52:24 xyz-server kernel: Killed process 23143, UID 500, (java) total-vm:17143980kB, anon-rss:13663732kB, file-rss:16kB
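As a rough way to read the end of this report, the sketch below compares the killed process's `anon-rss` with the total RAM the kernel reported. It assumes 4 kB pages (the smallest block size in the free lists above); the helper names are hypothetical and the two input lines are copied from the log.

```python
import re

PAGE_SIZE_KB = 4  # assumption: 4 kB pages, matching the smallest free-list block

# Two lines copied verbatim from the OOM report.
REPORT = [
    "Jan 15 08:52:24 xyz-server kernel: 3670000 pages RAM",
    "Jan 15 08:52:24 xyz-server kernel: Killed process 23143, UID 500, (java) "
    "total-vm:17143980kB, anon-rss:13663732kB, file-rss:16kB",
]


def total_ram_kb(lines):
    """Convert the 'N pages RAM' line to kB."""
    for line in lines:
        match = re.search(r"(\d+) pages RAM", line)
        if match:
            return int(match.group(1)) * PAGE_SIZE_KB
    raise ValueError("no 'pages RAM' line found")


def killed_anon_rss_kb(lines):
    """Return the anon-rss value (kB) from the 'Killed process' line."""
    for line in lines:
        match = re.search(r"Killed process \d+.*anon-rss:(\d+)kB", line)
        if match:
            return int(match.group(1))
    raise ValueError("no 'Killed process' line found")


ram_kb = total_ram_kb(REPORT)
anon_kb = killed_anon_rss_kb(REPORT)
share = anon_kb / ram_kb
print(f"java anon-rss: {anon_kb} kB of {ram_kb} kB RAM ({share:.0%})")
```

Under that page-size assumption, the java process alone held roughly 93% of physical memory as anonymous pages.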
    
by palto 19.01.2015 / 15:53

1 answer

Your problem is clear from anon-rss:13663732kB. Whether a kernel allocation is allowed to sleep depends on its GFP (Get Free Page) flags, which in turn depend on who is asking for the memory. For example, if the server is low on memory and a user process requests 1M, the kernel can sleep and try to free memory to satisfy the request by moving the least recently used pages out to swap. In your case, however, the kernel is trying to allocate two pages while creating a process in do_fork. For the kernel that is a critical path, and it cannot sleep there.
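The "two pages" figure comes from the `order=1` field on the first line of the report: an order-n request asks for 2^n contiguous pages. A minimal sketch for pulling that out of the header line, assuming 4 kB pages (`parse_oom_header` is a hypothetical helper, not a kernel or library API):

```python
import re

# First line of the OOM report, copied verbatim from the question.
HEADER = (
    "Jan 15 08:52:24 xyz-server kernel: shibd invoked oom-killer: "
    "gfp_mask=0xd0, order=1, oom_adj=0, oom_score_adj=0"
)

PAGE_SIZE_KB = 4  # assumption: 4 kB pages


def parse_oom_header(line):
    """Extract the requesting command, gfp_mask and allocation order."""
    match = re.search(
        r"(\S+) invoked oom-killer: gfp_mask=(0x[0-9a-fA-F]+), order=(\d+)",
        line,
    )
    if match is None:
        raise ValueError("not an oom-killer header line")
    order = int(match.group(3))
    pages = 2 ** order  # an order-n allocation is 2**n contiguous pages
    return {
        "comm": match.group(1),
        "gfp_mask": int(match.group(2), 16),
        "order": order,
        "pages": pages,
        "size_kb": pages * PAGE_SIZE_KB,
    }


info = parse_oom_header(HEADER)
print(f"{info['comm']} requested {info['pages']} contiguous pages "
      f"({info['size_kb']} kB), gfp_mask={info['gfp_mask']:#x}")
```

Note that the process that triggered the OOM killer (shibd, while forking) is not the one that was killed (java); the header only tells you which allocation failed.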

    
by 19.01.2015 / 16:32