My task inside a Docker container is being killed by the OOM killer. Here is the log from /var/log/messages:
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.346602] uwsgi invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.351446] uwsgi cpuset=4ad797e0720ad05c90cb8f5afaa9902172c4aac9319d464e669091615b52d134 mems_allowed=0
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.356702] CPU: 0 PID: 3969 Comm: uwsgi Tainted: G E 4.1.13-19.31.amzn1.x86_64 #1
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.361608] Hardware name: Xen HVM domU, BIOS 4.2.amazon 12/07/2015
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.364705] ffff8800b841d000 ffff88003737bc88 ffffffff814dabc0 0000000000002ecc
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.368574] ffff880037821980 ffff88003737bd38 ffffffff814d8377 0000000000000046
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.372644] ffff880037a10000 ffff88003737bd08 ffffffff810949f1 00000000000000fb
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.377628] Call Trace:
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.379220] [<ffffffff814dabc0>] dump_stack+0x45/0x57
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.382510] [<ffffffff814d8377>] dump_header+0x7f/0x1fe
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.385580] [<ffffffff810949f1>] ? try_to_wake_up+0x1f1/0x340
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.388683] [<ffffffff8115cd37>] ? find_lock_task_mm+0x47/0xa0
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.391730] [<ffffffff8115d2ec>] oom_kill_process+0x1cc/0x3b0
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.394933] [<ffffffff81071f0e>] ? has_capability_noaudit+0x1e/0x30
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.398486] [<ffffffff811c2814>] mem_cgroup_oom_synchronize+0x574/0x5c0
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.402132] [<ffffffff811be940>] ? mem_cgroup_css_online+0x260/0x260
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.406394] [<ffffffff8115dc44>] pagefault_out_of_memory+0x24/0xe0
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.410541] [<ffffffff814d6c37>] mm_fault_error+0x5e/0x106
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.414017] [<ffffffff8105dd5c>] __do_page_fault+0x3ec/0x420
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.416967] [<ffffffff811e6577>] ? __fget_light+0x57/0x70
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.419762] [<ffffffff8105ddb2>] do_page_fault+0x22/0x30
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.422658] [<ffffffff814e3618>] page_fault+0x28/0x30
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.425484] Task in /docker/4ad797e0720ad05c90cb8f5afaa9902172c4aac9319d464e669091615b52d134 killed as a result of limit of /docker/4ad797e0720ad05c90cb8f5afaa9902172c4aac9319d464e669091615b52d134
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.435927] memory: usage 2560000kB, limit 2560000kB, failcnt 25331
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.439759] memory+swap: usage 2560000kB, limit 5120000kB, failcnt 0
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.443422] kmem: usage 0kB, limit 9007199254740988kB, failcnt 0
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.446548] Memory cgroup stats for /docker/4ad797e0720ad05c90cb8f5afaa9902172c4aac9319d464e669091615b52d134: cache:3036KB rss:2556964KB rss_huge:0KB mapped_file:2304KB writeback:0KB swap:0KB inactive_anon:2128KB active_anon:2556964KB inactive_file:472KB active_file:436KB unevictable:0KB
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.460987] [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.466314] [ 3509] 1000 3509 1112 20 8 3 0 0 sh
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.471474] [ 3531] 1000 3531 4496 60 14 3 0 0 entrypoint.sh
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.476400] [ 3915] 1000 3915 15167 2629 34 3 0 0 supervisord
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.480969] [ 3968] 1000 3968 19371 287 40 3 0 0 nginx
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.485910] [ 3969] 1000 3969 64026 21101 125 4 0 0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.491238] [ 3970] 1000 3970 19855 778 38 3 0 0 nginx
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.495781] [ 3971] 1000 3971 19855 778 38 3 0 0 nginx
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.501098] [ 3972] 1000 3972 19888 784 38 3 0 0 nginx
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.505670] [ 3973] 1000 3973 19855 780 38 3 0 0 nginx
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.510069] [ 3974] 1000 3974 19855 780 38 3 0 0 nginx
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.514864] [ 4126] 1000 4126 66981 22660 129 4 0 0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.520305] [ 4127] 1000 4127 66597 22268 128 4 0 0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.525661] [ 4128] 1000 4128 66725 22435 128 4 0 0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.530141] [ 4129] 1000 4129 66533 22221 128 4 0 0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.534828] [ 4130] 1000 4130 66533 22222 128 4 0 0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.539419] [ 4131] 1000 4131 66469 22183 128 4 0 0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.544847] [ 4132] 1000 4132 66661 22363 128 4 0 0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.550356] [ 4133] 1000 4133 68150 23216 131 4 0 0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.554855] [ 4134] 1000 4134 67812 22904 131 4 0 0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.559747] [ 4135] 1000 4135 373902 327874 731 4 0 0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.564379] [ 4136] 1000 4136 70710 26423 138 4 0 0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.569562] [ 4137] 1000 4137 66213 21886 127 4 0 0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.575567] [ 4138] 1000 4138 80402 35499 160 4 0 0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.580316] [ 4139] 1000 4139 77374 31183 150 4 0 0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.584705] [ 4140] 1000 4140 96267 50678 197 4 0 0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.589036] [ 4141] 1000 4141 119173 55555 211 4 0 0 uwsgi
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.593748] [ 4930] 1000 4930 4541 80 15 3 0 0 bash
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.599021] [ 4945] 1000 4945 85517 22532 135 4 0 0 python
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.605828] Memory cgroup out of memory: Kill process 4135 (uwsgi) score 513 or sacrifice child
Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.609448] Killed process 4135 (uwsgi) total-vm:1495608kB, anon-rss:1311424kB, file-rss:72kB
But why does it report a total memory usage of 2.6 GB when the listed processes, summed up, come to less than 1 GB? And the process that was actually killed shows ~360 MB, yet when I monitor it with htop during the OOM it goes well above 1 GB. So the question is: why is the kernel reporting an incorrect memory value?
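
One thing worth checking when reading the table: as far as I can tell, the total_vm and rss columns in the per-task dump are counted in pages, not kB. Below is a minimal Python sketch that converts one row of the log above, assuming 4 KiB pages (the usual x86_64 default; this is an assumption, not something the log states). The names `PAGE_SIZE_KB`, `TASK_ROW` and `parse_task_row` are made up for illustration.

```python
import re

PAGE_SIZE_KB = 4  # assumption: 4 KiB pages, the common x86_64 default

# Per-task rows of the OOM dump follow the header:
# [ pid ] uid tgid total_vm rss nr_ptes nr_pmds swapents oom_score_adj name
TASK_ROW = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+"
    r"(?P<total_vm>\d+)\s+(?P<rss>\d+)\s+"
    r"\d+\s+\d+\s+\d+\s+-?\d+\s+(?P<name>\S+)\s*$"
)

def parse_task_row(line):
    """Return pid, name and memory in kB for one task row, or None if the line is not a task row."""
    m = TASK_ROW.search(line)
    if m is None:
        return None
    return {
        "pid": int(m["pid"]),
        "name": m["name"],
        # Both columns are treated as page counts and scaled to kB.
        "total_vm_kb": int(m["total_vm"]) * PAGE_SIZE_KB,
        "rss_kb": int(m["rss"]) * PAGE_SIZE_KB,
    }

# Row for the killed process, copied from the log above.
sample = (
    "Feb 17 19:01:24 ip-10-0-1-85 kernel: [16211.559747] "
    "[ 4135] 1000 4135 373902 327874 731 4 0 0 uwsgi"
)
row = parse_task_row(sample)
print(row)
```

With that assumption, PID 4135 comes out at 1495608 kB total_vm and 1311496 kB rss, which lines up with the "Killed process" line (total-vm:1495608kB, and anon-rss 1311424kB plus file-rss 72kB). Per-process rss can also count pages shared between workers more than once, so a sum over the table is not guaranteed to match the cgroup usage line.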