Mesos → Completed Tasks → Sandbox
In the stdout file, I can see the killTask signal:
Received killTask for task sources.b4e2c8e6-5b42-11e7-aec0-024227901b13
The complete snapshot of the stdout file follows. Note that even after the killTask signal is received, my process is still running, i.e. it does not terminate.
2017-06-27 14:16:08,332 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:16:18,333 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:16:28,333 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:16:38,333 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:16:48,337 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:16:58,332 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:17:08,333 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:17:18,333 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:17:28,333 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:17:38,334 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:17:48,333 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:17:58,333 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:18:08,334 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:18:18,333 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:18:28,333 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:18:38,332 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:18:48,333 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:18:58,333 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:19:08,332 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:19:18,332 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:19:28,333 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:19:38,333 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:19:48,333 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:19:58,333 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:20:08,332 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:20:18,334 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:20:28,333 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:20:38,333 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:20:48,332 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
Received killTask for task sources.b4e2c8e6-5b42-11e7-aec0-024227901b13
2017-06-27 14:20:58,333 INFO [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
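To make the "still running after killTask" observation checkable, here is a minimal sketch that lists whatever the task wrote to stdout after the last killTask line. The function name `lines_after_kill` and the default marker string are illustrative choices, not part of Mesos.

```python
from typing import List


def lines_after_kill(stdout_text: str, marker: str = "Received killTask") -> List[str]:
    """Return the non-empty lines that follow the last line containing `marker`."""
    lines = stdout_text.splitlines()
    last_marker_index = None
    for index, line in enumerate(lines):
        if marker in line:
            last_marker_index = index
    if last_marker_index is None:
        # No kill request was logged at all.
        return []
    return [line for line in lines[last_marker_index + 1:] if line.strip()]
```

A non-empty result only shows that the application logged something after the kill request arrived; it does not show whether the process ignored the signal or was simply still inside its shutdown grace period.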
The complete snapshot of the stderr file is as follows:
I0627 19:42:51.959991 7613 fetcher.cpp:533] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/632f9d21-ae71-4cca-95e4-63e2b3dbd78e-S0","items":[{"action":"BYPASS_CACHE","uri":{"cache":false,"executable":false,"extract":true,"value":"file:\/\/\/etc\/docker.tar.gz"}}],"sandbox_directory":"\/var\/lib\/mesos\/slaves\/632f9d21-ae71-4cca-95e4-63e2b3dbd78e-S0\/frameworks\/0e528b66-37aa-4d7a-933e-4638aabf494a-0000\/executors\/sources.b4e2c8e6-5b42-11e7-aec0-024227901b13\/runs\/219c102b-28ae-41d5-b98f-11829315119e"}
I0627 19:42:51.963241 7613 fetcher.cpp:444] Fetching URI 'file:///etc/docker.tar.gz'
I0627 19:42:51.963279 7613 fetcher.cpp:285] Fetching directly into the sandbox directory
I0627 19:42:51.963295 7613 fetcher.cpp:222] Fetching URI 'file:///etc/docker.tar.gz'
I0627 19:42:51.964923 7613 fetcher.cpp:207] Copied resource '/etc/docker.tar.gz' to '/var/lib/mesos/slaves/632f9d21-ae71-4cca-95e4-63e2b3dbd78e-S0/frameworks/0e528b66-37aa-4d7a-933e-4638aabf494a-0000/executors/sources.b4e2c8e6-5b42-11e7-aec0-024227901b13/runs/219c102b-28ae-41d5-b98f-11829315119e/docker.tar.gz'
I0627 19:42:52.070482 7613 fetcher.cpp:123] Extracted '/var/lib/mesos/slaves/632f9d21-ae71-4cca-95e4-63e2b3dbd78e-S0/frameworks/0e528b66-37aa-4d7a-933e-4638aabf494a-0000/executors/sources.b4e2c8e6-5b42-11e7-aec0-024227901b13/runs/219c102b-28ae-41d5-b98f-11829315119e/docker.tar.gz' into '/var/lib/mesos/slaves/632f9d21-ae71-4cca-95e4-63e2b3dbd78e-S0/frameworks/0e528b66-37aa-4d7a-933e-4638aabf494a-0000/executors/sources.b4e2c8e6-5b42-11e7-aec0-024227901b13/runs/219c102b-28ae-41d5-b98f-11829315119e'
I0627 19:42:52.070533 7613 fetcher.cpp:582] Fetched 'file:///etc/docker.tar.gz' to '/var/lib/mesos/slaves/632f9d21-ae71-4cca-95e4-63e2b3dbd78e-S0/frameworks/0e528b66-37aa-4d7a-933e-4638aabf494a-0000/executors/sources.b4e2c8e6-5b42-11e7-aec0-024227901b13/runs/219c102b-28ae-41d5-b98f-11829315119e/docker.tar.gz'
I0627 19:42:56.096325 7643 exec.cpp:162] Version: 1.3.0
I0627 19:42:56.101958 7647 exec.cpp:237] Executor registered on agent 632f9d21-ae71-4cca-95e4-63e2b3dbd78e-S0
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 221 100 138 100 83 8657 5207 --:--:-- --:--:-- --:--:-- 9200
E0627 19:51:03.219312 7652 process.cpp:951] Failed to accept socket: future discarded
The messages "WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap." and "Failed to accept socket: future discarded" look like the culprits behind my container being killed.
My question is: who keeps killing my container after 5 to 10 minutes, again and again?
I also updated the /etc/default/grub file with
GRUB_CMDLINE_LINUX_DEFAULT="cgroup_enable=memory swapaccount=1"
and rebooted my system, but nothing changed.
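One way to confirm that the GRUB change actually reached the running kernel is to inspect /proc/cmdline after the reboot. On Ubuntu, edits to /etc/default/grub take effect only once `update-grub` has regenerated the boot configuration; the post does not say whether that step was run. The sketch below is an assumption-laden helper (the names `missing_kernel_flags` and `read_cmdline` are made up for illustration) that reports which of the two flags are absent.

```python
from typing import Iterable, List

# The two flags taken from the GRUB_CMDLINE_LINUX_DEFAULT line in the post.
REQUIRED_FLAGS = ("cgroup_enable=memory", "swapaccount=1")


def missing_kernel_flags(cmdline: str, required: Iterable[str] = REQUIRED_FLAGS) -> List[str]:
    """Return the required flags that do not appear in a kernel command line."""
    present = set(cmdline.split())
    return [flag for flag in required if flag not in present]


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

Calling `missing_kernel_flags(read_cmdline())` on the agent host returns an empty list when both flags are active. Even then, the swap-limit message is only a warning, so it may well be unrelated to the task being killed.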
Any ideas on this issue?
My Ubuntu VMware configuration is:
- Cores assigned: 3
- Memory: 6 GB
- HDD: 32 GB
- I am running only one container, and it still exits after a few minutes.
[EDIT: Adding the contents of the stderr file from the Mesos UI at: /var/lib/mesos/slaves/29df799b-4797-41df-a005-465f211d286b-S0/frameworks/0e528b66-37aa-4d7a-933e-4638aabf494a-0000/executors/sources.a634642c-5bbc-11e7-ba8b-024239f32c24/runs/1bda209c-c2b8-4bb5-a41b-26361e00a284]
Adding the contents of the stderr file from another job.
I0628 10:15:45.951104 4357 fetcher.cpp:533] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/29df799b-4797-41df-a005-465f211d286b-S0","items":[{"action":"BYPASS_CACHE","uri":{"cache":false,"executable":false,"extract":true,"value":"file:\/\/\/etc\/docker.tar.gz"}}],"sandbox_directory":"\/var\/lib\/mesos\/slaves\/29df799b-4797-41df-a005-465f211d286b-S0\/frameworks\/0e528b66-37aa-4d7a-933e-4638aabf494a-0000\/executors\/sources.a634642c-5bbc-11e7-ba8b-024239f32c24\/runs\/1bda209c-c2b8-4bb5-a41b-26361e00a284"}
I0628 10:15:45.953835 4357 fetcher.cpp:444] Fetching URI 'file:///etc/docker.tar.gz'
I0628 10:15:45.953881 4357 fetcher.cpp:285] Fetching directly into the sandbox directory
I0628 10:15:45.953974 4357 fetcher.cpp:222] Fetching URI 'file:///etc/docker.tar.gz'
I0628 10:15:45.956663 4357 fetcher.cpp:207] Copied resource '/etc/docker.tar.gz' to '/var/lib/mesos/slaves/29df799b-4797-41df-a005-465f211d286b-S0/frameworks/0e528b66-37aa-4d7a-933e-4638aabf494a-0000/executors/sources.a634642c-5bbc-11e7-ba8b-024239f32c24/runs/1bda209c-c2b8-4bb5-a41b-26361e00a284/docker.tar.gz'
I0628 10:15:46.061069 4357 fetcher.cpp:123] Extracted '/var/lib/mesos/slaves/29df799b-4797-41df-a005-465f211d286b-S0/frameworks/0e528b66-37aa-4d7a-933e-4638aabf494a-0000/executors/sources.a634642c-5bbc-11e7-ba8b-024239f32c24/runs/1bda209c-c2b8-4bb5-a41b-26361e00a284/docker.tar.gz' into '/var/lib/mesos/slaves/29df799b-4797-41df-a005-465f211d286b-S0/frameworks/0e528b66-37aa-4d7a-933e-4638aabf494a-0000/executors/sources.a634642c-5bbc-11e7-ba8b-024239f32c24/runs/1bda209c-c2b8-4bb5-a41b-26361e00a284'
I0628 10:15:46.061148 4357 fetcher.cpp:582] Fetched 'file:///etc/docker.tar.gz' to '/var/lib/mesos/slaves/29df799b-4797-41df-a005-465f211d286b-S0/frameworks/0e528b66-37aa-4d7a-933e-4638aabf494a-0000/executors/sources.a634642c-5bbc-11e7-ba8b-024239f32c24/runs/1bda209c-c2b8-4bb5-a41b-26361e00a284/docker.tar.gz'
I0628 10:15:49.898803 4389 exec.cpp:162] Version: 1.3.0
I0628 10:15:49.903390 4390 exec.cpp:237] Executor registered on agent 29df799b-4797-41df-a005-465f211d286b-S0
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 221 100 138 100 83 5385 3239 --:--:-- --:--:-- --:--:-- 11500
W0628 10:15:49.903390 4389 logging.cpp:91] RAW: Received signal SIGTERM from process 3287 of user 0; exiting
No new entries were created in the /var/lib/mesos-master.ERROR file today.
Contents of the /var/log/mesos-master.WARNING file:
Log file created at: 2017/06/28 10:04:56
Running on machine: ubuntu
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
W0628 10:04:56.387049 3193 authenticator.cpp:512] No credentials provided, authentication requests will be refused
W0628 10:14:56.617103 3221 master.cpp:2011] Agent 632f9d21-ae71-4cca-95e4-63e2b3dbd78e-S0 (ubuntu) did not re-register within 10mins after master failover; marking it unreachable
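The two master warnings are almost exactly ten minutes apart, which matches the "10mins" re-registration window named in the second message. The sketch below parses lines in the format given by the log header (`[IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg`) so the gap can be computed. The function name `parse_glog` is illustrative, and the year is an assumption taken from the "Log file created at" header, because the log lines themselves carry no year.

```python
import re
from datetime import datetime

# Mirrors the header: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
GLOG_RE = re.compile(
    r"^(?P<level>[IWEF])(?P<mmdd>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) (?P<file>[^:\s]+):(?P<line>\d+)\] (?P<msg>.*)$"
)


def parse_glog(line, year=2017):
    """Parse one log line into a dict, or return None if it does not match."""
    match = GLOG_RE.match(line)
    if match is None:
        return None
    stamp = "{}{} {}".format(year, match.group("mmdd"), match.group("time"))
    return {
        "level": match.group("level"),
        "time": datetime.strptime(stamp, "%Y%m%d %H:%M:%S.%f"),
        "thread": int(match.group("thread")),
        "source": "{}:{}".format(match.group("file"), match.group("line")),
        "msg": match.group("msg"),
    }
```

Subtracting the `time` values of the two warnings gives a difference of roughly 600 seconds. Note also that the agent named in the second warning is the one from the first stderr file, not the one on which the second job registered, so that warning may refer to a stale agent ID rather than to the currently running agent.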
The contents of the /var/log/mesos-slave.WARNING file are the same as those of mesos-slave.ERROR.
Contents of the /var/log/mesos-slave.ERROR file:
Log file created at: 2017/06/28 10:05:00
Running on machine: ubuntu
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0628 10:05:00.712286 3287 shell.hpp:107] Command 'hadoop version 2>&1' failed; this is the output:
sh: 1: hadoop: not found
E0628 10:24:45.502921 3326 slave.cpp:4496] Failed to update resources for container 1bda209c-c2b8-4bb5-a41b-26361e00a284 of executor 'sources.a634642c-5bbc-11e7-ba8b-024239f32c24' running task sources.a634642c-5bbc-11e7-ba8b-024239f32c24 on status update for terminal task, destroying container: Failed to determine cgroup for the 'cpu' subsystem: Failed to read /proc/4469/cgroup: Failed to open file: No such file or directory
E0628 10:33:45.789072 3327 slave.cpp:4496] Failed to update resources for container 858170ce-0775-48be-8c85-3a1dbf320569 of executor 'sources.e7e069ed-5bbd-11e7-ba8b-024239f32c24' running task sources.e7e069ed-5bbd-11e7-ba8b-024239f32c24 on status update for terminal task, destroying container: Failed to determine cgroup for the 'cpu' subsystem: Failed to read /proc/5215/cgroup: Failed to open file: No such file or directory
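The second stderr file says the SIGTERM came from process 3287, and 3287 also appears in the thread id column of the first line of mesos-slave.ERROR. On Linux the main thread's id equals the process id, so, assuming that first line was logged by the agent's main thread, the signal plausibly came from the mesos-slave process itself. The sketch below shows how that cross-check could be automated; both function names are hypothetical.

```python
import re
from typing import Optional, Set

SIGTERM_RE = re.compile(r"Received signal SIGTERM from process (\d+) of user (\d+)")
# Thread id column of a log line: [IWEF]mmdd hh:mm:ss.uuuuuu threadid ...
GLOG_THREAD_RE = re.compile(
    r"^[IWEF]\d{4} \d{2}:\d{2}:\d{2}\.\d{6}\s+(\d+) ", re.MULTILINE
)


def sigterm_sender(stderr_text: str) -> Optional[int]:
    """Return the PID named as the SIGTERM sender, or None if no such line exists."""
    match = SIGTERM_RE.search(stderr_text)
    return int(match.group(1)) if match else None


def glog_thread_ids(log_text: str) -> Set[int]:
    """Collect every thread id that appears in the given log text."""
    return {int(thread_id) for thread_id in GLOG_THREAD_RE.findall(log_text)}
```

If `sigterm_sender(executor_stderr)` is a member of `glog_thread_ids(agent_log)`, that is consistent with the agent having sent the signal, but it is not proof: thread ids and process ids share one number space, so running `ps -p 3287` on the host while the agent is up would be a more direct check.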
I noticed that the message:
Failed to read /proc/5215/cgroup: Failed to open file: No such file or directory
appears only when the container/task is killed, whereas these files do exist for containers that are currently running. Thanks.
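That pattern is what one would expect on Linux, where /proc/<pid>/cgroup exists only while the process with that PID is alive. The error is therefore consistent with the agent trying to read the file after the process had already exited, which would make it a consequence of the kill rather than its cause. Below is a minimal sketch for pulling the PIDs out of the agent log and checking them; the function names and the `proc_root` parameter are illustrative assumptions.

```python
import os
import re
from typing import List

PROC_CGROUP_RE = re.compile(r"Failed to read /proc/(\d+)/cgroup")


def pids_with_missing_cgroup(log_text: str) -> List[int]:
    """Extract, in order, the PIDs named in 'Failed to read /proc/<pid>/cgroup' errors."""
    return [int(pid) for pid in PROC_CGROUP_RE.findall(log_text)]


def cgroup_file_exists(pid: int, proc_root: str = "/proc") -> bool:
    """Report whether <proc_root>/<pid>/cgroup currently exists."""
    return os.path.exists(os.path.join(proc_root, str(pid), "cgroup"))
```

Running `cgroup_file_exists` on the PIDs from the log after the task has ended should return False, while the PID of a live container process should return True.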