Who is killing my Docker container?


Under Mesos → Completed Tasks → Sandbox, in the stdout file, I can see the killTask signal:

Received killTask for task sources.b4e2c8e6-5b42-11e7-aec0-024227901b13

The full snapshot of the stdout file is below. You can see that even after receiving the killTask signal, my process is still running, i.e. it does not shut down.

 
2017-06-27 14:16:08,332 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:16:18,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:16:28,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:16:38,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:16:48,337 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:16:58,332 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:17:08,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:17:18,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:17:28,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:17:38,334 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:17:48,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 1, bytes sent 188 so far
2017-06-27 14:17:58,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:18:08,334 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:18:18,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:18:28,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:18:38,332 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:18:48,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:18:58,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:19:08,332 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:19:18,332 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:19:28,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:19:38,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:19:48,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:19:58,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:20:08,332 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:20:18,334 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:20:28,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:20:38,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
2017-06-27 14:20:48,332 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
Received killTask for task sources.b4e2c8e6-5b42-11e7-aec0-024227901b13
2017-06-27 14:20:58,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far
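To make the "still running after killTask" observation easy to check on a longer stdout file, here is a minimal sketch that returns every line logged after the first killTask marker. The function name `lines_after_kill` is hypothetical, and the sample lines are copied from the stdout above.

```python
def lines_after_kill(log_lines, marker="Received killTask"):
    """Return the log lines that follow the first line containing `marker`."""
    for index, line in enumerate(log_lines):
        if marker in line:
            return log_lines[index + 1:]
    return []  # marker never seen: nothing was logged "after" the kill


# Last three lines of the stdout snapshot quoted above.
sample = [
    "2017-06-27 14:20:48,332 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far",
    "Received killTask for task sources.b4e2c8e6-5b42-11e7-aec0-024227901b13",
    "2017-06-27 14:20:58,333 INFO  [Timer-0] com.informatica.vds.transport.ws.WSClient - appmonitor messages sent 2, bytes sent 376 so far",
]

survivors = lines_after_kill(sample)
for line in survivors:
    print(line)
```

A non-empty result only shows that the process wrote at least one more line before it was actually stopped; it does not say who sent the kill request.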

The full snapshot of the stderr file is as follows:

I0627 19:42:51.959991  7613 fetcher.cpp:533] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/632f9d21-ae71-4cca-95e4-63e2b3dbd78e-S0","items":[{"action":"BYPASS_CACHE","uri":{"cache":false,"executable":false,"extract":true,"value":"file:\/\/\/etc\/docker.tar.gz"}}],"sandbox_directory":"\/var\/lib\/mesos\/slaves\/632f9d21-ae71-4cca-95e4-63e2b3dbd78e-S0\/frameworks\/0e528b66-37aa-4d7a-933e-4638aabf494a-0000\/executors\/sources.b4e2c8e6-5b42-11e7-aec0-024227901b13\/runs\/219c102b-28ae-41d5-b98f-11829315119e"}
I0627 19:42:51.963241  7613 fetcher.cpp:444] Fetching URI 'file:///etc/docker.tar.gz'
I0627 19:42:51.963279  7613 fetcher.cpp:285] Fetching directly into the sandbox directory
I0627 19:42:51.963295  7613 fetcher.cpp:222] Fetching URI 'file:///etc/docker.tar.gz'
I0627 19:42:51.964923  7613 fetcher.cpp:207] Copied resource '/etc/docker.tar.gz' to '/var/lib/mesos/slaves/632f9d21-ae71-4cca-95e4-63e2b3dbd78e-S0/frameworks/0e528b66-37aa-4d7a-933e-4638aabf494a-0000/executors/sources.b4e2c8e6-5b42-11e7-aec0-024227901b13/runs/219c102b-28ae-41d5-b98f-11829315119e/docker.tar.gz'
I0627 19:42:52.070482  7613 fetcher.cpp:123] Extracted '/var/lib/mesos/slaves/632f9d21-ae71-4cca-95e4-63e2b3dbd78e-S0/frameworks/0e528b66-37aa-4d7a-933e-4638aabf494a-0000/executors/sources.b4e2c8e6-5b42-11e7-aec0-024227901b13/runs/219c102b-28ae-41d5-b98f-11829315119e/docker.tar.gz' into '/var/lib/mesos/slaves/632f9d21-ae71-4cca-95e4-63e2b3dbd78e-S0/frameworks/0e528b66-37aa-4d7a-933e-4638aabf494a-0000/executors/sources.b4e2c8e6-5b42-11e7-aec0-024227901b13/runs/219c102b-28ae-41d5-b98f-11829315119e'
I0627 19:42:52.070533  7613 fetcher.cpp:582] Fetched 'file:///etc/docker.tar.gz' to '/var/lib/mesos/slaves/632f9d21-ae71-4cca-95e4-63e2b3dbd78e-S0/frameworks/0e528b66-37aa-4d7a-933e-4638aabf494a-0000/executors/sources.b4e2c8e6-5b42-11e7-aec0-024227901b13/runs/219c102b-28ae-41d5-b98f-11829315119e/docker.tar.gz'
I0627 19:42:56.096325  7643 exec.cpp:162] Version: 1.3.0
I0627 19:42:56.101958  7647 exec.cpp:237] Executor registered on agent 632f9d21-ae71-4cca-95e4-63e2b3dbd78e-S0
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   221  100   138  100    83   8657   5207 --:--:-- --:--:-- --:--:--  9200
E0627 19:51:03.219312  7652 process.cpp:951] Failed to accept socket: future discarded

The messages "Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap." and "Failed to accept socket: future discarded" seem to be the culprits behind my container being killed.

My question is: who keeps killing my container after 5 to 10 minutes, again and again?

I also updated the /etc/default/grub file with

GRUB_CMDLINE_LINUX_DEFAULT="cgroup_enable=memory swapaccount=1"

and restarted my system, but with no progress.
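On Ubuntu, changes to /etc/default/grub generally take effect only after the GRUB configuration is regenerated (for example with `sudo update-grub`) and the machine is rebooted. One way to confirm whether the flags actually reached the running kernel is to look at /proc/cmdline. The sketch below does that; the helper name `missing_kernel_flags` is hypothetical.

```python
def missing_kernel_flags(cmdline, required=("cgroup_enable=memory", "swapaccount=1")):
    """Return the required flags that are absent from a kernel command line."""
    present = set(cmdline.split())
    return [flag for flag in required if flag not in present]


def read_running_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel; empty string if unavailable."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""  # e.g. not running on Linux


# An empty list means both flags are active on the running kernel.
print(missing_kernel_flags(read_running_cmdline()))
```

If the list is not empty after a reboot, the edit to /etc/default/grub most likely never made it into the generated boot configuration.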

Any ideas on this issue?

My Ubuntu VMware configuration is:

  1. Cores assigned: 3
  2. Memory: 6 GB
  3. HDD: 32 GB
  4. I am running only one container, which still exits after a few minutes.

[EDIT: Adding the contents of the stderr file from the Mesos UI at: /var/lib/mesos/slaves/29df799b-4797-41df-a005-465f211d286b-S0/frameworks/0e528b66-37aa-4d7a-933e-4638aabf494a-0000/executors/sources.a634642c-5bbc-11e7-ba8b-024239f32c24/runs/1bda209c-c2b8-4bb5-a41b-26361e00a284 ]

Adding the contents of the stderr file from another job.

I0628 10:15:45.951104  4357 fetcher.cpp:533] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/29df799b-4797-41df-a005-465f211d286b-S0","items":[{"action":"BYPASS_CACHE","uri":{"cache":false,"executable":false,"extract":true,"value":"file:\/\/\/etc\/docker.tar.gz"}}],"sandbox_directory":"\/var\/lib\/mesos\/slaves\/29df799b-4797-41df-a005-465f211d286b-S0\/frameworks\/0e528b66-37aa-4d7a-933e-4638aabf494a-0000\/executors\/sources.a634642c-5bbc-11e7-ba8b-024239f32c24\/runs\/1bda209c-c2b8-4bb5-a41b-26361e00a284"}
I0628 10:15:45.953835  4357 fetcher.cpp:444] Fetching URI 'file:///etc/docker.tar.gz'
I0628 10:15:45.953881  4357 fetcher.cpp:285] Fetching directly into the sandbox directory
I0628 10:15:45.953974  4357 fetcher.cpp:222] Fetching URI 'file:///etc/docker.tar.gz'
I0628 10:15:45.956663  4357 fetcher.cpp:207] Copied resource '/etc/docker.tar.gz' to '/var/lib/mesos/slaves/29df799b-4797-41df-a005-465f211d286b-S0/frameworks/0e528b66-37aa-4d7a-933e-4638aabf494a-0000/executors/sources.a634642c-5bbc-11e7-ba8b-024239f32c24/runs/1bda209c-c2b8-4bb5-a41b-26361e00a284/docker.tar.gz'
I0628 10:15:46.061069  4357 fetcher.cpp:123] Extracted '/var/lib/mesos/slaves/29df799b-4797-41df-a005-465f211d286b-S0/frameworks/0e528b66-37aa-4d7a-933e-4638aabf494a-0000/executors/sources.a634642c-5bbc-11e7-ba8b-024239f32c24/runs/1bda209c-c2b8-4bb5-a41b-26361e00a284/docker.tar.gz' into '/var/lib/mesos/slaves/29df799b-4797-41df-a005-465f211d286b-S0/frameworks/0e528b66-37aa-4d7a-933e-4638aabf494a-0000/executors/sources.a634642c-5bbc-11e7-ba8b-024239f32c24/runs/1bda209c-c2b8-4bb5-a41b-26361e00a284'
I0628 10:15:46.061148  4357 fetcher.cpp:582] Fetched 'file:///etc/docker.tar.gz' to '/var/lib/mesos/slaves/29df799b-4797-41df-a005-465f211d286b-S0/frameworks/0e528b66-37aa-4d7a-933e-4638aabf494a-0000/executors/sources.a634642c-5bbc-11e7-ba8b-024239f32c24/runs/1bda209c-c2b8-4bb5-a41b-26361e00a284/docker.tar.gz'
I0628 10:15:49.898803  4389 exec.cpp:162] Version: 1.3.0
I0628 10:15:49.903390  4390 exec.cpp:237] Executor registered on agent 29df799b-4797-41df-a005-465f211d286b-S0
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   221  100   138  100    83   5385   3239 --:--:-- --:--:-- --:--:-- 11500
W0628 10:15:49.903390  4389 logging.cpp:91] RAW: Received signal SIGTERM from process 3287 of user 0; exiting

No new entries were written to the /var/lib/mesos-master.ERROR file today. Contents of the /var/log/mesos-master.WARNING file:

Log file created at: 2017/06/28 10:04:56
Running on machine: ubuntu
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
W0628 10:04:56.387049  3193 authenticator.cpp:512] No credentials provided, authentication requests will be refused
W0628 10:14:56.617103  3221 master.cpp:2011] Agent 632f9d21-ae71-4cca-95e4-63e2b3dbd78e-S0 (ubuntu) did not re-register within 10mins after master failover; marking it unreachable

The contents of the /var/log/mesos-slave.WARNING file are the same as those of mesos-slave.ERROR. Contents of the /var/log/mesos-slave.ERROR file:

Log file created at: 2017/06/28 10:05:00
Running on machine: ubuntu
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0628 10:05:00.712286  3287 shell.hpp:107] Command 'hadoop version 2>&1' failed; this is the output:
sh: 1: hadoop: not found
E0628 10:24:45.502921  3326 slave.cpp:4496] Failed to update resources for container 1bda209c-c2b8-4bb5-a41b-26361e00a284 of executor 'sources.a634642c-5bbc-11e7-ba8b-024239f32c24' running task sources.a634642c-5bbc-11e7-ba8b-024239f32c24 on status update for terminal task, destroying container: Failed to determine cgroup for the 'cpu' subsystem: Failed to read /proc/4469/cgroup: Failed to open file: No such file or directory
E0628 10:33:45.789072  3327 slave.cpp:4496] Failed to update resources for container 858170ce-0775-48be-8c85-3a1dbf320569 of executor 'sources.e7e069ed-5bbd-11e7-ba8b-024239f32c24' running task sources.e7e069ed-5bbd-11e7-ba8b-024239f32c24 on status update for terminal task, destroying container: Failed to determine cgroup for the 'cpu' subsystem: Failed to read /proc/5215/cgroup: Failed to open file: No such file or directory

I observed that the message:

Failed to read /proc/5215/cgroup: Failed to open file: No such file or directory

appears only when the container/task is killed, whereas these files do exist for containers that are currently running. Thanks.
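To line up events across the executor stderr and the agent log, it can help to parse the glog prefix described in the log headers ("[IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg") and compute how long a task lived. The sketch below does this for the "Executor registered" line and the matching "Failed to update resources" line of container 1bda209c-c2b8-4bb5-a41b-26361e00a284. The year is an assumption (glog lines carry none), the second message is shortened here, and the comparison only makes sense if both files were written on the same clock.

```python
import re
from datetime import datetime

# Prefix layout taken from the log headers:
# [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
GLOG_LINE = re.compile(
    r"^(?P<level>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<thread>\d+) "
    r"(?P<source>[^\]]+)\] (?P<msg>.*)$"
)


def parse_glog(line, year=2017):
    """Return (level, timestamp, source, message), or None for non-glog lines.

    `year` is an assumption supplied by the caller, since glog omits it.
    """
    match = GLOG_LINE.match(line)
    if match is None:
        return None
    stamp = datetime.strptime(
        f"{year}{match['month']}{match['day']} {match['time']}",
        "%Y%m%d %H:%M:%S.%f",
    )
    return match["level"], stamp, match["source"], match["msg"]


registered = parse_glog(
    "I0628 10:15:49.903390  4390 exec.cpp:237] Executor registered on agent "
    "29df799b-4797-41df-a005-465f211d286b-S0"
)
destroyed = parse_glog(
    "E0628 10:24:45.502921  3326 slave.cpp:4496] Failed to update resources "
    "for container 1bda209c-c2b8-4bb5-a41b-26361e00a284"
)

elapsed = destroyed[1] - registered[1]
print(f"task lived for about {elapsed.total_seconds():.0f} seconds")
```

For this pair the gap is a little under nine minutes, which matches the "5 to 10 minutes" pattern described in the question and points to something periodic rather than a crash inside the container.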

    
by ramanKC, 27.06.2017 / 16:43

1 answer


It seems that Marathon relies on the user to implement the health check. That is, if we declare a health check in the app configuration, we have to implement the endpoint it probes ourselves. I removed all the health checks I had declared in the app configuration. After that, Marathon shows the app's health as unknown, but now Marathon (specifically mesos-slave) no longer kills the task.
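An alternative to removing the health check is to have the application answer it. The following is a minimal sketch, not the original application's code: it assumes an HTTP health check and uses a hypothetical path `/health`; the path and port would have to match whatever the Marathon app definition declares.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answers GET /health with 200 so an HTTP health check can pass."""

    def do_GET(self):
        if self.path == "/health":  # hypothetical path, must match the app config
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the task's stderr free of per-request noise


def make_server(host="0.0.0.0", port=8080):
    """Build the server; the caller decides when to call serve_forever()."""
    return HTTPServer((host, port), HealthHandler)
```

Inside the container this would typically be started on a background thread with `make_server().serve_forever()`, on a port that is exposed to the health checker. A real check should report the state of the actual workload rather than always returning 200.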

    
answered 17.07.2017 / 07:46