How do I interpret the following pattern of uptime data?

3

I have a small web server running MySQL and WordPress that, after a while, seems to stop processing web requests. I can't even log in to the server over ssh, because the ssh client times out while trying to establish the connection. The only way to bring the server back is a hard reboot.

I left an ssh session running top for a 10-hour period to watch this server slowly die, and when it reached the point where it appeared stuck, top was still running. I was able to quit top, shut down mysqld and httpd, and then repeatedly type uptime. The 1-minute load average went from 101.73 to 0.01 within 10 minutes of shutting down httpd and mysqld.

I have provided the data I was able to collect below.

My questions:

  • What is the meaning of the data?
  • Is this machine out of CPU or RAM?
  • Would a bigger box solve the problem?
  • What other tools can be used to pinpoint the cause of this problem?

Here is a snapshot of top before I quit it and shut down mysqld and httpd:

top - 11:00:18 up 13:54,  1 user,  load average: 96.13, 94.78, 90.06
Tasks: 173 total,   1 running, 172 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.5%us,  1.1%sy,  0.0%ni,  0.0%id, 98.4%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   1016284k total,  1008232k used,     8052k free,      580k buffers
Swap:  2096440k total,  2095168k used,     1272k free,     9872k cached

 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                          
    7 root      20   0     0    0    0 S  0.2  0.0   0:09.98 events/0                                                                                          
   18 root      20   0     0    0    0 S  0.1  0.0   0:11.66 kblockd/0                                                                                         
 1267 root      20   0  114m  316  232 S  0.1  0.0   0:00.41 crond         

 4779 apache    20   0  270m  11m  548 D  0.1  1.2   0:00.68 httpd                                                                                             
 4878 apache    20   0  261m  17m  896 D  0.1  1.8   0:00.44 httpd                                                                                             
 5046 apache    20   0  272m  19m 1168 D  0.1  1.9   0:00.69 httpd                                                                                             
 5258 apache    20   0  244m 2552 1300 D  0.1  0.3   0:00.01 httpd  
 ...... stuff I have removed to make this list short  
1532 root      20   0  105m    4    4 S  0.0  0.0   0:00.01 mysqld_safe                                                                                       
 1634 mysql     20   0  713m 8656 1612 S  0.0  0.9   1:13.79 mysqld                                                                                            
 1805 root      20   0  244m  976   80 S  0.0  0.1   0:03.43 httpd    

Data from the uptime command:

 11:01:50 up 13:55,  1 user,  load average: 99.15, 95.94, 90.88
 11:05:19 up 13:59,  2 users,  load average: 101.73, 97.93, 92.65
 11:05:45 up 13:59,  2 users,  load average: 67.02, 90.07, 90.18
 11:07:27 up 14:01,  2 users,  load average: 11.61, 63.36, 80.53
 11:07:30 up 14:01,  2 users,  load average: 11.61, 63.36, 80.53
 11:07:35 up 14:01,  2 users,  load average: 10.68, 62.31, 80.10
 11:07:39 up 14:01,  2 users,  load average: 9.83, 61.28, 79.67
 11:07:41 up 14:01,  2 users,  load average: 9.04, 60.26, 79.24
 11:07:43 up 14:01,  2 users,  load average: 9.04, 60.26, 79.24
 11:07:48 up 14:01,  2 users,  load average: 8.31, 59.26, 78.82
 11:07:50 up 14:01,  2 users,  load average: 8.31, 59.26, 78.82
 11:07:52 up 14:01,  2 users,  load average: 7.65, 58.28, 78.39
 11:07:54 up 14:01,  2 users,  load average: 7.65, 58.28, 78.39
 11:07:56 up 14:01,  2 users,  load average: 7.65, 58.28, 78.39
 11:07:57 up 14:02,  2 users,  load average: 7.04, 57.31, 77.97
 11:07:58 up 14:02,  2 users,  load average: 7.04, 57.31, 77.97
 11:08:04 up 14:02,  2 users,  load average: 6.47, 56.36, 77.55
 11:08:05 up 14:02,  2 users,  load average: 6.47, 56.36, 77.55
 11:08:06 up 14:02,  2 users,  load average: 5.95, 55.42, 77.14
 11:08:08 up 14:02,  2 users,  load average: 5.95, 55.42, 77.14
 11:08:09 up 14:02,  2 users,  load average: 5.95, 55.42, 77.14
 11:08:10 up 14:02,  2 users,  load average: 5.95, 55.42, 77.14
 11:08:11 up 14:02,  2 users,  load average: 5.48, 54.50, 76.72
 11:08:12 up 14:02,  2 users,  load average: 5.48, 54.50, 76.72
 11:08:14 up 14:02,  2 users,  load average: 5.48, 54.50, 76.72
 11:08:15 up 14:02,  2 users,  load average: 5.48, 54.50, 76.72
 11:08:16 up 14:02,  2 users,  load average: 5.04, 53.60, 76.31
 11:08:17 up 14:02,  2 users,  load average: 5.04, 53.60, 76.31
 11:08:19 up 14:02,  2 users,  load average: 5.04, 53.60, 76.31
 11:08:20 up 14:02,  2 users,  load average: 5.04, 53.60, 76.31
 11:08:22 up 14:02,  2 users,  load average: 4.63, 52.70, 75.90
 11:08:23 up 14:02,  2 users,  load average: 4.63, 52.70, 75.90
 11:08:25 up 14:02,  2 users,  load average: 4.63, 52.70, 75.90
 11:08:26 up 14:02,  2 users,  load average: 4.26, 51.83, 75.49
 11:08:27 up 14:02,  2 users,  load average: 4.26, 51.83, 75.49
 11:08:28 up 14:02,  2 users,  load average: 4.26, 51.83, 75.49
 11:08:29 up 14:02,  2 users,  load average: 4.26, 51.83, 75.49
 11:08:33 up 14:02,  2 users,  load average: 3.92, 50.97, 75.09
 11:08:36 up 14:02,  2 users,  load average: 3.61, 50.12, 74.68
 11:08:38 up 14:02,  2 users,  load average: 3.61, 50.12, 74.68
 11:08:40 up 14:02,  2 users,  load average: 3.61, 50.12, 74.68
 11:08:41 up 14:02,  2 users,  load average: 3.32, 49.29, 74.28
 11:09:11 up 14:03,  2 users,  load average: 2.01, 44.58, 71.92
 11:09:13 up 14:03,  2 users,  load average: 2.01, 44.58, 71.92
 11:09:24 up 14:03,  2 users,  load average: 1.70, 43.11, 71.15
 11:09:25 up 14:03,  2 users,  load average: 1.70, 43.11, 71.15
 11:10:41 up 14:04,  2 users,  load average: 0.48, 33.53, 65.62
 11:10:43 up 14:04,  2 users,  load average: 0.44, 32.98, 65.27
 11:10:53 up 14:04,  2 users,  load average: 0.38, 31.89, 64.57
 11:10:55 up 14:04,  2 users,  load average: 0.38, 31.89, 64.57
 11:11:38 up 14:05,  2 users,  load average: 0.18, 27.43, 61.51
 11:11:40 up 14:05,  2 users,  load average: 0.18, 27.43, 61.51
 11:11:41 up 14:05,  2 users,  load average: 0.18, 27.43, 61.51
 11:11:41 up 14:05,  2 users,  load average: 0.16, 26.97, 61.18
 11:11:42 up 14:05,  2 users,  load average: 0.16, 26.97, 61.18
 11:11:43 up 14:05,  2 users,  load average: 0.16, 26.97, 61.18
 11:11:45 up 14:05,  2 users,  load average: 0.16, 26.97, 61.18
 11:12:06 up 14:06,  2 users,  load average: 0.10, 24.80, 59.56
 11:12:10 up 14:06,  2 users,  load average: 0.10, 24.80, 59.56
 11:14:30 up 14:08,  2 users,  load average: 0.01, 15.52, 51.21
 11:14:37 up 14:08,  2 users,  load average: 0.01, 15.00, 50.66
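
One way to read the samples above: on Linux the 1-minute load average is an exponentially damped average that is updated roughly every 5 seconds, and it counts tasks in uninterruptible sleep (state D in top) as well as runnable ones. The sketch below assumes that nothing is runnable or blocked after the shutdown, so the value simply decays by a factor of about exp(-5/60) per update. The function names are illustrative, and the kernel uses fixed-point arithmetic, so real values can differ in the last digit.

```python
import math

def decayed_load(load0, seconds, window=60.0):
    """Load average after `seconds`, assuming zero active tasks meanwhile."""
    return load0 * math.exp(-seconds / window)

def seconds_to_reach(load0, target, window=60.0):
    """Time for the average to fall from load0 to target under pure decay."""
    return window * math.log(load0 / target)

# Consecutive 1-minute samples from the listing: 11.61, 10.68, 9.83
print(round(decayed_load(11.61, 5), 2))
print(round(decayed_load(10.68, 5), 2))

# Rough time for the 1-minute average to fall from 101.73 to 0.01
print(round(seconds_to_reach(101.73, 0.01) / 60, 1), "minutes")
```

Under that assumption the steady fall in the listing is just the averaging window draining, which suggests the load stopped being generated almost as soon as httpd and mysqld were stopped.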
    
by ams 18.03.2014 / 17:45

1 answer

5

If you look at these lines in your top output:

Mem:   1016284k total,  1008232k used,     8052k free,      580k buffers
Swap:  2096440k total,  2095168k used,     1272k free,     9872k cached

you have run out of both RAM and swap. I suspect that if you watch the output of vmstat 10, you will see that the machine is thrashing itself to death.
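
If vmstat is not convenient to leave running, the same signal can be approximated from the kernel counters. This is a minimal sketch, assuming a Linux system where /proc/vmstat exposes the cumulative pswpin and pswpout page counters; the function names are made up for illustration, and rates are reported in pages per second rather than in the units vmstat prints.

```python
import time

def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters

def swap_rates(before, after, interval_s):
    """Pages swapped in and out per second between two snapshots."""
    swapped_in = (after["pswpin"] - before["pswpin"]) / interval_s
    swapped_out = (after["pswpout"] - before["pswpout"]) / interval_s
    return swapped_in, swapped_out

def sample(interval_s=10, path="/proc/vmstat"):
    """Take two snapshots interval_s apart and return (in, out) pages/s."""
    with open(path) as f:
        before = parse_vmstat(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_vmstat(f.read())
    return swap_rates(before, after, interval_s)
```

Calling sample() in a loop and logging the result to disk gives a record that survives even if the ssh session does not. Sustained non-zero values in both directions while the CPU sits in I/O wait would be consistent with thrashing.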

A machine running MySQL and Apache should have almost no swap usage. I suspect you need to change your MySQL settings to match the available memory (for example, a smaller query cache, a smaller InnoDB buffer pool, etc.). It is also possible that you need to lower the maximum number of Apache children allowed. Or maybe you have a runaway script (PHP, etc.) that is using tons of memory (sort your top output by resident memory, the RES column).
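
As a rough way to size the Apache child limit, the arithmetic can be sketched as follows. The total RAM and the per-child figure come from the top output in the question (the largest httpd RES shown is 19m); the 256 MiB reserved for MySQL and the rest of the system and the child limit of 256 are assumptions chosen only for illustration, not values taken from this server. RES also includes pages shared between children, so this overestimates per-child cost and errs on the safe side.

```python
def worst_case_kib(max_children, per_child_kib, reserved_kib):
    """Memory needed if every allowed Apache child is resident at once."""
    return max_children * per_child_kib + reserved_kib

def max_children_that_fit(total_ram_kib, reserved_kib, per_child_kib):
    """Largest child count whose worst case still fits in physical RAM."""
    usable = total_ram_kib - reserved_kib
    return max(usable // per_child_kib, 0)

TOTAL_RAM_KIB = 1016284      # "Mem: total" from the top output
PER_CHILD_KIB = 19 * 1024    # largest httpd RES in the top output (19m)
RESERVED_KIB = 256 * 1024    # ASSUMPTION: MySQL plus OS headroom
ASSUMED_LIMIT = 256          # ASSUMPTION: configured child limit

print(worst_case_kib(ASSUMED_LIMIT, PER_CHILD_KIB, RESERVED_KIB), "KiB needed")
print(max_children_that_fit(TOTAL_RAM_KIB, RESERVED_KIB, PER_CHILD_KIB), "children fit")
```

If the worst case is several times larger than physical RAM, a burst of requests is enough to push the machine into swap, which would match the symptoms described.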

    
by 18.03.2014 / 17:58