Unusual server crash on EC2, syslog has a line of ^@^@^@^@^@


While trying to work out why a server was down for 20 minutes, I examined the syslog for that period and saw the following:

Jan  3 07:50:01 tools CRON[17085]: (munin) CMD (if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi)
Jan  3 07:55:01 tools CRON[17773]: (munin) CMD (if [ -x /usr/bin/munin-cron ]; then /usr/bin/munin-cron; fi)
Jan  3 07:55:01 tools CRON[17774]: (root) CMD (if [ -x /etc/munin/plugins/apt_all ]; then /etc/munin/plugins/apt_all update 7200 12 >/dev/null; elif [ -x /etc/munin/plugins/apt ]; then /etc/munin/plugins/apt update 7200 12 >/dev/null; fi)
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@Jan  3 08:19:44 tools kernel: imklog 4.2.0, log source = /proc/kmsg started.
Jan  3 08:19:44 tools rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="470" x-info="http://www.rsyslog.com"] (re)start

EC2 shows this CPU utilization around the time of the crash (in the middle, just before 13:00 UTC).

So nothing is visible during that period. None of our other EC2 instances failed, and I can't find any evidence that it was an application error. In fact, this happened on our tools server (apache, mongodb and redis). Monit was also running, but there are no suspicious log entries around the time of the crash.

What could have caused this crash, and what do the ^@ characters in the syslog mean?

    
by Reed G. Law 03.01.2012 / 17:29

1 answer


The answer came from the AWS forum: link

Reed,

There was a problem with the underlying hardware that caused the underlying system to crash. The artifact of the syslog lines could be just a representation that the system was in mid write when it crashed, or the lines could have come from the cron process that was running on the previous line.

Nathan
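
For context, ^@ is the caret notation that tools such as vim or `cat -v` use to display a NUL byte (0x00), so the long run in the syslog is a block of zero bytes sitting between the last entry before the crash and the first entry after the restart. As a rough sketch (not something from the AWS reply), the following Python locates such runs in a log and reports the entries on either side, which brackets the outage window. The function names and the `min_len` threshold are made up for illustration.

```python
import re

# One or more consecutive NUL bytes (shown as ^@ by vim and `cat -v`).
NUL_RUN = re.compile(rb"\x00+")


def find_nul_runs(data: bytes, min_len: int = 16):
    """Return (offset, length) for every run of at least min_len NUL bytes.

    min_len is an arbitrary illustrative threshold, not a standard value.
    """
    return [
        (m.start(), len(m.group()))
        for m in NUL_RUN.finditer(data)
        if len(m.group()) >= min_len
    ]


def surrounding_lines(data: bytes, offset: int, length: int):
    """Return the log line just before a NUL run and the text just after it."""
    # Last line before the run (ignoring the newline that ends it).
    before = data[:offset].rstrip(b"\n").rsplit(b"\n", 1)[-1]
    # Text from the end of the run up to the next newline.
    after = data[offset + length:].split(b"\n", 1)[0]
    return before.decode("utf-8", "replace"), after.decode("utf-8", "replace")
```

To run it against a real file, read the log in binary mode (for example `open("/var/log/syslog", "rb").read()`, where the path is an assumption about the system) and pass each `(offset, length)` pair from `find_nul_runs` to `surrounding_lines`. In the excerpt above this would pair the 07:55:01 cron entry with the 08:19:44 rsyslogd restart.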

    
by 06.01.2012 / 01:11