Mysterious random reboots on Debian 8


I have a Debian 8 server that reboots randomly. I looked through the journalctl logs from previous boots (the journal is persistent) but found nothing:

$ journalctl -b -1 -e

I also searched all the logs for keywords such as reboot, shutdown, restart, and panic, and found nothing useful:

$ grep -rn "reboot" /var/log
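A single case-insensitive pass can cover several keywords at once. The following is only a sketch: the function name `scan_logs` and the exact keyword list are assumptions for illustration, not the commands that were actually run.

```shell
#!/bin/sh
# Hypothetical helper: search a log directory for shutdown-related keywords.
# The keyword list below is an assumption chosen for illustration.
scan_logs() {
    dir="${1:-/var/log}"
    # -r recurse, -n show line numbers, -i ignore case,
    # -I skip binary files, -E use an extended regex.
    # Errors from unreadable files are discarded.
    grep -rniIE 'reboot|shutdown|halt|restart|panic' "$dir" 2>/dev/null
}
```

Calling `scan_logs` with no argument searches /var/log; running it as root avoids silently skipping files that are not readable by a normal user.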

I was able to reproduce it on several nodes on GCP and OVH (VPS, dedicated), although some nodes with a similar configuration work fine.

$ last reboot
reboot   system boot  3.16.0-4-amd64   Mon May 29 13:20 - 14:21  (01:00)
reboot   system boot  3.16.0-4-amd64   Mon May 29 13:11 - 14:21  (01:10)
reboot   system boot  3.16.0-4-amd64   Mon May 29 13:06 - 14:21  (01:15)
reboot   system boot  3.16.0-4-amd64   Mon May 29 12:58 - 14:21  (01:23)
reboot   system boot  3.16.0-4-amd64   Mon May 29 10:53 - 14:21  (03:28)
reboot   system boot  3.16.0-4-amd64   Mon May 29 09:51 - 10:52  (01:01)
reboot   system boot  3.16.0-4-amd64   Sun May 28 20:29 - 10:52  (14:23)
reboot   system boot  3.16.0-4-amd64   Sun May 28 20:01 - 10:52  (14:51)
reboot   system boot  3.16.0-4-amd64   Sun May 28 18:45 - 10:52  (16:07)
reboot   system boot  3.16.0-4-amd64   Sun May 28 18:36 - 10:52  (16:16)
reboot   system boot  3.16.0-4-amd64   Sun May 28 18:19 - 10:52  (16:33)
reboot   system boot  3.16.0-4-amd64   Sun May 28 17:51 - 10:52  (17:01)
reboot   system boot  3.16.0-4-amd64   Sun May 28 10:20 - 10:52 (1+00:31)
reboot   system boot  3.16.0-4-amd64   Sun May 28 09:04 - 10:52 (1+01:48)
reboot   system boot  3.16.0-4-amd64   Sun May 28 08:54 - 10:52 (1+01:58)
reboot   system boot  3.16.0-4-amd64   Sun May 28 08:48 - 10:52 (1+02:03)
reboot   system boot  3.16.0-4-amd64   Sun May 28 08:42 - 10:52 (1+02:10)
reboot   system boot  3.16.0-4-amd64   Sun May 28 08:35 - 10:52 (1+02:17)
reboot   system boot  3.16.0-4-amd64   Sun May 28 08:18 - 10:52 (1+02:34)
reboot   system boot  3.16.0-4-amd64   Sun May 28 08:12 - 10:52 (1+02:40)
reboot   system boot  3.16.0-4-amd64   Sun May 28 05:34 - 10:52 (1+05:18)
reboot   system boot  3.16.0-4-amd64   Sun May 28 01:03 - 10:52 (1+09:49)
reboot   system boot  3.16.0-4-amd64   Sun May 28 01:00 - 10:52 (1+09:52)
reboot   system boot  3.16.0-4-amd64   Sat May 27 23:20 - 10:52 (1+11:32)
reboot   system boot  3.16.0-4-amd64   Sat May 27 21:22 - 10:52 (1+13:30)
reboot   system boot  3.16.0-4-amd64   Sat May 27 21:17 - 10:52 (1+13:35)
reboot   system boot  3.16.0-4-amd64   Sat May 27 20:52 - 10:52 (1+14:00)
reboot   system boot  3.16.0-4-amd64   Sat May 27 19:32 - 10:52 (1+15:20)
reboot   system boot  3.16.0-4-amd64   Sat May 27 18:07 - 10:52 (1+16:45)
reboot   system boot  3.16.0-4-amd64   Sat May 27 17:52 - 10:52 (1+17:00)
reboot   system boot  3.16.0-4-amd64   Sat May 27 16:32 - 10:52 (1+18:20)
reboot   system boot  3.16.0-4-amd64   Sat May 27 12:25 - 10:52 (1+22:27)
reboot   system boot  3.16.0-4-amd64   Sat May 27 12:16 - 10:52 (1+22:36)
reboot   system boot  3.16.0-4-amd64   Sat May 27 11:07 - 10:52 (1+23:45)
reboot   system boot  3.16.0-4-amd64   Sat May 27 09:53 - 10:52 (2+00:59)
reboot   system boot  3.16.0-4-amd64   Sat May 27 09:09 - 10:52 (2+01:43)
reboot   system boot  3.16.0-4-amd64   Sat May 27 06:39 - 10:52 (2+04:13)
reboot   system boot  3.16.0-4-amd64   Sat May 27 06:06 - 10:52 (2+04:46)
reboot   system boot  3.16.0-4-amd64   Sat May 27 05:00 - 10:52 (2+05:52)
reboot   system boot  3.16.0-4-amd64   Sat May 27 04:53 - 10:52 (2+05:58)
reboot   system boot  3.16.0-4-amd64   Sat May 27 03:40 - 10:52 (2+07:12)
reboot   system boot  3.16.0-4-amd64   Sat May 27 01:57 - 10:52 (2+08:55)
reboot   system boot  3.16.0-4-amd64   Sat May 27 01:13 - 10:52 (2+09:39)
reboot   system boot  3.16.0-4-amd64   Fri May 26 22:51 - 10:52 (2+12:01)
reboot   system boot  3.16.0-4-amd64   Fri May 26 20:54 - 10:52 (2+13:58)
reboot   system boot  3.16.0-4-amd64   Fri May 26 16:50 - 10:52 (2+18:02)
reboot   system boot  3.16.0-4-amd64   Fri May 26 15:58 - 10:52 (2+18:54)
reboot   system boot  3.16.0-4-amd64   Fri May 26 15:21 - 10:52 (2+19:31)
reboot   system boot  3.16.0-4-amd64   Fri May 26 14:41 - 10:52 (2+20:11)
reboot   system boot  3.16.0-4-amd64   Fri May 26 13:23 - 10:52 (2+21:29)
reboot   system boot  3.16.0-4-amd64   Fri May 26 11:44 - 10:52 (2+23:08)
reboot   system boot  3.16.0-4-amd64   Fri May 26 10:55 - 10:52 (2+23:57)
reboot   system boot  3.16.0-4-amd64   Fri May 26 10:36 - 10:52 (3+00:16)
reboot   system boot  3.16.0-4-amd64   Fri May 26 10:12 - 10:52 (3+00:40)
reboot   system boot  3.16.0-4-amd64   Fri May 26 08:27 - 10:52 (3+02:25)
reboot   system boot  3.16.0-4-amd64   Fri May 26 08:25 - 10:52 (3+02:27)
reboot   system boot  3.16.0-4-amd64   Fri May 26 08:17 - 10:52 (3+02:35)
reboot   system boot  3.16.0-4-amd64   Fri May 26 06:45 - 10:52 (3+04:07)
reboot   system boot  3.16.0-4-amd64   Fri May 26 04:53 - 10:52 (3+05:59)
reboot   system boot  3.16.0-4-amd64   Fri May 26 04:23 - 10:52 (3+06:29)
reboot   system boot  3.16.0-4-amd64   Thu May 25 16:25 - 10:52 (3+18:27)
reboot   system boot  3.16.0-4-amd64   Thu May 25 16:01 - 10:52 (3+18:51)
reboot   system boot  3.16.0-4-amd64   Thu May 25 15:41 - 10:52 (3+19:11)
reboot   system boot  3.16.0-4-amd64   Thu May 25 15:24 - 10:52 (3+19:28)
reboot   system boot  3.16.0-4-amd64   Thu May 25 15:10 - 10:52 (3+19:42)
reboot   system boot  3.16.0-4-amd64   Thu May 25 14:10 - 10:52 (3+20:42)
reboot   system boot  3.16.0-4-amd64   Thu May 25 13:54 - 10:52 (3+20:58)
reboot   system boot  3.16.0-4-amd64   Thu May 25 13:31 - 10:52 (3+21:21)
reboot   system boot  3.16.0-4-amd64   Thu May 25 13:20 - 10:52 (3+21:32)
reboot   system boot  3.16.0-4-amd64   Thu May 25 13:03 - 10:52 (3+21:49)
reboot   system boot  3.16.0-4-amd64   Thu May 25 12:42 - 10:52 (3+22:10)
reboot   system boot  3.16.0-4-amd64   Thu May 25 11:52 - 10:52 (3+23:00)
reboot   system boot  3.16.0-4-amd64   Thu May 25 11:44 - 10:52 (3+23:08)
reboot   system boot  3.16.0-4-amd64   Thu May 25 11:24 - 10:52 (3+23:28)
reboot   system boot  3.16.0-4-amd64   Thu May 25 07:17 - 10:52 (4+03:35)
reboot   system boot  3.16.0-4-amd64   Wed May 24 04:42 - 10:52 (5+06:10)
reboot   system boot  3.16.0-4-amd64   Wed May 24 04:37 - 04:42  (00:05)
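To see how the reboots are spread over time, the listing can be summarized per day. This is a minimal sketch that assumes the column layout shown above (month in field 6, day in field 7); `reboots_per_day` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: count "reboot system boot" records per calendar day.
# Reads `last reboot` output on stdin; assumes the month is field 6 and the
# day of month is field 7, as in the listing above.
reboots_per_day() {
    awk '$1 == "reboot" && $2 == "system" {
        day = $6 " " $7
        # Remember the first-seen order so the output keeps the input order.
        if (!(day in count)) order[n++] = day
        count[day]++
    }
    END { for (i = 0; i < n; i++) print order[i], count[order[i]] }'
}

# Usage: last reboot | reboots_per_day
```

Applied to the listing above, this gives 6 reboots on May 29, 17 on May 28, 20 on May 27, 17 on May 26, 15 on May 25, and 2 on May 24.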

It is really strange that nothing in the logs suggests what triggered the reboots, and there is no kernel panic either.

I tried replacing /sbin/shutdown as suggested in "Server rebooting mysteriously", but it seems nothing ever runs it.

journalctl output from right after a reboot: link

Please suggest how I can debug this further.

    
by csandanov 29.05.2017 / 16:58

1 answer


The random reboots were caused by a kernel panic, even though no logs pointed to one. After installing and configuring kdump, I got a stack trace that helped me identify the problem. It was not obvious.
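The answer does not list the exact steps, so the following is only a sketch of what a kdump setup on Debian might look like. The package choice (`kdump-tools`, `crash`), the `crashkernel=128M` reservation size, and the helper name `crashkernel_reserved` are all assumptions, not details from the answer.

```shell
#!/bin/sh
# Assumed setup steps (run as root; not taken from the answer):
#   apt-get install kdump-tools crash
#   set USE_KDUMP=1 in /etc/default/kdump-tools
#   add crashkernel=128M to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub
#   update-grub && reboot

# Hypothetical helper: report whether a kernel command line reserves memory
# for the crash kernel. Reads /proc/cmdline unless another file is given.
crashkernel_reserved() {
    grep -Eq '(^|[[:space:]])crashkernel=[^[:space:]]+' "${1:-/proc/cmdline}"
}

# After the reboot, kdump-tools can report its own status:
#   kdump-config show
# A test panic can be forced with the line below (this crashes the machine):
#   echo c > /proc/sysrq-trigger
# With the default kdump-tools settings, dumps are expected under /var/crash.
```

Running `crashkernel_reserved && echo reserved` after the reboot is a quick way to confirm that the kernel parameter actually took effect before waiting for the next panic.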

    
by 01.06.2017 / 12:25
