CentOS dedicated server became unresponsive for the first time

1

The server was unresponsive for an hour, so I rebooted it and checked /var/log/messages

and found this. Can anyone point out what is wrong?

Sep 28 07:39:35 www kernel: INFO: task mysqld:22749 blocked for more than 120 seconds.
Sep 28 07:39:35 www kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 07:39:35 www kernel: mysqld        D ffff810001015120     0 22749   3266         22792 22659 (NOTLB)
Sep 28 07:39:35 www kernel:  ffff810139d21e58 0000000000000086 ffff810036217000 ffffffff8000f758
Sep 28 07:39:35 www kernel:  ffff81020dfd1408 0000000000000007 ffff8101cfbaf7e0 ffff81020fca5080
Sep 28 07:39:35 www kernel:  00017a451524782a 00000000000043b2 ffff8101cfbaf9c8 0000000280009a22
Sep 28 07:39:35 www kernel: Call Trace:
Sep 28 07:39:35 www kernel:  [<ffffffff8000f758>] generic_permission+0x52/0xca
Sep 28 07:39:35 www kernel:  [<ffffffff80063c63>] __mutex_lock_slowpath+0x60/0x9b
Sep 28 07:39:35 www kernel:  [<ffffffff8000cea2>] do_path_lookup+0x294/0x310
Sep 28 07:39:35 www kernel:  [<ffffffff80063cad>] .text.lock.mutex+0xf/0x14
Sep 28 07:39:35 www kernel:  [<ffffffff8003c618>] do_unlinkat+0x66/0x141
Sep 28 07:39:35 www kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
Sep 28 07:39:57 www kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 28 07:39:58 www kernel: 
Sep 28 07:39:59 www kernel: INFO: task httpd:22679 blocked for more than 120 seconds.
Sep 28 07:40:04 www kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 07:40:08 www kernel: httpd         D ffff81000100caa0     0 22679  22413         22680 22678 (NOTLB)
Sep 28 07:40:51 www kernel:  ffff81018b0dbc78 0000000000000086 ffff81018b0dbc88 0000004480063002
Sep 28 07:41:52 www kernel:  ffff81000001cc00 0000000000000007 ffff81013ac5e860 ffff81020fc96100
Sep 28 07:43:10 www kernel:  00017a44de6376c8 000000000000a89f ffff81013ac5ea48 000000010001cc00
Sep 28 07:43:38 www kernel: Call Trace:
Sep 28 07:44:06 www kernel:  [<ffffffff80063c63>] __mutex_lock_slowpath+0x60/0x9b
Sep 28 07:44:09 www kernel:  [<ffffffff80063cad>] .text.lock.mutex+0xf/0x14
Sep 28 07:44:10 www kernel:  [<ffffffff8000d0b2>] do_lookup+0x90/0x1e6
Sep 28 07:44:13 www kernel:  [<ffffffff8000a2e9>] __link_path_walk+0xa3a/0xfd1
Sep 28 07:44:16 www kernel:  [<ffffffff8000eb8e>] link_path_walk+0x45/0xb8
Sep 28 07:44:16 www kernel:  [<ffffffff8000cea2>] do_path_lookup+0x294/0x310
Sep 28 07:44:29 www kernel:  [<ffffffff800129ad>] getname+0x15b/0x1c2
Sep 28 07:44:38 www kernel:  [<ffffffff80023b60>] __user_walk_fd+0x37/0x4c
Sep 28 07:44:42 www kernel:  [<ffffffff80028ada>] vfs_stat_fd+0x1b/0x4a
Sep 28 07:44:43 www kernel:  [<ffffffff8003c69a>] do_unlinkat+0xe8/0x141
Sep 28 07:45:02 www kernel:  [<ffffffff80023890>] sys_newstat+0x19/0x31
Sep 28 07:46:18 www kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
Sep 28 07:46:43 www kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 28 07:46:55 www kernel: 
Sep 28 07:46:58 www kernel: INFO: task php:28906 blocked for more than 120 seconds.
Sep 28 07:46:59 www kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 07:47:00 www kernel: php           D ffff810165127000     0 28906  28905                     (NOTLB)
Sep 28 07:47:37 www kernel:  ffff810078431e58 0000000000000082 ffff810165127000 ffffffff8000f758
Sep 28 07:48:29 www kernel:  ffff81020dfd1408 0000000000000007 ffff8101247b9860 ffff810207d0e100
Sep 28 07:48:36 www kernel:  00017a4218932fae 0000000000377111 ffff8101247b9a48 0000000280009a22
Sep 28 07:48:37 www kernel: Call Trace:
Sep 28 07:48:37 www kernel:  [<ffffffff8000f758>] generic_permission+0x52/0xca
Sep 28 07:48:37 www kernel:  [<ffffffff80063c63>] __mutex_lock_slowpath+0x60/0x9b
Sep 28 07:48:37 www kernel:  [<ffffffff8000cea2>] do_path_lookup+0x294/0x310
Sep 28 07:48:41 www kernel:  [<ffffffff80063cad>] .text.lock.mutex+0xf/0x14
Sep 28 07:48:41 www kernel:  [<ffffffff8003c618>] do_unlinkat+0x66/0x141
Sep 28 07:48:42 www kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
Sep 28 07:48:42 www kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 28 07:48:42 www kernel: 
Sep 28 07:48:43 www kernel: INFO: task php:29032 blocked for more than 120 seconds.
Sep 28 07:48:45 www kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 07:48:46 www kernel: php           D 0000000000000004     0 29032      1         29050 29024 (NOTLB)
Sep 28 07:48:46 www kernel:  ffff81006b465dc8 0000000000000086 ffff81020dfd1408 ffffffff80009a22
Sep 28 07:48:46 www kernel:  0000000000000000 0000000000000007 ffff81002946e860 ffff81003c943100
Sep 28 07:48:46 www kernel:  00017a4211450766 000000000024be3d ffff81002946ea48 000000020e42b300
Sep 28 07:48:52 www kernel: Call Trace:
Sep 28 07:48:54 www kernel:  [<ffffffff80009a22>] __link_path_walk+0x173/0xfd1
Sep 28 07:48:54 www kernel:  [<ffffffff8002cc58>] mntput_no_expire+0x19/0x89
Sep 28 07:48:55 www kernel:  [<ffffffff8000ebf5>] link_path_walk+0xac/0xb8
Sep 28 07:48:55 www kernel:  [<ffffffff80063c63>] __mutex_lock_slowpath+0x60/0x9b
Sep 28 07:48:55 www kernel:  [<ffffffff80023974>] __path_lookup_intent_open+0x56/0x97
Sep 28 07:48:55 www kernel:  [<ffffffff80063cad>] .text.lock.mutex+0xf/0x14
Sep 28 07:48:55 www kernel:  [<ffffffff8001b260>] open_namei+0xea/0x718
Sep 28 07:48:59 www kernel:  [<ffffffff80067235>] do_page_fault+0x4cc/0x842
Sep 28 07:49:01 www kernel:  [<ffffffff80027726>] do_filp_open+0x1c/0x38
Sep 28 07:49:01 www kernel:  [<ffffffff8001a09c>] do_sys_open+0x44/0xbe
Sep 28 07:49:02 www kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 28 07:49:03 www kernel: 
Sep 28 07:49:07 www kernel: INFO: task mysqld:22749 blocked for more than 120 seconds.
Sep 28 07:49:09 www kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 07:49:09 www kernel: mysqld        D ffff810001015120     0 22749   3266         22792 22659 (NOTLB)
Sep 28 07:49:14 www kernel:  ffff810139d21e58 0000000000000086 ffff810036217000 ffffffff8000f758
Sep 28 07:49:14 www kernel:  ffff81020dfd1408 0000000000000007 ffff8101cfbaf7e0 ffff81020fca5080
Sep 28 07:49:15 www kernel:  00017a451524782a 00000000000043b2 ffff8101cfbaf9c8 0000000280009a22
Sep 28 07:49:15 www kernel: Call Trace:
Sep 28 07:49:22 www kernel:  [<ffffffff8000f758>] generic_permission+0x52/0xca
Sep 28 07:49:23 www kernel:  [<ffffffff80063c63>] __mutex_lock_slowpath+0x60/0x9b
Sep 28 07:49:23 www kernel:  [<ffffffff8000cea2>] do_path_lookup+0x294/0x310
Sep 28 07:49:23 www kernel:  [<ffffffff80063cad>] .text.lock.mutex+0xf/0x14
Sep 28 07:49:23 www kernel:  [<ffffffff8003c618>] do_unlinkat+0x66/0x141
Sep 28 07:49:23 www kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
Sep 28 07:49:23 www kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 28 07:49:23 www kernel: 
Sep 28 07:49:23 www kernel: INFO: task php:29024 blocked for more than 120 seconds.
Sep 28 07:49:23 www kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 07:49:24 www kernel: php           D ffff8101920a0000     0 29024      1         29032 29001 (NOTLB)
Sep 28 07:49:26 www kernel:  ffff8101cca8fe58 0000000000000086 ffff8101920a0000 ffffffff8000f758
Sep 28 07:49:26 www kernel:  ffff81020dfd1408 0000000000000007 ffff81000b64b040 ffff8101e05337e0
Sep 28 07:49:26 www kernel:  00017a552aef9f35 0000000000009513 ffff81000b64b228 0000000180009a22
Sep 28 07:49:27 www kernel: Call Trace:
Sep 28 07:49:27 www kernel:  [<ffffffff8000f758>] generic_permission+0x52/0xca
Sep 28 07:49:27 www kernel:  [<ffffffff80063c63>] __mutex_lock_slowpath+0x60/0x9b
Sep 28 07:49:27 www kernel:  [<ffffffff8000cea2>] do_path_lookup+0x294/0x310
Sep 28 07:49:27 www kernel:  [<ffffffff80063cad>] .text.lock.mutex+0xf/0x14
Sep 28 07:49:27 www kernel:  [<ffffffff8003c618>] do_unlinkat+0x66/0x141
Sep 28 07:49:27 www kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
Sep 28 07:49:27 www kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 28 07:49:27 www kernel: 
Sep 28 07:49:27 www kernel: INFO: task php:29050 blocked for more than 120 seconds.
Sep 28 07:49:28 www kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 07:49:28 www kernel: php           D ffff810201d95000     0 29050      1               29032 (NOTLB)
Sep 28 07:49:28 www kernel:  ffff810051e45e58 0000000000000086 ffff810201d95000 ffffffff8000f758
Sep 28 07:49:28 www kernel:  ffff81020dfd1408 0000000000000007 ffff81001c23f080 ffff81020f5e2080
Sep 28 07:49:29 www kernel:  00017a5d0bc2aa75 0000000000d0ecfe ffff81001c23f268 0000000280009a22
Sep 28 07:49:29 www kernel: Call Trace:
Sep 28 07:49:29 www kernel:  [<ffffffff8000f758>] generic_permission+0x52/0xca
Sep 28 07:49:29 www kernel:  [<ffffffff80063c63>] __mutex_lock_slowpath+0x60/0x9b
Sep 28 07:49:29 www kernel:  [<ffffffff8000cea2>] do_path_lookup+0x294/0x310
Sep 28 07:49:34 www kernel:  [<ffffffff80063cad>] .text.lock.mutex+0xf/0x14
Sep 28 07:49:35 www kernel:  [<ffffffff8003c618>] do_unlinkat+0x66/0x141
Sep 28 07:49:37 www kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
Sep 28 07:49:37 www kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 28 07:49:37 www kernel: 
Sep 28 07:49:37 www kernel: INFO: task php:29064 blocked for more than 120 seconds.
Sep 28 07:49:37 www kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 07:49:37 www kernel: php           D ffff81009c231000     0 29064  29057                     (NOTLB)
Sep 28 07:49:38 www kernel:  ffff8100a5dc7e58 0000000000000086 ffff81009c231000 ffffffff8000f758
Sep 28 07:49:38 www kernel:  ffff81020dfd1408 0000000000000007 ffff81000a850820 ffff8102038037a0
Sep 28 07:49:38 www kernel:  00017a5bb5c6846e 000000000000861a ffff81000a850a08 0000000080009a22
Sep 28 07:49:38 www kernel: Call Trace:
Sep 28 07:49:38 www kernel:  [<ffffffff8000f758>] generic_permission+0x52/0xca
Sep 28 07:49:38 www kernel:  [<ffffffff80063c63>] __mutex_lock_slowpath+0x60/0x9b
Sep 28 07:49:38 www kernel:  [<ffffffff8000cea2>] do_path_lookup+0x294/0x310
Sep 28 07:49:38 www kernel:  [<ffffffff80063cad>] .text.lock.mutex+0xf/0x14
Sep 28 07:49:38 www kernel:  [<ffffffff8003c618>] do_unlinkat+0x66/0x141
Sep 28 07:49:38 www kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
Sep 28 07:49:40 www kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 28 07:49:42 www kernel: 
Sep 28 07:49:42 www kernel: INFO: task mysqld:24612 blocked for more than 120 seconds.
Sep 28 07:49:43 www kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 07:49:46 www kernel: mysqld        D ffff81020dfd14c0     0 24612   3266         19643  3599 (NOTLB)
Sep 28 07:49:46 www kernel:  ffff81019e517c78 0000000000000086 ffff81019e517c88 ffffffff80063002
Sep 28 07:49:47 www kernel:  ffff810201966558 0000000000000009 ffff81015fa560c0 ffff8101c263b860
Sep 28 07:49:51 www kernel:  00017a9d113e27fe 0000000000008d5a ffff81015fa562a8 000000018006ec9f
Sep 28 07:49:52 www kernel: Call Trace:
Sep 28 07:49:52 www kernel:  [<ffffffff80063002>] thread_return+0x62/0xfe
Sep 28 07:49:52 www kernel:  [<ffffffff8005a46a>] getnstimeofday+0x10/0x29
Sep 28 07:49:53 www kernel:  [<ffffffff80063c63>] __mutex_lock_slowpath+0x60/0x9b
Sep 28 07:49:54 www kernel:  [<ffffffff80063cad>] .text.lock.mutex+0xf/0x14
Sep 28 07:49:54 www kernel:  [<ffffffff8000d0b2>] do_lookup+0x90/0x1e6
Sep 28 07:49:56 www kernel:  [<ffffffff8000a2e9>] __link_path_walk+0xa3a/0xfd1
Sep 28 07:50:00 www kernel:  [<ffffffff8000eb8e>] link_path_walk+0x45/0xb8
Sep 28 07:50:03 www kernel:  [<ffffffff8000cea2>] do_path_lookup+0x294/0x310
Sep 28 07:50:04 www kernel:  [<ffffffff800129ad>] getname+0x15b/0x1c2
Sep 28 07:50:06 www kernel:  [<ffffffff80023b60>] __user_walk_fd+0x37/0x4c
Sep 28 07:50:06 www kernel:  [<ffffffff8003f013>] vfs_lstat_fd+0x18/0x47
Sep 28 07:50:08 www kernel:  [<ffffffff8002ad91>] sys_newlstat+0x19/0x31
Sep 28 07:50:10 www kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
Sep 28 07:50:15 www kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 28 07:50:19 www kernel: 
Sep 28 07:50:19 www kernel: INFO: task php:29178 blocked for more than 120 seconds.
Sep 28 07:50:23 www kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 07:50:23 www kernel: php           D 0000000000000003     0 29178  29123                     (NOTLB)
Sep 28 07:50:23 www kernel:  ffff81004a95bdc8 0000000000000086 ffff81020dfd1408 ffffffff80009a22
Sep 28 07:50:24 www kernel:  ffffffff800a2fd0 0000000000000007 ffff8101937a4040 ffff81010bde27a0
Sep 28 07:50:26 www kernel:  00017aa3a1d89c9b 000000000000d66e ffff8101937a4228 000000020e42b300
Sep 28 07:50:26 www kernel: Call Trace:
Sep 28 07:50:26 www kernel:  [<ffffffff80009a22>] __link_path_walk+0x173/0xfd1
Sep 28 07:50:27 www kernel:  [<ffffffff800a2fd0>] wake_bit_function+0x0/0x23
Sep 28 07:50:27 www kernel:  [<ffffffff8002cc58>] mntput_no_expire+0x19/0x89
Sep 28 07:50:27 www kernel:  [<ffffffff8000ebf5>] link_path_walk+0xac/0xb8
Sep 28 07:50:28 www kernel:  [<ffffffff80063c63>] __mutex_lock_slowpath+0x60/0x9b
Sep 28 07:50:32 www kernel:  [<ffffffff80023974>] __path_lookup_intent_open+0x56/0x97
Sep 28 07:50:32 www kernel:  [<ffffffff80063cad>] .text.lock.mutex+0xf/0x14
Sep 28 07:50:34 www kernel:  [<ffffffff8001b260>] open_namei+0xea/0x718
Sep 28 07:50:34 www kernel:  [<ffffffff80067235>] do_page_fault+0x4cc/0x842
Sep 28 07:50:35 www kernel:  [<ffffffff80027726>] do_filp_open+0x1c/0x38
Sep 28 07:50:35 www kernel:  [<ffffffff8001a09c>] do_sys_open+0x44/0xbe
Sep 28 07:50:35 www kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 28 07:50:35 www kernel: 
Sep 28 07:56:41 www kernel: proftpd invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0
Sep 28 07:56:41 www kernel: 
Sep 28 07:56:41 www kernel: Call Trace:
Sep 28 07:56:41 www kernel:  [<ffffffff800c9f35>] out_of_memory+0x8e/0x2f3
Sep 28 07:56:41 www kernel:  [<ffffffff800a2fa2>] autoremove_wake_function+0x0/0x2e
Sep 28 07:56:41 www kernel:  [<ffffffff8000f67d>] __alloc_pages+0x27f/0x308
Sep 28 07:56:41 www kernel:  [<ffffffff80013047>] __do_page_cache_readahead+0x96/0x17b
Sep 28 07:56:41 www kernel:  [<ffffffff80013984>] filemap_nopage+0x14c/0x360
Sep 28 07:56:41 www kernel:  [<ffffffff80008972>] __handle_mm_fault+0x1fd/0x103b
Sep 28 07:56:41 www kernel:  [<ffffffff800a4fe1>] ktime_get_ts+0x1a/0x4e
Sep 28 07:56:41 www kernel:  [<ffffffff80067202>] do_page_fault+0x499/0x842
Sep 28 07:56:41 www kernel:  [<ffffffff8003ad91>] hrtimer_try_to_cancel+0x4a/0x53
Sep 28 07:58:10 www kernel:  [<ffffffff80033541>] do_setitimer+0xd0/0x689
Sep 28 08:26:22 www syslogd 1.4.1: restart.
Sep 28 08:26:22 www kernel: klogd 1.4.1, log source = /proc/kmsg started.
Sep 28 08:26:22 www kernel: Linux version 2.6.18-274.17.1.el5 ([email protected]) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-51)) #1 SMP Tue Jan 10 17:25:58 EST 2012
Sep 28 08:26:22 www kernel: Command line: ro root=LABEL=/
    
by AMB 28.09.2012 / 16:13

2 answers

3

As Safado says, you ran out of memory, which caused the oom-killer to kick in and shut things down. I have had this problem too.

I took the following actions:

  • Increased the amount of swap available, so the oom-killer would not be invoked so quickly
  • Set up monit to alert me when memory starts running low
  • Set up munin to track memory usage and show trends

This let me log in to the server when things were starting to look unstable and check what was using all the memory.
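As a rough illustration of the kind of check a memory alert performs, here is a minimal Python sketch. It is not how monit or munin work internally. The function names, the 10% threshold, and the sample figures are assumptions made up for this example; the field names (MemTotal, MemFree, Buffers, Cached) are the ones found in /proc/meminfo.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def reclaimable_fraction(info):
    """Rough share of RAM that is free or easily reclaimed (free + buffers + page cache)."""
    usable = info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0)
    return usable / float(info["MemTotal"])


def is_low(info, threshold=0.10):
    """True when less than `threshold` of RAM is free or reclaimable (threshold is an assumption)."""
    return reclaimable_fraction(info) < threshold


# Made-up sample values, not taken from the server in the question.
SAMPLE = """MemTotal:      8000000 kB
MemFree:        200000 kB
Buffers:        100000 kB
Cached:         300000 kB
"""

info = parse_meminfo(SAMPLE)
print("reclaimable: %.1f%%" % (100 * reclaimable_fraction(info)))
if is_low(info):
    print("WARNING: memory is running low")
```

On a real machine you would pass the contents of /proc/meminfo instead of SAMPLE and run the check from cron or a monitoring agent.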

In my case it was Apache. I reconfigured it to reduce the number of threads and spare servers, and the problems went away.
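One way to pick a lower limit is simple arithmetic: divide the RAM you can spare for Apache by the typical size of one Apache process. The sketch below assumes the prefork MPM, where the result would be a candidate for MaxClients. The function name and all the figures are hypothetical, so measure your own process sizes before relying on the number.

```python
def max_clients(total_mb, reserved_mb, per_process_mb):
    """Largest number of Apache processes that fit in the RAM left after other services."""
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    budget = total_mb - reserved_mb
    return max(budget // per_process_mb, 0)


# Hypothetical figures: 8192 MB of RAM, 3072 MB kept for MySQL and the OS,
# and roughly 50 MB resident per httpd process.
limit = max_clients(8192, 3072, 50)
print("suggested MaxClients:", limit)
```

Spare-server settings such as MinSpareServers and MaxSpareServers would then be kept well below that limit.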

The main point is that when something like this happens to you, monitoring really helps.

    
by 28.09.2012 / 16:52
2

Do you have any server monitoring software that keeps a record of the server's vital signs, RAM in particular? At 07:56:41 the oom-killer was invoked, which means you were probably out of memory. That could explain the problems you were having with httpd and mysqld, and the unresponsive behavior.
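To see at a glance which tasks hung and when the oom-killer fired, a log like the one in the question can be summarized with a short script. This is a minimal Python sketch; the function name and regular expressions are my own, and the sample lines are copied from the log posted above.

```python
import re

# Patterns for the two kinds of kernel messages seen in the posted log.
BLOCKED = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")
OOM = re.compile(r"kernel: (\S+) invoked oom-killer")


def summarize(lines):
    """Count hung-task warnings per program and list (timestamp, program) oom-killer events."""
    blocked = {}
    oom = []
    for line in lines:
        match = BLOCKED.search(line)
        if match:
            name = match.group(1)
            blocked[name] = blocked.get(name, 0) + 1
            continue
        match = OOM.search(line)
        if match:
            timestamp = line[:15]  # classic syslog timestamp, e.g. "Sep 28 07:56:41"
            oom.append((timestamp, match.group(1)))
    return blocked, oom


SAMPLE = [
    "Sep 28 07:39:35 www kernel: INFO: task mysqld:22749 blocked for more than 120 seconds.",
    "Sep 28 07:39:59 www kernel: INFO: task httpd:22679 blocked for more than 120 seconds.",
    "Sep 28 07:46:58 www kernel: INFO: task php:28906 blocked for more than 120 seconds.",
    "Sep 28 07:56:41 www kernel: proftpd invoked oom-killer: gfp_mask=0x201d2, order=0, oomkilladj=0",
]

blocked, oom = summarize(SAMPLE)
print(blocked)
print(oom)
```

To run it against the real file, pass an open handle on /var/log/messages to summarize() instead of SAMPLE.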

    
by 28.09.2012 / 16:41