Monit won't run

2

I have two identical EC2 instances (the second is a replica of the first) running Gentoo. The first instance has monit running, monitoring a single process plus a few system resources and functions.

On the second instance, monit starts but exits immediately. The configuration is similar on both instances, as are the monit versions.

monit.log shows:

[GMT Oct  3 08:36:41] info     : monit daemon with PID 5 awakened

The final lines of strace monit show:

write(2, "monit daemon with PID 5 awakened"..., 33monit daemon with PID 5 awakened ) = 33
time(NULL)                              = 1349252827
open("/etc/localtime", O_RDONLY)        = 4
fstat64(4, {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
fstat64(4, {st_mode=S_IFREG|0644, st_size=118, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb773a000
read(4, "TZif2"..., 4096) = 118
_llseek(4, -6, [112], SEEK_CUR) = 0
read(4, "\nGMT0\n", 4096) = 6
close(4) = 0
munmap(0xb773a000, 4096) = 0
write(3, "[GMT Oct 3 08:27:07] info :"..., 33) = 33
write(3, "monit daemon with PID 5 awakened"..., 33) = 33
waitpid(-1, NULL, WNOHANG) = -1 ECHILD (No child processes)
close(3) = 0
exit_group(0) = ?

No core dump is produced (ulimit -c shows unlimited).

monit -v shows:

monit: Debug: Adding host allow 'localhost'
monit: Debug: Skipping redundant host 'localhost'
monit: Debug: Skipping redundant host 'localhost'
monit: Debug: Adding credentials for user 'xxxx'.
Runtime constants:
 Control file       = /etc/monitrc
 Log file           = /var/log/monit/monit.log
 Pid file           = /var/run/monit.pid
 Id file            = /var/run/monit.pid
 Debug              = True
 Log                = True
 Use syslog         = False
 Is Daemon          = True
 Use process engine = True
 Poll time          = 30 seconds with start delay 0 seconds
 Expect buffer      = 256 bytes
 Event queue        = base directory /var/monit with 100 slots
 Mail server(s)     = xx.xxx.xx.xxx with timeout 30 seconds
 Mail from          = (not defined)
 Mail subject       = (not defined)
 Mail message       = (not defined)
 Start monit httpd  = True
 httpd bind address = Any/All
 httpd portnumber   = 2812
 httpd signature    = True
 Use ssl encryption = False
 httpd auth. style  = Basic Authentication and Host/Net allow list
 Alert mail to      = [email protected]
   Alert on         = All events

The service list contains the following entries:

System Name           = xxxx
 Monitoring mode      = active
 CPU wait limit       = if greater than 20.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert
 CPU system limit     = if greater than 30.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert
 CPU user limit       = if greater than 70.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert
 Swap usage limit     = if greater than 25.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert
 Memory usage limit   = if greater than 75.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert
 Load avg. (5min)     = if greater than 2.0 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert
 Load avg. (1min)     = if greater than 4.0 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert

Process Name          = xxxx
 Group                = server
 Pid file             = /var/run/xxxx.pid
 Monitoring mode      = active
 Start program        = '/etc/init.d/xxxx restart' timeout 20 second(s)
 Stop program         = '/etc/init.d/xxxx stop' timeout 30 second(s)
 Existence            = if does not exist 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert
 Pid                  = if changed 1 times within 1 cycle(s) then alert
 Ppid                 = if changed 1 times within 1 cycle(s) then alert
 Timeout              = If restarted 3 times within 5 cycle(s) then unmonitor
 Alert mail to        = [email protected]
   Alert on           = All events
 Alert mail to        = [email protected]
   Alert on           = All events

-------------------------------------------------------------------------------
monit daemon with PID 5 awakened

I ran emerge --sync before emerge -va monit, which installed monit v5.3.2. When that didn't work, I downloaded v5.5 from the monit website and compiled it from source, which didn't work either.

    
by Yaniro 03.10.2012 / 10:56

2 answers

0

Found out what the problem was:

The file /var/run/monit.pid contained some strange data; for some reason it looked like an MD5 hash.

Once I removed this file and restarted monit, everything worked fine. What raised my suspicion was the line monit daemon with PID 5 awakened, which was very odd because a newly started process should have a much higher PID, so I went to check the .pid file.
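For anyone wanting to automate that check, here is a minimal sketch (not part of monit) that classifies a PID file before the daemon is started. The function name inspect_pid_file and the four labels are made up for illustration, and it assumes a POSIX system where os.kill with signal 0 only tests whether a process exists.

```python
import os
import re
from pathlib import Path


def inspect_pid_file(path):
    """Classify a PID file as 'missing', 'garbage', 'stale' or 'running'.

    Hypothetical helper for illustration; the labels are not monit terms.
    """
    p = Path(path)
    if not p.exists():
        return "missing"
    text = p.read_text(errors="replace").strip()
    # A valid PID file holds nothing but a positive decimal number.
    if not re.fullmatch(r"[0-9]+", text) or int(text) <= 0:
        return "garbage"  # e.g. a hash-like string instead of a PID
    pid = int(text)
    try:
        os.kill(pid, 0)  # signal 0: existence check only, nothing is delivered
    except ProcessLookupError:
        return "stale"  # well-formed PID, but no such process
    except PermissionError:
        return "running"  # process exists but belongs to another user
    except OverflowError:
        return "garbage"  # number too large to be a PID
    return "running"
```

Calling inspect_pid_file("/var/run/monit.pid") before starting monit would flag a file like the one described above as "garbage", at which point removing it is probably safe.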

    
by 03.10.2012 / 11:35
0

I installed Monit on an EBS-backed EC2 instance (using the standard AWS Linux build).

Monit was unreliable for me when starting or restarting because of a configuration error on my part. In the configuration file /etc/monit.conf I had the following line, which was incorrect:

set idfile /var/run/monit.pid

It should have been

set idfile /var/run/.monit.id

Someone out there may find this useful in the future.
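To catch this mistake before it bites, a rough sketch like the following could scan a monit control file and report when set idfile points at the same path as set pidfile. The helper names are hypothetical, the parsing is deliberately naive (it treats everything after # as a comment and ignores include statements), and fallback_pidfile is an assumption the caller supplies when the control file has no explicit set pidfile line.

```python
import re

# Matches lines such as "set idfile /var/run/.monit.id".
SET_FILE = re.compile(r"^\s*set\s+(idfile|pidfile)\s+(\S+)", re.IGNORECASE)


def find_file_settings(config_text):
    """Return the idfile/pidfile paths explicitly set in the config text."""
    settings = {}
    for line in config_text.splitlines():
        line = line.split("#", 1)[0]  # naive comment stripping
        match = SET_FILE.match(line)
        if match:
            settings[match.group(1).lower()] = match.group(2).strip('"')
    return settings


def idfile_collides(config_text, fallback_pidfile=None):
    """True if the id file and the pid file resolve to the same path."""
    settings = find_file_settings(config_text)
    idfile = settings.get("idfile")
    pidfile = settings.get("pidfile", fallback_pidfile)
    return idfile is not None and idfile == pidfile


if __name__ == "__main__":
    bad = "set idfile /var/run/monit.pid\n"
    print(idfile_collides(bad, fallback_pidfile="/var/run/monit.pid"))
```

The same collision is visible in the monit -v output in the question, where Pid file and Id file are both reported as /var/run/monit.pid.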

    
by 20.01.2015 / 12:40