I monitor my server over the WAN with ping and HTTPS checks using Nagios Core 3.5.1.
Here is the alert history for the host:
June 23, 2015 18:00
Service Ok[06-23-2015 18:13:47] SERVICE ALERT: webserver;PING;OK;HARD;3;PING OK - Packet loss = 0%, RTA = 33.72 ms
Service Ok[06-23-2015 18:13:40] SERVICE ALERT: webserver;HTTPS;OK;HARD;3;HTTP OK: HTTP/1.1 200 OK - 359 bytes in 0.201 second response time
Host Up[06-23-2015 18:06:29] HOST ALERT: webserver;UP;SOFT;8;PING OK - Packet loss = 0%, RTA = 33.92 ms
Host Down[06-23-2015 18:05:25] HOST ALERT: webserver;DOWN;SOFT;7;CRITICAL - Time to live exceeded (1.2.)
Host Down[06-23-2015 18:04:19] HOST ALERT: webserver;DOWN;SOFT;6;PING CRITICAL - Packet loss = 100%
Service Critical[06-23-2015 18:03:53] SERVICE ALERT: webserver;PING;CRITICAL;HARD;3;PING CRITICAL - Packet loss = 100%
Host Down[06-23-2015 18:03:49] HOST ALERT: webserver;DOWN;SOFT;5;PING CRITICAL - Packet loss = 100%
Service Critical[06-23-2015 18:03:49] SERVICE ALERT: webserver;HTTPS;CRITICAL;HARD;3;CRITICAL - Socket timeout after 10 seconds
Host Down[06-23-2015 18:02:19] HOST ALERT: webserver;DOWN;SOFT;4;(Host check timed out after 30.01 seconds)
Service Critical[06-23-2015 18:01:53] SERVICE ALERT: webserver;PING;CRITICAL;SOFT;2;PING CRITICAL - Packet loss = 100%
Service Critical[06-23-2015 18:01:49] SERVICE ALERT: webserver;HTTPS;CRITICAL;SOFT;2;CRITICAL - Socket timeout after 10 seconds
Host Down[06-23-2015 18:01:48] HOST ALERT: webserver;DOWN;SOFT;3;(Host check timed out after 30.01 seconds)
Host Down[06-23-2015 18:00:18] HOST ALERT: webserver;DOWN;SOFT;2;PING CRITICAL - Packet loss = 100%
June 23, 2015 17:00
Service Critical[06-23-2015 17:59:53] SERVICE ALERT: webserver;PING;CRITICAL;SOFT;1;PING CRITICAL - Packet loss = 100%
Service Critical[06-23-2015 17:59:49] SERVICE ALERT: webserver;HTTPS;CRITICAL;SOFT;1;CRITICAL - Socket timeout after 10 seconds
Host Down[06-23-2015 17:58:48] HOST ALERT: webserver;DOWN;SOFT;1;(Host check timed out after 30.02 seconds)
Service Ok[06-23-2015 17:29:48] SERVICE ALERT: webserver;PING;OK;SOFT;2;PING OK - Packet loss = 0%, RTA = 34.72 ms
So at 17:29 everything was fine.
From 17:58 to 18:05 there was 100% packet loss and the HTTPS check hit socket timeouts.
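To make the state type (SOFT/HARD) and attempt number of each event easier to read, here is a minimal Python sketch that splits the alert lines above into fields. The names `Alert` and `parse_alert` are my own and not part of Nagios; the field order is simply what the log lines above show.

```python
import re
from typing import NamedTuple, Optional


class Alert(NamedTuple):
    timestamp: str
    kind: str               # "HOST" or "SERVICE"
    host: str
    service: Optional[str]  # None for host alerts
    state: str              # e.g. UP, DOWN, OK, CRITICAL
    state_type: str         # SOFT or HARD
    attempt: int
    output: str


# Matches "[timestamp] HOST ALERT: ..." or "[timestamp] SERVICE ALERT: ..."
_ALERT_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\] (?P<kind>HOST|SERVICE) ALERT: (?P<rest>.*)"
)


def parse_alert(line: str) -> Optional[Alert]:
    """Parse one alert history line; return None for non-alert lines."""
    m = _ALERT_RE.search(line)
    if m is None:
        return None
    fields = m.group("rest").split(";")
    if m.group("kind") == "SERVICE":
        # host;service;state;state_type;attempt;output
        if len(fields) < 6:
            return None
        host, service, state, state_type, attempt = fields[:5]
        output = ";".join(fields[5:])
    else:
        # host;state;state_type;attempt;output
        if len(fields) < 5:
            return None
        host, state, state_type, attempt = fields[:4]
        service = None
        output = ";".join(fields[4:])
    return Alert(m.group("ts"), m.group("kind"), host, service,
                 state, state_type, int(attempt), output)
```

Run over the history above, this shows that every host DOWN event is SOFT (attempts 1 through 7), while both services reach HARD at attempt 3.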
My question is: why didn't I receive a notification?
A few days ago, and again today, I received "warning" notifications just fine, but I never receive a "critical" notification.
Here is my contact.cfg:
define contact{
        contact_name                    nagiosadmin             ; Short name of user
        use                             generic-contact         ; Inherit default values from generic-contact template (defined above)
        alias                           Nagios Admin            ; Full name of user
        email                           user@localhost          ; <<***** CHANGE THIS TO YOUR EMAIL ADDRESS ******
        }
Here is my templates.cfg:
define contact{
        name                            generic-contact         ; The name of this contact template
        service_notification_period     24x7                    ; service notifications can be sent anytime
        host_notification_period        24x7                    ; host notifications can be sent anytime
        service_notification_options    w,u,c,r,f,s             ; send notifications for all service states, flapping events, and scheduled downtime events
        host_notification_options       d,u,r,f,s               ; send notifications for all host states, flapping events, and scheduled downtime events
        service_notification_commands   notify-service-by-email ; send service notifications via email
        host_notification_commands      notify-host-by-email    ; send host notifications via email
        register                        0                       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL CONTACT, JUST A TEMPLATE!
        }