We run an OpenVPN network with approximately 1200 client connections. Basically, everything works well, but there are 0-4 collapses per day (occurring at random) in which we lose most of the connections. A cron job checks once per minute how many client connections exist:
echo "status 3" | /bin/nc 127.0.0.1 5001 -q 1 | /bin/grep CLIENT_LIST | /bin/grep 10.10. | /usr/bin/wc -l
One day's results:
Mon Nov 23 23:24:02 EET 2015 1201
Mon Nov 23 23:25:02 EET 2015 312
Mon Nov 23 23:26:02 EET 2015 1201
Tue Nov 24 02:46:02 EET 2015 1196
Tue Nov 24 02:47:02 EET 2015 0
Tue Nov 24 02:48:02 EET 2015 1198
Tue Nov 24 05:45:02 EET 2015 1197
Tue Nov 24 05:46:02 EET 2015 324
Tue Nov 24 05:47:02 EET 2015 1196
Tue Nov 24 05:55:02 EET 2015 1199
Tue Nov 24 05:56:04 EET 2015 0
Tue Nov 24 05:57:02 EET 2015 35
Tue Nov 24 05:58:02 EET 2015 208
Tue Nov 24 05:59:02 EET 2015 369
Tue Nov 24 06:00:02 EET 2015 517
Tue Nov 24 06:01:02 EET 2015 636
Tue Nov 24 06:02:02 EET 2015 739
Tue Nov 24 06:03:02 EET 2015 845
Tue Nov 24 06:04:02 EET 2015 945
Tue Nov 24 06:05:02 EET 2015 1042
Tue Nov 24 06:06:02 EET 2015 1121
Tue Nov 24 06:07:02 EET 2015 1141
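For reference, here is a minimal sketch of a stricter version of the counting step. In the grep pattern `10.10.` the dots match any character, so it can also match other fields of a status line. The sketch assumes that `status 3` output is tab-separated with the virtual address in the fourth field of each CLIENT_LIST row; `count_clients` is a hypothetical helper name, not part of the original setup.

```shell
#!/bin/sh
# Hypothetical helper: reads "status 3" output on stdin and prints the number
# of CLIENT_LIST rows whose virtual address (assumed to be field 4) starts
# with the literal prefix "10.10.".
count_clients() {
    awk -F '\t' '
        $1 == "CLIENT_LIST" && index($4, "10.10.") == 1 { n++ }
        END { print n + 0 }   # n + 0 prints 0 when no row matched
    '
}

# Possible cron usage, producing lines in the same "date count" format as above:
#   echo "$(date) $(echo 'status 3' | /bin/nc 127.0.0.1 5001 -q 1 | count_clients)"
```

Matching on the first field also skips the HEADER row, which contains the word CLIENT_LIST but does not start with it.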
When a "blackout" occurs, the log says:
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 11.111.111.111:59208 TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity)
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 11.111.111.111:59208 TLS Error: TLS handshake failed
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 11.111.111.111:59208 SIGUSR1[soft,tls-error] received, client-instance restarting
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: read UDPv4 [ECONNREFUSED]: Connection refused (code=111)
Nov 24 05:56:02 ovpn-openvpn[12639]: last message repeated 3 times
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: MULTI: multi_create_instance called
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Re-using SSL/TLS context
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 LZO compression initialized
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Control Channel MTU parms [ L:1542 D:166 EF:66 EB:0 ET:0 EL:0 ]
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Data Channel MTU parms [ L:1542 D:1450 EF:42 EB:135 ET:0 EL:0 AF:3/1 ]
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Local Options String: 'V4,dev-type tun,link-mtu 1542,tun-mtu 1500,proto UDPv4,comp-lzo,keydir 0,cipher BF-CBC,auth SHA1,keysize 128,tls-auth,key-method 2,tls-server'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Expected Remote Options String: 'V4,dev-type tun,link-mtu 1542,tun-mtu 1500,proto UDPv4,comp-lzo,keydir 1,cipher BF-CBC,auth SHA1,keysize 128,tls-auth,key-method 2,tls-client'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Local Options hash (VER=V4): '14168603'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Expected Remote Options hash (VER=V4): '504e774e'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 TLS: Initial packet from [AF_INET]222.222.222.222:34914, sid=e60171c4 e7222269
Nov 24 05:56:02 xxxxxx1 kernel: [45070269.509652] IN=tun0 OUT=eth1 MAC= SRC=10.10.143.155 DST=10.1.1.11 LEN=40 TOS=0x00 PREC=0x00 TTL=63 ID=50374 DF PROTO=TCP SPT=502 DPT=54773 WINDOW=7300 RES=0x00 ACK RST URGP=0
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: read UDPv4 [ECONNREFUSED]: Connection refused (code=111)
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: MULTI: multi_create_instance called
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Re-using SSL/TLS context
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 LZO compression initialized
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Control Channel MTU parms [ L:1542 D:166 EF:66 EB:0 ET:0 EL:0 ]
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Data Channel MTU parms [ L:1542 D:1450 EF:42 EB:135 ET:0 EL:0 AF:3/1 ]
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Local Options String: 'V4,dev-type tun,link-mtu 1542,tun-mtu 1500,proto UDPv4,comp-lzo,keydir 0,cipher BF-CBC,auth SHA1,keysize 128,tls-auth,key-method 2,tls-server'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Expected Remote Options String: 'V4,dev-type tun,link-mtu 1542,tun-mtu 1500,proto UDPv4,comp-lzo,keydir 1,cipher BF-CBC,auth SHA1,keysize 128,tls-auth,key-method 2,tls-client'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Local Options hash (VER=V4): '14168603'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Expected Remote Options hash (VER=V4): '504e774e'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 TLS: Initial packet from [AF_INET]44.444.444.44:47624, sid=f2f58219 caddd889
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: read UDPv4 [ECONNREFUSED]: Connection refused (code=111)
Nov 24 05:56:03 xxxxxx1 ovpn-openvpn[12639]: read UDPv4 [ECONNREFUSED]: Connection refused (code=111)
Nov 24 05:56:03 xxxxxx1 ovpn-openvpn[12639]: 1000000713/333.333.33.333:58655 MULTI: Learn: 10.200.6.150 -> 1000000713/333.333.33.333:58655
Nov 24 05:56:03 xxxxxx1 ovpn-openvpn[12639]: MANAGEMENT: Client disconnected
Nov 24 05:56:04 xxxxxx1 ovpn-openvpn[19031]: Current Parameter Settings:
Nov 24 05:56:04 xxxxxx1 ovpn-openvpn[19031]: config = '/etc/openvpn/openvpn.conf'
Nov 24 05:56:04 xxxxxx1 ovpn-openvpn[19031]: mode = 1
.
.
.
(the log then dumps the full server.conf settings)
To be clear:
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:59208 TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity)
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:59208 TLS Error: TLS handshake failed
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:59208 SIGUSR1[soft,tls-error] received, client-instance restarting
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: read UDPv4 [ECONNREFUSED]: Connection refused (code=111)
.. appears because client clocks are out of sync with the certificates' valid start times. (After a power failure, clients reset their clocks to default values that fall outside the certificates' validity period.) Presumably this is not the cause of the main problem?
After the watchdog detects zero connections, the openvpn service is forcibly restarted:
/usr/bin/killall -9 openvpn
/bin/sh /etc/init.d/openvpn restart
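In the counts above, the 02:47 reading is 0 and the next reading one minute later is back at 1198, whereas recovery after the 05:56 restart took more than ten minutes. A single zero reading may therefore not mean that all clients are really gone. As a sketch only, the watchdog could require several consecutive zero readings before restarting. `STATE_FILE`, `THRESHOLD` and `record_reading` are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Sketch only: these names and defaults are assumptions, not the original setup.
STATE_FILE="${STATE_FILE:-/tmp/ovpn_zero_count}"
THRESHOLD="${THRESHOLD:-3}"

# record_reading COUNT
# Keeps a running total of consecutive zero readings in STATE_FILE.
# Returns 0 (true) only once THRESHOLD consecutive zeros have been seen.
record_reading() {
    zeros=0
    [ -f "$STATE_FILE" ] && zeros=$(cat "$STATE_FILE")
    zeros=${zeros:-0}
    if [ "$1" -eq 0 ]; then
        zeros=$((zeros + 1))
    else
        zeros=0            # any non-zero reading resets the streak
    fi
    echo "$zeros" > "$STATE_FILE"
    [ "$zeros" -ge "$THRESHOLD" ]
}

# Possible usage in the watchdog, where "$count" is the per-minute reading:
#   if record_reading "$count"; then /bin/sh /etc/init.d/openvpn restart; fi
```

With a one-minute cron interval and a threshold of 3, a restart would be triggered only after roughly three minutes of zero readings.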
We log the network interface at 1-second resolution and see no failures there.
We log iptables, and nothing among the dropped packets points to the openvpn connections.
After openvpn restarts, all clients reconnect normally and everything works fine until the next collapse.
Our clients are mostly routers running dd-wrt or embedded Linux with openvpn.
Server.conf:
local xxx.xx.xxx.xxx
port 1194
;proto tcp
proto udp
;dev tap
dev tun
;dev-node MyTap
ca /etc/openvpn/easy-rsa/keys/ca.crt
cert /etc/openvpn/easy-rsa/keys/server.crt
key /etc/openvpn/easy-rsa/keys/server.key
dh /etc/openvpn/easy-rsa/keys/dh1024.pem
mode server
ifconfig 10.10.128.1 10.10.128.2
ifconfig-pool 10.10.128.4 10.10.255.255
route 10.10.128.0 255.255.128.0
route 10.200.0.0 255.255.0.0
push "route 10.200.0.0 255.255.0.0"
push "route 10.10.128.0 255.255.128.0"
push "route 10.1.1.0 255.255.255.0"
client-config-dir /etc/openvpn/ccd
keepalive 7 50
tls-auth /etc/openvpn/easy-rsa/keys/ta.key 0 # This file is secret
tls-server
comp-lzo no
verb 5
topology p2p
management localhost 5001
crl-verify /etc/openvpn/crl.pem
script-security 2
client-disconnect "/usr/bin/php /root/cron/connect_disconnect.php disconnect"
client-connect "/usr/bin/php /root/cron/connect_disconnect.php connect"
Do you have any suggestions on how to log, trace, or search for possible causes of this kind of behavior?
Thanks in advance.
Tags networking openvpn iptables linux