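We run an openvpn network with about 1200 connected clients. Mostly everything works fine, but 0-4 times a day, at random times, there is a crash in which we lose most or all of the connections. A cron job checks once a minute how many clients are connected: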
echo "status 3" | /bin/nc 127.0.0.1 5001 -q 1 | /bin/grep CLIENT_LIST | /bin/grep 10.10. | /usr/bin/wc -l
The results for one day:
Mon Nov 23 23:24:02 EET 2015 1201
Mon Nov 23 23:25:02 EET 2015 312
Mon Nov 23 23:26:02 EET 2015 1201
Tue Nov 24 02:46:02 EET 2015 1196
Tue Nov 24 02:47:02 EET 2015 0
Tue Nov 24 02:48:02 EET 2015 1198
Tue Nov 24 05:45:02 EET 2015 1197
Tue Nov 24 05:46:02 EET 2015 324
Tue Nov 24 05:47:02 EET 2015 1196
Tue Nov 24 05:55:02 EET 2015 1199
Tue Nov 24 05:56:04 EET 2015 0
Tue Nov 24 05:57:02 EET 2015 35
Tue Nov 24 05:58:02 EET 2015 208
Tue Nov 24 05:59:02 EET 2015 369
Tue Nov 24 06:00:02 EET 2015 517
Tue Nov 24 06:01:02 EET 2015 636
Tue Nov 24 06:02:02 EET 2015 739
Tue Nov 24 06:03:02 EET 2015 845
Tue Nov 24 06:04:02 EET 2015 945
Tue Nov 24 06:05:02 EET 2015 1042
Tue Nov 24 06:06:02 EET 2015 1121
Tue Nov 24 06:07:02 EET 2015 1141
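For reference, the per-minute check that produces these lines could be sketched roughly as below. This is a sketch under assumptions, not our exact cron script: the function names `count_clients` and `log_count` are made up for illustration, and it assumes the `status 3` output is tab-separated with the client's virtual address in the fourth field of each CLIENT_LIST row. It uses awk instead of the two greps so that the dots in 10.10. are matched literally.

```shell
#!/bin/sh
# Sketch only: count_clients and log_count are hypothetical helper names.

# Read "status 3" output on stdin and print the number of CLIENT_LIST rows
# whose virtual address (assumed to be tab-separated field 4) starts with 10.10.
count_clients() {
    awk -F '\t' '$1 == "CLIENT_LIST" && index($4, "10.10.") == 1 { n++ } END { print n + 0 }'
}

# Print one line in the same shape as the results above: timestamp, then count.
log_count() {
    echo "$(date) $1"
}

# Intended cron usage (same management port and nc options as the one-liner above):
# log_count "$(echo "status 3" | /bin/nc 127.0.0.1 5001 -q 1 | count_clients)"
```

Note that, like the original one-liner, this prints 0 both when no clients are connected and when the management interface returns nothing at all.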
When a "blackout" happens, the log shows:
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 11.111.111.111:59208 TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity)
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 11.111.111.111:59208 TLS Error: TLS handshake failed
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 11.111.111.111:59208 SIGUSR1[soft,tls-error] received, client-instance restarting
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: read UDPv4 [ECONNREFUSED]: Connection refused (code=111)
Nov 24 05:56:02 ovpn-openvpn[12639]: last message repeated 3 times
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: MULTI: multi_create_instance called
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Re-using SSL/TLS context
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 LZO compression initialized
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Control Channel MTU parms [ L:1542 D:166 EF:66 EB:0 ET:0 EL:0 ]
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Data Channel MTU parms [ L:1542 D:1450 EF:42 EB:135 ET:0 EL:0 AF:3/1 ]
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Local Options String: 'V4,dev-type tun,link-mtu 1542,tun-mtu 1500,proto UDPv4,comp-lzo,keydir 0,cipher BF-CBC,auth SHA1,keysize 128,tls-auth,key-method 2,tls-server'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Expected Remote Options String: 'V4,dev-type tun,link-mtu 1542,tun-mtu 1500,proto UDPv4,comp-lzo,keydir 1,cipher BF-CBC,auth SHA1,keysize 128,tls-auth,key-method 2,tls-client'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Local Options hash (VER=V4): '14168603'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Expected Remote Options hash (VER=V4): '504e774e'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 TLS: Initial packet from [AF_INET]222.222.222.222:34914, sid=e60171c4 e7222269
Nov 24 05:56:02 xxxxxx1 kernel: [45070269.509652] IN=tun0 OUT=eth1 MAC= SRC=10.10.143.155 DST=10.1.1.11 LEN=40 TOS=0x00 PREC=0x00 TTL=63 ID=50374 DF PROTO=TCP SPT=502 DPT=54773 WINDOW=7300 RES=0x00 ACK RST URGP=0
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: read UDPv4 [ECONNREFUSED]: Connection refused (code=111)
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: MULTI: multi_create_instance called
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Re-using SSL/TLS context
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 LZO compression initialized
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Control Channel MTU parms [ L:1542 D:166 EF:66 EB:0 ET:0 EL:0 ]
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Data Channel MTU parms [ L:1542 D:1450 EF:42 EB:135 ET:0 EL:0 AF:3/1 ]
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Local Options String: 'V4,dev-type tun,link-mtu 1542,tun-mtu 1500,proto UDPv4,comp-lzo,keydir 0,cipher BF-CBC,auth SHA1,keysize 128,tls-auth,key-method 2,tls-server'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Expected Remote Options String: 'V4,dev-type tun,link-mtu 1542,tun-mtu 1500,proto UDPv4,comp-lzo,keydir 1,cipher BF-CBC,auth SHA1,keysize 128,tls-auth,key-method 2,tls-client'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Local Options hash (VER=V4): '14168603'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Expected Remote Options hash (VER=V4): '504e774e'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 TLS: Initial packet from [AF_INET]44.444.444.44:47624, sid=f2f58219 caddd889
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: read UDPv4 [ECONNREFUSED]: Connection refused (code=111)
Nov 24 05:56:03 xxxxxx1 ovpn-openvpn[12639]: read UDPv4 [ECONNREFUSED]: Connection refused (code=111)
Nov 24 05:56:03 xxxxxx1 ovpn-openvpn[12639]: 1000000713/333.333.33.333:58655 MULTI: Learn: 10.200.6.150 -> 1000000713/333.333.33.333:58655
Nov 24 05:56:03 xxxxxx1 ovpn-openvpn[12639]: MANAGEMENT: Client disconnected
Nov 24 05:56:04 xxxxxx1 ovpn-openvpn[19031]: Current Parameter Settings:
Nov 24 05:56:04 xxxxxx1 ovpn-openvpn[19031]: config = '/etc/openvpn/openvpn.conf'
Nov 24 05:56:04 xxxxxx1 ovpn-openvpn[19031]: mode = 1
.
.
.
(the log then spits out the contents of the server.conf file)
To be clear:
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:59208 TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity)
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:59208 TLS Error: TLS handshake failed
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:59208 SIGUSR1[soft,tls-error] received, client-instance restarting
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: read UDPv4 [ECONNREFUSED]: Connection refused (code=111)
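..are there because of poor clock synchronization relative to the certificate's valid-from time. (After a power failure, a client resets its clock to a default value that falls outside the certificate's validity range.) That is probably not the cause of the main problem?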
After the watchdog detects zero connections, the openvpn service is forcibly restarted:
/usr/bin/killall -9 openvpn
/bin/sh /etc/init.d/openvpn restart
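Put together, the watchdog logic amounts to roughly the following. It is a simplified sketch rather than the real script: the function names `run` and `watchdog` and the `DRY_RUN` variable are hypothetical and exist only so the decision can be exercised without actually killing anything.

```shell
#!/bin/sh
# Sketch only: run, watchdog and DRY_RUN are hypothetical names for illustration.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# $1 is the client count from the per-minute check.
# Only a count of exactly zero triggers the forced restart.
watchdog() {
    if [ "$1" -eq 0 ]; then
        run /usr/bin/killall -9 openvpn
        run /bin/sh /etc/init.d/openvpn restart
    fi
}
```

With DRY_RUN=1 set, `watchdog 0` prints the two commands instead of running them, and `watchdog 312` prints nothing.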
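We have been logging the network interfaces at 1-second resolution, and they show no failures.
We have been logging iptables, and nothing among the dropped packets points to the openvpn connections.
After openvpn is restarted, all clients reconnect normally and everything works fine until the next crash.
Our clients are mostly routers running dd-wrt or embedded Linux devices running openvpn.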
server.conf:
local xxx.xx.xxx.xxx
port 1194
;proto tcp
proto udp
;dev tap
dev tun
;dev-node MyTap
ca /etc/openvpn/easy-rsa/keys/ca.crt
cert /etc/openvpn/easy-rsa/keys/server.crt
key /etc/openvpn/easy-rsa/keys/server.key
dh /etc/openvpn/easy-rsa/keys/dh1024.pem
mode server
ifconfig 10.10.128.1 10.10.128.2
ifconfig-pool 10.10.128.4 10.10.255.255
route 10.10.128.0 255.255.128.0
route 10.200.0.0 255.255.0.0
push "route 10.200.0.0 255.255.0.0"
push "route 10.10.128.0 255.255.128.0"
push "route 10.1.1.0 255.255.255.0"
client-config-dir /etc/openvpn/ccd
keepalive 7 50
tls-auth /etc/openvpn/easy-rsa/keys/ta.key 0 # This file is secret
tls-server
comp-lzo no
verb 5
topology p2p
management localhost 5001
crl-verify /etc/openvpn/crl.pem
script-security 2
client-disconnect "/usr/bin/php /root/cron/connect_disconnect.php disconnect"
client-connect "/usr/bin/php /root/cron/connect_disconnect.php connect"
Do you have any suggestions on how to log, trace, or hunt down the possible cause of this behavior?
Thanks in advance
- Jani