What could be causing OpenVPN network outages?

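We run an OpenVPN network with about 1200 connected clients. Most of the time everything works well, but 0-4 times a day, at random, it crashes and we lose most or all of the connections. A cron job checks once a minute how many clients are connected: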

echo "status 3" | /bin/nc 127.0.0.1 5001 -q 1 | /bin/grep CLIENT_LIST | /bin/grep 10.10. | /usr/bin/wc -l

The results for one day:

Mon Nov 23 23:24:02 EET 2015     1201
Mon Nov 23 23:25:02 EET 2015     312
Mon Nov 23 23:26:02 EET 2015     1201


Tue Nov 24 02:46:02 EET 2015     1196
Tue Nov 24 02:47:02 EET 2015     0
Tue Nov 24 02:48:02 EET 2015     1198


Tue Nov 24 05:45:02 EET 2015     1197
Tue Nov 24 05:46:02 EET 2015     324
Tue Nov 24 05:47:02 EET 2015     1196


Tue Nov 24 05:55:02 EET 2015     1199
Tue Nov 24 05:56:04 EET 2015     0
Tue Nov 24 05:57:02 EET 2015     35
Tue Nov 24 05:58:02 EET 2015     208
Tue Nov 24 05:59:02 EET 2015     369
Tue Nov 24 06:00:02 EET 2015     517
Tue Nov 24 06:01:02 EET 2015     636
Tue Nov 24 06:02:02 EET 2015     739
Tue Nov 24 06:03:02 EET 2015     845
Tue Nov 24 06:04:02 EET 2015     945
Tue Nov 24 06:05:02 EET 2015     1042
Tue Nov 24 06:06:02 EET 2015     1121
Tue Nov 24 06:07:02 EET 2015     1141
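For picking the outage minutes out of a longer log in this format, a minimal sketch follows. It assumes each line ends with the client count as its last field, as in the excerpt above; the function name `find_drops` and the default threshold of 1000 are illustrative assumptions, not part of our setup.

```shell
#!/bin/sh
# find_drops: read lines such as
#   "Tue Nov 24 05:56:04 EET 2015     0"
# on stdin and print those whose last field (the client count)
# is below a threshold. The default of 1000 is an assumed value
# for a network of roughly 1200 clients.
find_drops() {
    threshold="${1:-1000}"
    # NF >= 2 skips blank lines; the regex skips lines without a numeric count.
    awk -v min="$threshold" 'NF >= 2 && $NF ~ /^[0-9]+$/ && $NF + 0 < min { print }'
}
```

It could be run as `find_drops 1000 < clients.log`, where `clients.log` is a hypothetical file holding the cron output.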

When a "blackout" occurs, the log shows:

Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 11.111.111.111:59208 TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity)
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 11.111.111.111:59208 TLS Error: TLS handshake failed
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 11.111.111.111:59208 SIGUSR1[soft,tls-error] received, client-instance restarting
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: read UDPv4 [ECONNREFUSED]: Connection refused (code=111)
Nov 24 05:56:02  ovpn-openvpn[12639]: last message repeated 3 times
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: MULTI: multi_create_instance called
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Re-using SSL/TLS context
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 LZO compression initialized
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Control Channel MTU parms [ L:1542 D:166 EF:66 EB:0 ET:0 EL:0 ]
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Data Channel MTU parms [ L:1542 D:1450 EF:42 EB:135 ET:0 EL:0 AF:3/1 ]
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Local Options String: 'V4,dev-type tun,link-mtu 1542,tun-mtu 1500,proto UDPv4,comp-lzo,keydir 0,cipher BF-CBC,auth SHA1,keysize 128,tls-auth,key-method 2,tls-server'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Expected Remote Options String: 'V4,dev-type tun,link-mtu 1542,tun-mtu 1500,proto UDPv4,comp-lzo,keydir 1,cipher BF-CBC,auth SHA1,keysize 128,tls-auth,key-method 2,tls-client'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Local Options hash (VER=V4): '14168603'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 Expected Remote Options hash (VER=V4): '504e774e'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 222.222.222.222:34914 TLS: Initial packet from [AF_INET]222.222.222.222:34914, sid=e60171c4 e7222269
Nov 24 05:56:02 xxxxxx1 kernel: [45070269.509652] IN=tun0 OUT=eth1 MAC= SRC=10.10.143.155 DST=10.1.1.11 LEN=40 TOS=0x00 PREC=0x00 TTL=63 ID=50374 DF PROTO=TCP SPT=502 DPT=54773 WINDOW=7300 RES=0x00 ACK RST URGP=0 
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: read UDPv4 [ECONNREFUSED]: Connection refused (code=111)
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: MULTI: multi_create_instance called
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Re-using SSL/TLS context
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 LZO compression initialized
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Control Channel MTU parms [ L:1542 D:166 EF:66 EB:0 ET:0 EL:0 ]
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Data Channel MTU parms [ L:1542 D:1450 EF:42 EB:135 ET:0 EL:0 AF:3/1 ]
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Local Options String: 'V4,dev-type tun,link-mtu 1542,tun-mtu 1500,proto UDPv4,comp-lzo,keydir 0,cipher BF-CBC,auth SHA1,keysize 128,tls-auth,key-method 2,tls-server'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Expected Remote Options String: 'V4,dev-type tun,link-mtu 1542,tun-mtu 1500,proto UDPv4,comp-lzo,keydir 1,cipher BF-CBC,auth SHA1,keysize 128,tls-auth,key-method 2,tls-client'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Local Options hash (VER=V4): '14168603'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 Expected Remote Options hash (VER=V4): '504e774e'
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:47624 TLS: Initial packet from [AF_INET]44.444.444.44:47624, sid=f2f58219 caddd889
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: read UDPv4 [ECONNREFUSED]: Connection refused (code=111)
Nov 24 05:56:03 xxxxxx1 ovpn-openvpn[12639]: read UDPv4 [ECONNREFUSED]: Connection refused (code=111)
Nov 24 05:56:03 xxxxxx1 ovpn-openvpn[12639]: 1000000713/333.333.33.333:58655 MULTI: Learn: 10.200.6.150 -> 1000000713/333.333.33.333:58655
Nov 24 05:56:03 xxxxxx1 ovpn-openvpn[12639]: MANAGEMENT: Client disconnected
Nov 24 05:56:04 xxxxxx1 ovpn-openvpn[19031]: Current Parameter Settings:
Nov 24 05:56:04 xxxxxx1 ovpn-openvpn[19031]:   config = '/etc/openvpn/openvpn.conf'
Nov 24 05:56:04 xxxxxx1 ovpn-openvpn[19031]:   mode = 1
.
.
.
spits out server.conf file

To be clear, entries such as:

Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:59208 TLS Error: TLS key negotiation failed to occur within 60 seconds (check your network connectivity)
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:59208 TLS Error: TLS handshake failed
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: 44.444.444.44:59208 SIGUSR1[soft,tls-error] received, client-instance restarting
Nov 24 05:56:02 xxxxxx1 ovpn-openvpn[12639]: read UDPv4 [ECONNREFUSED]: Connection refused (code=111)

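...are there because of poor clock synchronization relative to the certificates' valid-from time. (After a power failure, a client resets its clock to a default value that falls outside the certificate's validity period.) Presumably this is not the cause of the main problem?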
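To illustrate the clock issue, here is a minimal sketch of the validity check that fails on those clients. The function name and the example timestamps are assumptions made for illustration; all times are seconds since the Unix epoch.

```shell
#!/bin/sh
# clock_in_validity NOW NOT_BEFORE NOT_AFTER
# Succeeds (exit 0) when NOW lies inside the certificate's validity
# window, fails otherwise. All three arguments are epoch seconds.
clock_in_validity() {
    now="$1"; not_before="$2"; not_after="$3"
    [ "$now" -ge "$not_before" ] && [ "$now" -le "$not_after" ]
}

# Hypothetical certificate valid from 2015-01-01 to 2025-01-01 (UTC).
NOT_BEFORE=1420070400
NOT_AFTER=1735689600

# A client whose clock was reset to the epoch (assumed default value):
if clock_in_validity 0 "$NOT_BEFORE" "$NOT_AFTER"; then
    echo "reset clock: certificate valid"
else
    echo "reset clock: certificate not yet valid"
fi
```

The real window of a certificate can be read with `openssl x509 -noout -startdate -enddate -in <certificate>`; converting those dates to epoch seconds is left out here because it depends on the `date` implementation available on the device.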

  • When the watchdog detects zero connections, the openvpn service is forcibly restarted:

    /usr/bin/killall -9 openvpn

    /bin/sh /etc/init.d/openvpn restart

  • We have been logging the network interfaces at 1-second resolution, and they show no failures.

  • We have been logging iptables, and nothing among the dropped packets points to the openvpn connections.

  • After openvpn is restarted, all clients reconnect normally and everything works fine until the next crash.

  • Our clients are mostly routers running dd-wrt or embedded Linux devices with openvpn.

  • server.conf:

    local xxx.xx.xxx.xxx
    port 1194
    ;proto tcp
    proto udp
    ;dev tap
    dev tun
    ;dev-node MyTap
    ca /etc/openvpn/easy-rsa/keys/ca.crt
    cert /etc/openvpn/easy-rsa/keys/server.crt
    key /etc/openvpn/easy-rsa/keys/server.key
    dh /etc/openvpn/easy-rsa/keys/dh1024.pem
    mode server
    ifconfig 10.10.128.1 10.10.128.2
    ifconfig-pool 10.10.128.4 10.10.255.255
    route 10.10.128.0 255.255.128.0
    route 10.200.0.0 255.255.0.0
    push "route 10.200.0.0 255.255.0.0"
    push "route 10.10.128.0 255.255.128.0"
    push "route 10.1.1.0 255.255.255.0"
    client-config-dir /etc/openvpn/ccd
    keepalive 7 50
    tls-auth /etc/openvpn/easy-rsa/keys/ta.key 0 # This file is secret
    tls-server
    comp-lzo no
    verb 5
    topology p2p
    management localhost 5001
    crl-verify /etc/openvpn/crl.pem
    script-security 2
    client-disconnect "/usr/bin/php /root/cron/connect_disconnect.php disconnect"
    client-connect "/usr/bin/php /root/cron/connect_disconnect.php connect"
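
The watchdog step from the list above (restart when the count is zero) can be sketched roughly as follows. The function names and the `DRY_RUN` switch are assumptions added for illustration; only the two restart commands come from our actual setup. With `DRY_RUN` left at its default of 1 the script only prints what it would do.

```shell
#!/bin/sh
# Sketch of the watchdog decision. DRY_RUN and the function names
# are hypothetical; the two restart commands are the ones we use.

# Succeeds only when the reported client count is exactly zero.
# A missing or non-numeric count is treated as "do not restart".
should_restart() {
    [ "$1" -eq 0 ] 2>/dev/null
}

# Prints the restart commands in dry-run mode, otherwise runs them.
restart_openvpn() {
    for cmd in "/usr/bin/killall -9 openvpn" "/bin/sh /etc/init.d/openvpn restart"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "would run: $cmd"
        else
            $cmd
        fi
    done
}

# watchdog COUNT: restart the service if COUNT is zero.
watchdog() {
    if should_restart "$1"; then
        restart_openvpn
    fi
}
```

Calling `watchdog 0` triggers the restart path, while `watchdog 312` does nothing, which matches the partial drops in the log where the service was not restarted.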
    

Do you have any suggestions on how to log, trace, or hunt down the possible cause of this behaviour?

Thanks in advance,

Jani
