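Something strange happens on our virtual machines from time to time.
At certain moments, requests to local services fail. Even though consul and redis are running, I get timeout errors: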
consul members
Error connecting to Consul agent: dial tcp 127.0.0.1:8400: getsockopt: connection timed out
and
/usr/local/src/redis-3.0.7/src/redis-cli -p 6379
Could not connect to Redis at 127.0.0.1:6379: Connection timed out
But if I restart the firewall with apf -r, everything works again.
Before restarting apf, I saved the iptables rules to a file and found some strange rules in it:
Chain RESET (0 references)
pkts bytes target prot opt in out source destination
0 0 REJECT tcp -- * * 0.0.0.0/0 0.0.0.0/0 reject-with tcp-reset
Chain PROHIBIT (0 references)
pkts bytes target prot opt in out source destination
0 0 REJECT all -- * * 0.0.0.0/0 0.0.0.0/0 reject-with icmp-host-prohibited
I am 100% sure I did not add these rules. The only related thing I found in the logs:
Jul 15 06:25:01 journey-test python3[1090]: 2018/07/15 06:25:01.935420 INFO Successfully added Azure fabric firewall rules
Jul 15 06:25:01 journey-test python3[1090]: 2018/07/15 06:25:01.963119 INFO Firewall rules:
Jul 15 06:25:01 journey-test python3[1090]: Chain INPUT (policy ACCEPT 0 packets, 0 bytes)
Jul 15 06:25:01 journey-test python3[1090]: pkts bytes target prot opt in out source destination
Jul 15 06:25:01 journey-test python3[1090]: Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
Jul 15 06:25:01 journey-test python3[1090]: pkts bytes target prot opt in out source destination
Jul 15 06:25:01 journey-test python3[1090]: Chain OUTPUT (policy ACCEPT 0 packets, 0 bytes)
Jul 15 06:25:01 journey-test python3[1090]: pkts bytes target prot opt in out source destination
Jul 15 06:25:01 journey-test python3[1090]: 0 0 ACCEPT tcp -- * * 0.0.0.0/0 168.63.129.16 owner UID match 0
Jul 15 06:25:01 journey-test python3[1090]: 0 0 ACCEPT tcp -- * * 0.0.0.0/0 168.63.129.16 ctstate INVALID,NEW
Any suggestions? How can I find out who added this REJECT rule?
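One way to narrow this down is to compare a rule snapshot taken while everything works with one taken while connections time out. The following is only a minimal sketch under assumptions: it assumes both snapshots are in iptables-save format (not the iptables -L listing shown above), and the helper names normalize and diff_rules are hypothetical.

```python
import difflib
import re
from typing import List

# Packet/byte counters such as "[10:20]" at the end of chain declarations
# change constantly, so they are stripped before comparing.
_COUNTERS = re.compile(r"\s*\[\d+:\d+\]$")


def normalize(snapshot: str) -> List[str]:
    """Drop comments, blank lines and chain counters from an
    iptables-save style snapshot (format is an assumption)."""
    lines = []
    for raw in snapshot.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and "# Generated by ..." comments
        if line.startswith(":"):
            line = _COUNTERS.sub("", line)  # ":INPUT ACCEPT [1:2]" -> ":INPUT ACCEPT"
        lines.append(line)
    return lines


def diff_rules(before: str, after: str) -> List[str]:
    """Return lines added ('+' prefix) or removed ('-' prefix)
    between two snapshots."""
    diff = list(difflib.unified_diff(
        normalize(before), normalize(after),
        fromfile="before", tofile="after", lineterm="", n=0,
    ))
    # The first two entries are the "---"/"+++" file headers; "@@" hunk
    # markers are filtered out by the prefix check.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]
```

The two strings would come from files saved at different times, for example one written right after apf -r and one written while the timeouts occur. The diff shows what changed, not which process changed it, so it only helps to correlate the change with log timestamps.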
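Update: in /etc/waagent.conf I found OS.EnableFirewall=y. Our old VMs have no such rule, while all newly deployed VMs do. So Azure has changed the default behavior of this agent, and it can now modify firewall rules. I think this is the root of our problem.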
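To compare old and new VMs quickly, the setting can be read out of the agent configuration with a few lines of Python. This is a rough sketch, assuming the file uses plain Key=Value lines with # comments; the function name read_waagent_option is hypothetical, and the code makes no claim about what the agent does when the key is absent.

```python
from typing import Optional


def read_waagent_option(conf_text: str, key: str) -> Optional[str]:
    """Return the value of `key` from waagent.conf-style text, or None if
    the key is not set. If the key appears more than once, the last
    occurrence wins (an assumption of this sketch)."""
    result = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == key:
            result = value.strip()
    return result
```

On a VM it could be used as read_waagent_option(open("/etc/waagent.conf").read(), "OS.EnableFirewall"), which would return "y" for the configuration described above and None where the line is missing.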