Losing connectivity to Dell blade servers and other machines under Linux

2024-5-27

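We have some Dell blades and chassis (the blades are M600s, the chassis are M1000s) plus other systems (an R710 with an MD3000 array). The R710 exports the source tree over NFS so the blades can build and test against it.

The problem is that the blades lose their NFS mounts. Blades in the same chassis, apparently configured identically, have their connections hang and cannot even ping the server. They eventually recover.

Admittedly this is mostly a Dell issue. We have one cable running from the R710 to the switch in one of the chassis, and another cable running to a separate switch and from there to the chassis. Either path can show the problem.

We are running CentOS 5 or Fedora Core release 5 (Bordeaux). The NFS server is running CentOS release 5.4 (Final).

Any ideas? Troubleshooting tips?

These pings all go to the same host, but over different routes:

Through the switch: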

[root@b053 ~]# ping svnwatch-data
PING storage.rack1.rinera.int (10.1.1.54) 56(84) bytes of data.

--- storage.rack1.rinera.int ping statistics ---
9 packets transmitted, 0 received, 100% packet loss, time 7999ms

Routed through another host:

[root@b053 ~]# ping svnwatch-data2
PING storage2.rack1.rinera.int (172.16.100.25) 56(84) bytes of data.
64 bytes from 172.16.100.25: icmp_seq=1 ttl=64 time=0.260 ms
64 bytes from 172.16.100.25: icmp_seq=2 ttl=64 time=0.217 ms
64 bytes from 172.16.100.25: icmp_seq=3 ttl=64 time=0.201 ms
64 bytes from 172.16.100.25: icmp_seq=4 ttl=64 time=0.264 ms

--- storage2.rack1.rinera.int ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 2999ms
rtt min/avg/max/mdev = 0.201/0.235/0.264/0.031 ms

With the host connected to the switch in a different chassis (the chassis switches are daisy-chained):

[root@b053 ~]# ping svnwatch-data-eth2
PING svnwatch-data-eth2.rack1.rinera.int (10.1.1.56) 56(84) bytes of data.
64 bytes from 10.1.1.56: icmp_seq=1 ttl=64 time=0.598 ms
64 bytes from 10.1.1.56: icmp_seq=2 ttl=64 time=0.096 ms
64 bytes from 10.1.1.56: icmp_seq=3 ttl=64 time=0.168 ms

--- svnwatch-data-eth2.rack1.rinera.int ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.096/0.287/0.598/0.222 ms
[root@b053 ~]#
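
Since the outages come and go, it may help to log all three paths with timestamps so a hang can be matched against switch or server logs. The following is only a rough sketch: the hostnames are the ones from the pings above, while the function names `loss_pct` and `probe`, the 30-second interval, and the output format are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: log packet loss per path so hangs can be correlated with timestamps.
# Hostnames come from the question; interval and log format are assumptions.

# Read ping output on stdin and print only the packet-loss percentage.
loss_pct() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Ping one host three times and print "timestamp host loss=N".
probe() {
    host=$1
    loss=$(ping -c 3 -w 5 "$host" 2>/dev/null | loss_pct)
    echo "$(date '+%Y-%m-%d %H:%M:%S') $host loss=${loss:-unknown}"
}

# Example loop, left commented out so the functions can be sourced safely:
# while true; do
#     for h in svnwatch-data svnwatch-data2 svnwatch-data-eth2; do
#         probe "$h"
#     done
#     sleep 30
# done
```

Running the loop on an affected blade and on a healthy one at the same time would show whether the 10.1.1.54 path drops for every blade at once or only for some of them.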

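Answer 1

Here is what I would check.

Routing table: ip route show
Route cache: ip route show cache
Check for any odd iptables rules: iptables -t nat -L -n -v; iptables -L -n -v; iptables -t mangle -L -n -v
Check the log files.
Check the kernel version.
Check sysctl/proc settings such as rp_filter, which matters in routed and multi-interface setups.
Check the ARP table for IP conflicts and the like.
And of course: tcpdump and tcpflow...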
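
The checks above could be gathered in one pass while a blade is actually hung, so the state is captured before it recovers. This is a minimal sketch, not a definitive script: `run_check` and `collect` are hypothetical helper names, 10.1.1.54 is the unreachable address from the question, and the output path and the interface name `eth0` in the comments are assumptions.

```shell
#!/bin/sh
# Sketch: capture the checklist output in one pass while a blade is hung.
# Helper names, default IP, and output path are assumptions; adjust as needed.

# Print a header, then run the command if its binary is installed.
# Always returns 0 so one missing tool does not stop the collection.
run_check() {
    title=$1; shift
    echo "=== $title ==="
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1
    else
        echo "(skipped: $1 not found)"
    fi
    return 0
}

collect() {
    server_ip=${1:-10.1.1.54}   # unreachable address from the question
    run_check "routing table"      ip route show
    run_check "route cache"        ip route show cache
    run_check "route to server"    ip route get "$server_ip"
    run_check "filter rules"       iptables -L -n -v
    run_check "nat rules"          iptables -t nat -L -n -v
    run_check "mangle rules"       iptables -t mangle -L -n -v
    run_check "kernel version"     uname -r
    run_check "rp_filter"          sysctl net.ipv4.conf.all.rp_filter net.ipv4.conf.default.rp_filter
    run_check "ARP/neighbor table" ip neigh show
    run_check "recent kernel log"  sh -c 'dmesg | tail -n 50'
}

# Usage, as root on an affected blade:
#   collect 10.1.1.54 > /tmp/netdiag.$(date +%s).txt
# A packet capture is better run separately, for example (eth0 is assumed):
#   tcpdump -n -i eth0 host 10.1.1.54 or arp
```

Comparing the output from a hung blade with the output from a working blade in the same chassis should narrow down whether the difference is in routing, filtering, or ARP.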
