How can I tell whether a network card is failing on Linux?

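I have a server that is used heavily for file downloads. Every few hours it stops responding to ssh, http, and ping requests. It works normally again after a reboot.

The provider's technicians suspect a network fault. How can I investigate and resolve this?

Below are the most recent entries from the dmesg log. The server has been rebooted twice in the last 24 hours.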

[    7.266682] ioatdma 0000:00:16.0: setting latency timer to 64
[    7.266726]   alloc irq_desc for 65 on node -1
[    7.266728]   alloc kstat_irqs on node -1
[    7.266731] alloc irq_2_iommu on node -1
[    7.266736] ioatdma 0000:00:16.0: irq 65 for MSI/MSI-X
[    7.266879] ioatdma 0000:00:16.1: enabling device (0000 -> 0002)
[    7.266882]   alloc irq_desc for 44 on node -1
[    7.266883]   alloc kstat_irqs on node -1
[    7.266886] alloc irq_2_iommu on node -1
[    7.266891] ioatdma 0000:00:16.1: PCI INT B -> GSI 44 (level, low) -> IRQ 44
[    7.266902] ioatdma 0000:00:16.1: setting latency timer to 64
[    7.266936]   alloc irq_desc for 66 on node -1
[    7.266938]   alloc kstat_irqs on node -1
[    7.266940] alloc irq_2_iommu on node -1
[    7.266944] ioatdma 0000:00:16.1: irq 66 for MSI/MSI-X
[    7.267097] ioatdma 0000:00:16.2: enabling device (0000 -> 0002)
[    7.267101]   alloc irq_desc for 45 on node -1
[    7.267103]   alloc kstat_irqs on node -1
[    7.267107] alloc irq_2_iommu on node -1
[    7.267113] ioatdma 0000:00:16.2: PCI INT C -> GSI 45 (level, low) -> IRQ 45
[    7.267126] ioatdma 0000:00:16.2: setting latency timer to 64
[    7.267162]   alloc irq_desc for 67 on node -1
[    7.267163]   alloc kstat_irqs on node -1
[    7.267165] alloc irq_2_iommu on node -1
[    7.267170] ioatdma 0000:00:16.2: irq 67 for MSI/MSI-X
[    7.267307] ioatdma 0000:00:16.3: enabling device (0000 -> 0002)
[    7.267312]   alloc irq_desc for 46 on node -1
[    7.267314]   alloc kstat_irqs on node -1
[    7.267317] alloc irq_2_iommu on node -1
[    7.267324] ioatdma 0000:00:16.3: PCI INT D -> GSI 46 (level, low) -> IRQ 46
[    7.267339] ioatdma 0000:00:16.3: setting latency timer to 64
[    7.267383]   alloc irq_desc for 68 on node -1
[    7.267386]   alloc kstat_irqs on node -1
[    7.267389] alloc irq_2_iommu on node -1
[    7.267395] ioatdma 0000:00:16.3: irq 68 for MSI/MSI-X
[    7.267527] ioatdma 0000:00:16.4: enabling device (0000 -> 0002)
[    7.267531] ioatdma 0000:00:16.4: PCI INT A -> GSI 43 (level, low) -> IRQ 43
[    7.267543] ioatdma 0000:00:16.4: setting latency timer to 64
[    7.267587]   alloc irq_desc for 69 on node -1
[    7.267589]   alloc kstat_irqs on node -1
[    7.267593] alloc irq_2_iommu on node -1
[    7.267599] ioatdma 0000:00:16.4: irq 69 for MSI/MSI-X
[    7.267743] ioatdma 0000:00:16.5: enabling device (0000 -> 0002)
[    7.267746] ioatdma 0000:00:16.5: PCI INT B -> GSI 44 (level, low) -> IRQ 44
[    7.267759] ioatdma 0000:00:16.5: setting latency timer to 64
[    7.267794]   alloc irq_desc for 70 on node -1
[    7.267796]   alloc kstat_irqs on node -1
[    7.267798] alloc irq_2_iommu on node -1
[    7.267803] ioatdma 0000:00:16.5: irq 70 for MSI/MSI-X
[    7.267950] ioatdma 0000:00:16.6: enabling device (0000 -> 0002)
[    7.267955] ioatdma 0000:00:16.6: PCI INT C -> GSI 45 (level, low) -> IRQ 45
[    7.267970] ioatdma 0000:00:16.6: setting latency timer to 64
[    7.268012]   alloc irq_desc for 71 on node -1
[    7.268013]   alloc kstat_irqs on node -1
[    7.268016] alloc irq_2_iommu on node -1
[    7.268021] ioatdma 0000:00:16.6: irq 71 for MSI/MSI-X
[    7.268152] ioatdma 0000:00:16.7: enabling device (0000 -> 0002)
[    7.268157] ioatdma 0000:00:16.7: PCI INT D -> GSI 46 (level, low) -> IRQ 46
[    7.268173] ioatdma 0000:00:16.7: setting latency timer to 64
[    7.268217]   alloc irq_desc for 72 on node -1
[    7.268219]   alloc kstat_irqs on node -1
[    7.268222] alloc irq_2_iommu on node -1
[    7.268228] ioatdma 0000:00:16.7: irq 72 for MSI/MSI-X
[    7.273295] i801_smbus 0000:00:1f.3: PCI INT C -> GSI 18 (level, low) -> IRQ 18
[    7.277431] Monitor-Mwait will be used to enter C-1 state
[    7.277533] Monitor-Mwait will be used to enter C-2 state
[    7.278051] Monitor-Mwait will be used to enter C-3 state
[    7.278131] processor LNXCPU:00: registered as cooling_device0
[    7.278197] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input7
[    7.278226] ACPI: Power Button [PWRF]
[    7.278892] processor LNXCPU:01: registered as cooling_device1
[    7.279463] processor LNXCPU:02: registered as cooling_device2
[    7.280028] processor LNXCPU:03: registered as cooling_device3
[    7.280564] processor LNXCPU:04: registered as cooling_device4
[    7.283535] processor LNXCPU:05: registered as cooling_device5
[    7.284159] processor LNXCPU:06: registered as cooling_device6
[    7.284768] processor LNXCPU:07: registered as cooling_device7
[    7.285364] processor LNXCPU:08: registered as cooling_device8
[    7.285879] processor LNXCPU:09: registered as cooling_device9
[    7.286595] processor LNXCPU:0a: registered as cooling_device10
[    7.287125] processor LNXCPU:0b: registered as cooling_device11
[    7.287720] processor LNXCPU:0c: registered as cooling_device12
[    7.288295] processor LNXCPU:0d: registered as cooling_device13
[    7.288825] processor LNXCPU:0e: registered as cooling_device14
[    7.289485] processor LNXCPU:0f: registered as cooling_device15
[    7.290069] processor LNXCPU:10: registered as cooling_device16
[    7.290675] processor LNXCPU:11: registered as cooling_device17
[    7.296242] Error: Driver 'pcspkr' is already registered, aborting...
[    7.299964] processor LNXCPU:12: registered as cooling_device18
[    7.300702] processor LNXCPU:13: registered as cooling_device19
[    7.301409] processor LNXCPU:14: registered as cooling_device20
[    7.302091] processor LNXCPU:15: registered as cooling_device21
[    7.302741] processor LNXCPU:16: registered as cooling_device22
[    7.303410] processor LNXCPU:17: registered as cooling_device23
[    7.447430] Adding 8787960k swap on /dev/md1.  Priority:-1 extents:1 across:8787960k 
[    7.502237] loop: module loaded
[    7.660050] EXT4-fs (sdd1): mounted filesystem with ordered data mode
[    7.668827] EXT4-fs (sda3): mounted filesystem with ordered data mode
[    7.669375] EXT4-fs (sdc): Unrecognized mount option "0" or missing value
[    7.824669] ADDRCONF(NETDEV_UP): eth0: link is not ready

Answer 1

It may be worth using the ethtool statistics to check the network device for NIC and driver errors:

ethtool -S "ethX"

Just replace ethX with the name of your NIC (for example, eth0).
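The statistics list can be long, so it may help to show only the counters that look like trouble. The following is a minimal sketch, not a definitive check: the helper name nic_error_counters and the list of keywords (err, drop, fail, miss, collision, timeout) are assumptions, and counter names vary from driver to driver, so adjust the pattern to what your NIC actually reports.

```shell
# Hypothetical helper: reads `ethtool -S` output on stdin and prints only the
# counters whose name suggests a problem and whose value is greater than zero.
nic_error_counters() {
    awk -F': *' '
        tolower($1) ~ /err|drop|fail|miss|collision|timeout/ && $2 + 0 > 0 {
            gsub(/^[ \t]+/, "", $1)   # strip the leading indentation
            print $1 ": " $2
        }'
}

# Usage (eth0 is a placeholder interface name):
#   ethtool -S eth0 | nic_error_counters
```

Running it twice a few minutes apart and comparing the values shows whether a counter is still increasing, which is usually more telling than a single non-zero reading.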

You can also test the network adapter with the -t option, although this may interrupt connectivity.
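As a rough sketch of how the self-test might be run: ethtool -t accepts an online or offline mode, where offline runs the full test set and may interrupt traffic, and online runs a limited set that should not. The wrapper name nic_selftest and the ETHTOOL_BIN variable are hypothetical, introduced here only for illustration. The wrapper defaults to online as the safer choice on a remote server, whereas ethtool itself defaults to offline when no mode is given. The test normally needs root privileges, and not every driver supports it.

```shell
# Hypothetical wrapper around `ethtool -t`.
# Usage: nic_selftest IFACE [online|offline]
# ETHTOOL_BIN is an assumed override, e.g. ETHTOOL_BIN=echo for a dry run.
nic_selftest() {
    iface=$1
    mode=${2:-online}          # default to the non-disruptive test set
    case $mode in
        online|offline) ;;
        *) echo "mode must be online or offline" >&2; return 2 ;;
    esac
    "${ETHTOOL_BIN:-ethtool}" -t "$iface" "$mode"
}

# Usage (eth0 is a placeholder; run as root, ideally from a console session):
#   nic_selftest eth0            # limited test, link should stay up
#   nic_selftest eth0 offline    # full test, may drop the connection
```

If the server is only reachable over the interface being tested, the offline mode is best run from a local or out-of-band console.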

Sorry, this should have been a comment, but I am not allowed to comment yet.
