Debian server keeps losing its network connection

Background

I am working on an in-house server running Debian.

Output of uname -a:

Linux osteocalcine 5.10.0-13-amd64 #1 SMP Debian 5.10.106-1 (2022-03-17) x86_64 GNU/Linux

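The server (intended as a shared document repository, among other things) receives a fixed IP address from the local DHCP server, based on its MAC address.

Here is the relevant part of /etc/network/interfaces: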

auto lo
iface lo inet loopback


auto enp11s0f0
iface enp11s0f0 inet dhcp

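The server is strictly internal and cannot be reached from outside (that is, I cannot connect to it unless I am on the same network, so no SSH from home, only from the office).

The setup is a basic GUI install (I anticipate that my colleagues may want to use the machine to run various analyses in the future, and I believe they would be happier logging in remotely to a desktop than to a command line).

Problem

The server periodically drops off the network and does not reconnect on its own.

The last time this happened, I captured the output of dmesg: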

[Mon May  9 12:08:25 2022] audit: type=1400 audit(1652090911.500:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-senddoc" pid=585 comm="apparmor_parser"
[Mon May  9 12:08:25 2022] audit: type=1400 audit(1652090911.500:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="libreoffice-oopslash" pid=595 comm="apparmor_parser"
[Mon May  9 12:08:25 2022] pstore: Using crash dump compression: deflate
[Mon May  9 12:08:25 2022] pstore: Registered efi as persistent store backend
[Mon May  9 12:08:25 2022] bnx2 0000:0b:00.0: firmware: direct-loading firmware bnx2/bnx2-mips-09-6.2.1b.fw
[Mon May  9 12:08:25 2022] bnx2 0000:0b:00.0: firmware: direct-loading firmware bnx2/bnx2-rv2p-09-6.0.17.fw
[Mon May  9 12:08:25 2022] bnx2 0000:0b:00.0 enp11s0f0: using MSIX
[Mon May  9 12:08:27 2022] bnx2 0000:0b:00.0 enp11s0f0: NIC Copper Link is Up, 100 Mbps full duplex
[Mon May  9 12:08:27 2022] IPv6: ADDRCONF(NETDEV_CHANGE): enp11s0f0: link becomes ready
[Mon May  9 12:08:29 2022] bnx2 0000:0b:00.1 enp11s0f1: using MSIX
[Mon May  9 12:08:29 2022] bnx2 0000:15:00.0 ens2f0: using MSIX
[Mon May  9 12:08:29 2022] bnx2 0000:15:00.1 ens2f1: using MSIX
[Mon May  9 12:09:02 2022] kauditd_printk_skb: 10 callbacks suppressed
[Mon May  9 12:09:02 2022] audit: type=1400 audit(1652090942.775:22): apparmor="DENIED" operation="capable" profile="/usr/sbin/cupsd" pid=991 comm="cupsd" capability=12  capname="net_admin"
[Mon May  9 12:09:03 2022] audit: type=1400 audit(1652090943.315:23): apparmor="DENIED" operation="capable" profile="/usr/sbin/cups-browsed" pid=1081 comm="cups-browsed" capability=23  capname="sys_nice"
[Mon May  9 13:24:56 2022] perf: interrupt took too long (2519 > 2500), lowering kernel.perf_event_max_sample_rate to 79250
[Mon May  9 14:52:01 2022] perf: interrupt took too long (3161 > 3148), lowering kernel.perf_event_max_sample_rate to 63250
[Tue May 10 00:00:44 2022] audit: type=1400 audit(1652133644.687:24): apparmor="DENIED" operation="capable" profile="/usr/sbin/cupsd" pid=77551 comm="cupsd" capability=12  capname="net_admin"
[Tue May 10 00:00:44 2022] audit: type=1400 audit(1652133644.819:25): apparmor="DENIED" operation="capable" profile="/usr/sbin/cups-browsed" pid=77552 comm="cups-browsed" capability=23  capname="sys_nice"

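At that point I rebooted the server (which seems to be the only way to bring the network back). Here are the lines from the new dmesg that look relevant: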

[Tue May 10 14:00:01 2022] bnx2 0000:0b:00.0 eth0: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem 96000000, IRQ 24, node addr 5c:f3:fc:e4:6f:d8
[Tue May 10 14:00:01 2022] bnx2 0000:0b:00.1 eth1: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem 98000000, IRQ 36, node addr 5c:f3:fc:e4:6f:da
[Tue May 10 14:00:01 2022] i801_smbus 0000:00:1f.3: enabling device (0140 -> 0143)
[Tue May 10 14:00:01 2022] i801_smbus 0000:00:1f.3: SMBus using PCI interrupt
[Tue May 10 14:00:01 2022] bnx2 0000:15:00.0 eth2: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem 92000000, IRQ 28, node addr 00:10:18:fb:1f:20
[Tue May 10 14:00:01 2022] ACPI Warning: SystemIO range 0x00000000000005A8-0x00000000000005AF conflicts with OpRegion 0x00000000000005A8-0x00000000000005AF (\_SB.PCI0.LPC0.GPE0) (20200925/utaddress-204)
[Tue May 10 14:00:01 2022] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[Tue May 10 14:00:01 2022] ACPI Warning: SystemIO range 0x0000000000000430-0x000000000000043F conflicts with OpRegion 0x0000000000000439-0x0000000000000439 (\_SB.PCI0.RIL) (20200925/utaddress-204)
[Tue May 10 14:00:01 2022] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[Tue May 10 14:00:01 2022] ACPI Warning: SystemIO range 0x0000000000000400-0x000000000000042F conflicts with OpRegion 0x000000000000040E-0x000000000000040E (\_SB.PCI0.RIT) (20200925/utaddress-204)
[Tue May 10 14:00:01 2022] ACPI Warning: SystemIO range 0x0000000000000400-0x000000000000042F conflicts with OpRegion 0x000000000000040C-0x000000000000040C (\_SB.PCI0.RTY) (20200925/utaddress-204)
[Tue May 10 14:00:01 2022] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[Tue May 10 14:00:01 2022] lpc_ich: Resource conflict(s) found affecting gpio_ich
[Tue May 10 14:00:01 2022] bnx2 0000:15:00.1 eth3: Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express found at mem 94000000, IRQ 37, node addr 00:10:18:fb:1f:22
[Tue May 10 14:00:01 2022] bnx2 0000:0b:00.0 enp11s0f0: renamed from eth0
[Tue May 10 14:00:13 2022] bnx2 0000:0b:00.0: firmware: direct-loading firmware bnx2/bnx2-mips-09-6.2.1b.fw
[Tue May 10 14:00:13 2022] bnx2 0000:0b:00.0: firmware: direct-loading firmware bnx2/bnx2-rv2p-09-6.0.17.fw
[Tue May 10 14:00:13 2022] bnx2 0000:0b:00.0 enp11s0f0: using MSIX
[Tue May 10 14:00:14 2022] bnx2 0000:0b:00.0 enp11s0f0: NIC Copper Link is Up, 100 Mbps full duplex
[Tue May 10 14:00:14 2022] IPv6: ADDRCONF(NETDEV_CHANGE): enp11s0f0: link becomes ready
[Tue May 10 14:00:15 2022] bnx2 0000:0b:00.1 enp11s0f1: using MSIX
[Tue May 10 14:00:16 2022] bnx2 0000:15:00.0 ens2f0: using MSIX
[Tue May 10 14:00:16 2022] bnx2 0000:15:00.1 ens2f1: using MSIX

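The lines after that are from Wednesday, so I assume they are not relevant (please tell me if I am wrong).

I am not sure what to look for in dmesg (or elsewhere) to determine what is causing the connection to drop.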
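
One way to narrow down what to look for is to filter the kernel log for lines from the NIC driver and for link-state changes. The following is only a sketch: the pattern is an assumption built from the driver (bnx2) and interface (enp11s0f0) names that appear in the output above, and net_events is a made-up helper name.

```shell
#!/bin/sh
# Keep only kernel log lines that mention the NIC driver, the interface,
# or a link-state change. The pattern is an assumption; adjust it to the
# driver and interface actually in use.
net_events() {
    grep -E 'bnx2|enp11s0f0|Link is (Up|Down)|NETDEV'
}

# Usage (commented out so that sourcing this file has no side effects):
#   dmesg -T | net_events
#   journalctl -k -b -1 | net_events   # kernel log of the previous boot,
#                                      # only if the journal is persistent
```

If no "Link is Down" line shows up around the time of an outage, the physical link probably stayed up and the cause is more likely above the driver (DHCP lease, routing, or the SSH session itself), though that is an inference rather than a certainty.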

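However, I have noticed the following.

  • To fix the problem I have to physically reboot the server.
    • Restarting the network with "systemctl restart networking" has no effect. (How can I extract error/trace messages from it?)
  • The problem seems to appear when I am logged in over SSH and my local terminal "goes to sleep". Could that somehow cause a problem on the server?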
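
On the question of getting error or trace messages out of the restart: systemd keeps the exit status and log output of each unit, so they can be read back right after the attempt. A minimal sketch, assuming the network is handled by the ifupdown unit networking.service (as /etc/network/interfaces suggests) and that the commands are run as root; restart_and_report is a hypothetical helper name.

```shell
#!/bin/sh
# Restart a unit and print what systemd recorded about it, whether or not
# the restart succeeded. Defaults to networking.service (an assumption).
restart_and_report() {
    unit="${1:-networking.service}"
    if systemctl restart "$unit"; then
        echo "restart of $unit reported success"
    else
        echo "restart of $unit failed with status $?"
    fi
    # Current state of the unit and its most recent log lines from this boot.
    systemctl status "$unit" --no-pager
    journalctl -u "$unit" -b --no-pager -n 50
}

# Usage (as root):
#   restart_and_report
```

Because the machine is unreachable during an outage, this would have to be run from the local console, with the output saved to a file for later inspection.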

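Is there a daemon I should configure that keeps testing the connection and "brings it back up" if it drops? (Note, though, that the systemctl call does not seem to bring the network back, so this may be a moot point.)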
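
For illustration, a home-made watchdog of the kind described could look like the sketch below. It is not an existing daemon: IFACE, TARGET, MAX_FAILS and the function names are all placeholders, and TARGET is set to a documentation address that must be replaced with something that should always answer, such as the local gateway. Since restarting networking did not help here, bouncing the interface may not help either, but the syslog entry would at least record when each outage started.

```shell
#!/bin/sh
# Hypothetical watchdog. IFACE, TARGET and MAX_FAILS are placeholders.
IFACE="${IFACE:-enp11s0f0}"
TARGET="${TARGET:-192.0.2.1}"    # documentation address; use the real gateway
MAX_FAILS="${MAX_FAILS:-3}"

# Return 0 if TARGET answers a single ping within 5 seconds.
link_ok() {
    ping -c 1 -W 5 "$TARGET" >/dev/null 2>&1
}

# Log the event, then take the interface down and up again via ifupdown.
bounce_iface() {
    logger -t netwatch "no reply from $TARGET, bouncing $IFACE"
    ifdown --force "$IFACE"
    ifup "$IFACE"
}

# One watchdog step: takes the current failure count as $1 and prints the
# updated count, bouncing the interface once MAX_FAILS is reached.
watch_step() {
    fails="$1"
    if link_ok; then
        echo 0
    else
        fails=$((fails + 1))
        if [ "$fails" -ge "$MAX_FAILS" ]; then
            bounce_iface >/dev/null 2>&1
            fails=0
        fi
        echo "$fails"
    fi
}

# Main loop (commented out so that sourcing this file has no side effects):
#   fails=0
#   while sleep 60; do fails=$(watch_step "$fails"); done
```

Requiring several consecutive failures before acting avoids bouncing the interface over a single lost ping.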

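Note: As mentioned above, I installed a "desktop" on the server in case my colleagues want to log in. I realise that NetworkManager is installed on the system: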

$ apt list --installed |grep network

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

glib-networking-common/stable,now 2.66.0-2 all [installed,automatic]
glib-networking-services/stable,now 2.66.0-2 amd64 [installed,automatic]
glib-networking/stable,now 2.66.0-2 amd64 [installed,automatic]
libqt5network5/stable,now 5.15.2+dfsg-9 amd64 [installed,automatic]
network-manager-gnome/stable,now 1.20.0-3 amd64 [installed,automatic]
network-manager/stable,now 1.30.0-2 amd64 [installed,automatic]

Could this be causing the problem?
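
A quick way to check is to ask NetworkManager whether it considers the interface its own. On Debian, interfaces listed in /etc/network/interfaces are normally left "unmanaged" by NetworkManager unless its ifupdown plugin is configured otherwise, but that is worth verifying rather than assuming. The sketch below parses the terse output of nmcli; nm_state is a made-up helper name.

```shell
#!/bin/sh
# Read `nmcli -t -f DEVICE,STATE device` output on stdin and print the
# NetworkManager state of the interface named in $1.
nm_state() {
    awk -F: -v dev="$1" '$1 == dev { print $2 }'
}

# Usage (commented out so that sourcing this file has no side effects):
#   nmcli -t -f DEVICE,STATE device | nm_state enp11s0f0
```

A result of "unmanaged" would suggest NetworkManager is leaving the interface to ifupdown; any other state would mean two tools may be competing for it.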

If you need more details, please ask and I will post an update.

As always, thanks in advance for your help.

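Edit 1:

So I am now more inclined to think the problem lies with SSH. After I logged in today, the connection "froze". I had to log in from another terminal and kill the first connection. I mention this because it happened at the time I usually have lunch/coffee with my colleagues. What I need to do now is improve monitoring of the whole system... but what should I add, and what do I need to watch for?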
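
As one possible starting point for monitoring, the kernel's view of the link can be logged at regular intervals, so that a frozen SSH session can later be matched against whether the interface was actually down. This is only a sketch: log_link_state and the SYSFS_ROOT override are hypothetical names, and the log path in the usage comment is a placeholder.

```shell
#!/bin/sh
# Print one timestamped line with the operational state of an interface,
# read from sysfs. SYSFS_ROOT can be overridden to try the function out
# against a scratch directory.
log_link_state() {
    iface="$1"
    root="${SYSFS_ROOT:-/sys/class/net}"
    state=$(cat "$root/$iface/operstate" 2>/dev/null || echo missing)
    echo "$(date '+%F %T') $iface operstate=$state"
}

# Usage (commented out so that sourcing this file has no side effects):
#   while sleep 60; do log_link_state enp11s0f0; done >> /var/log/linkstate.log
```

If the log shows the interface as "up" throughout a freeze, the session itself is the more likely culprit; in that case the OpenSSH client options ServerAliveInterval and ServerAliveCountMax (for example ssh -o ServerAliveInterval=60) make an idle session send keepalives and give up cleanly instead of hanging.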
