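Setup
I've finally decided to ask for help with a problem that appeared a few weeks ago. I'm running a headless RPI 3B+ with Ubuntu Server 20.04, mainly hosting a Wireguard server and a few lightweight Docker containers (Homebridge, Pi-hole, Portainer, etc.).
My investigation
For a few weeks now (I have no exact date or action that triggered the problem), the Pi randomly becomes unreachable over the network until it is hard-rebooted. After further investigation, I can report the following:
- At no point is there any particular overload that could crash the Pi. Power levels are always fine; the power supply is brand new and definitely delivers enough power.
- When the crash occurs, the Pi is unreachable over the network but keeps running: the console is still visible on screen, the Docker containers are still running in the background according to syslog entries retrieved later, and the activity LED lights up occasionally.
- The on-screen console shows an error message (attached here)
- Syslog contains the following: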
Dec 15 18:07:51 rpi kernel: [47182.438053] ------------[ cut here ]------------
Dec 15 18:07:51 rpi kernel: [47182.438136] NETDEV WATCHDOG: eth0 (lan78xx): transmit queue 0 timed out
Dec 15 18:07:51 rpi kernel: [47182.438270] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:447 dev_watchdog+0x370/0x378
Dec 15 18:07:51 rpi kernel: [47182.438276] Modules linked in: xt_nat xt_tcpudp veth xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype br_netfilter bridge stp llc iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter ip6_tables iptable_filter bpfilter wireguard ip6_udp_tunnel udp_tunnel aufs overlay dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua btsdio bluetooth ecdh_generic ecc brcmfmac brcmutil cfg80211 bcm2835_codec(CE) bcm2835_isp(CE) bcm2835_v4l2(CE) v4l2_mem2mem bcm2835_mmal_vchiq(CE) videobuf2_vmalloc videobuf2_dma_contig snd_bcm2835(CE) videobuf2_memops videobuf2_v4l2 snd_pcm raspberrypi_hwmon videobuf2_common snd_timer videodev snd mc vc_sm_cma(CE) uio_pdrv_genirq uio sch_fq_codel drm ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_ce spidev phy_generic aes_neon_bs aes_neon_blk crypto_simd cryptd
Dec 15 18:07:51 rpi kernel: [47182.438514] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G C E 5.4.0-1025-raspi #28-Ubuntu
Dec 15 18:07:51 rpi kernel: [47182.438521] Hardware name: Raspberry Pi 3 Model B Plus Rev 1.3 (DT)
Dec 15 18:07:51 rpi kernel: [47182.438529] pstate: 60400005 (nZCv daif +PAN -UAO)
Dec 15 18:07:51 rpi kernel: [47182.438539] pc : dev_watchdog+0x370/0x378
Dec 15 18:07:51 rpi kernel: [47182.438547] lr : dev_watchdog+0x370/0x378
Dec 15 18:07:51 rpi kernel: [47182.438554] sp : ffff80001000bd80
Dec 15 18:07:51 rpi kernel: [47182.438559] x29: ffff80001000bd80 x28: ffff0000363a4380
Dec 15 18:07:51 rpi kernel: [47182.438569] x27: 00000000ffffffff x26: ffff00002bf0f680
Dec 15 18:07:51 rpi kernel: [47182.438579] x25: ffffd97df4309018 x24: ffff00002bf0f740
Dec 15 18:07:51 rpi kernel: [47182.438588] x23: ffff0000352cf45c x22: ffff0000352cf000
Dec 15 18:07:51 rpi kernel: [47182.438598] x21: ffff0000352cf480 x20: ffffd97df4607000
Dec 15 18:07:51 rpi kernel: [47182.438607] x19: 0000000000000000 x18: 0000000000000000
Dec 15 18:07:51 rpi kernel: [47182.438616] x17: 0000000000000000 x16: 0000000000000000
Dec 15 18:07:51 rpi kernel: [47182.438626] x15: ffff000035a322f0 x14: ffffffffffffffff
Dec 15 18:07:51 rpi kernel: [47182.438636] x13: 0000000000000000 x12: ffffd97df4742000
Dec 15 18:07:51 rpi kernel: [47182.438646] x11: ffffd97df462c000 x10: ffffd97df4742a80
Dec 15 18:07:51 rpi kernel: [47182.438655] x9 : 0000000000000000 x8 : 0000000000000004
Dec 15 18:07:51 rpi kernel: [47182.438663] x7 : 0000000000000000 x6 : 0000000000000000
Dec 15 18:07:51 rpi kernel: [47182.438672] x5 : 0000000000000000 x4 : 0000000000000002
Dec 15 18:07:51 rpi kernel: [47182.438681] x3 : ffffd97df3c15790 x2 : 0000000000000040
Dec 15 18:07:51 rpi kernel: [47182.438689] x1 : 0000000000000000 x0 : 0000000000000000
Dec 15 18:07:51 rpi kernel: [47182.438699] Call trace:
Dec 15 18:07:51 rpi kernel: [47182.438708] dev_watchdog+0x370/0x378
Dec 15 18:07:51 rpi kernel: [47182.438720] call_timer_fn+0x40/0x1e8
Dec 15 18:07:51 rpi kernel: [47182.438729] run_timer_softirq+0x1d4/0x590
Dec 15 18:07:51 rpi kernel: [47182.438738] __do_softirq+0x170/0x424
Dec 15 18:07:51 rpi kernel: [47182.438748] irq_exit+0xb4/0xe8
Dec 15 18:07:51 rpi kernel: [47182.438760] __handle_domain_irq+0x74/0xc8
Dec 15 18:07:51 rpi kernel: [47182.438768] bcm2836_arm_irqchip_handle_irq+0x78/0xf0
Dec 15 18:07:51 rpi kernel: [47182.438775] el1_irq+0x108/0x200
Dec 15 18:07:51 rpi kernel: [47182.438784] arch_cpu_idle+0x40/0x238
Dec 15 18:07:51 rpi kernel: [47182.438793] default_idle_call+0x28/0x6c
Dec 15 18:07:51 rpi kernel: [47182.438805] do_idle+0x214/0x2a0
Dec 15 18:07:51 rpi kernel: [47182.438813] cpu_startup_entry+0x2c/0x78
Dec 15 18:07:51 rpi kernel: [47182.438825] secondary_start_kernel+0x18c/0x1c8
Dec 15 18:07:51 rpi kernel: [47182.438833] ---[ end trace 8fa731254680f7cd ]---
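- A simple reboot restores full functionality
- The crash seems to happen roughly every day and a half (I don't yet know whether it always occurs at the exact same time)
- It has been a busy week, so I tried a temporary workaround: scheduling a software reboot every day at 4 AM in the hope of preventing the crash, but it did not work. A full power cycle seems to be required.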
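Whether the crash always occurs at the same time could be checked from the syslog itself. Below is a minimal Python sketch, assuming the log lines follow the format shown in the excerpt above. The function names and the `year` parameter are hypothetical (syslog timestamps carry no year, so it has to be supplied), and parsing the month name with `%b` assumes an English locale.

```python
import re
import sys
from datetime import datetime

# Matches lines such as:
# Dec 15 18:07:51 rpi kernel: [47182.438136] NETDEV WATCHDOG: eth0 (lan78xx): transmit queue 0 timed out
WATCHDOG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"\[\s*(?P<uptime>\d+\.\d+)\] NETDEV WATCHDOG: "
    r"(?P<iface>\S+) \((?P<driver>[^)]+)\)"
)


def find_watchdog_events(lines, year):
    """Return (timestamp, uptime_seconds, interface, driver) per watchdog line.

    `year` is an assumption supplied by the caller, because the syslog
    timestamp format does not include it.
    """
    events = []
    for line in lines:
        m = WATCHDOG_RE.match(line)
        if m is None:
            continue
        stamp = datetime.strptime(f"{year} {m['stamp']}", "%Y %b %d %H:%M:%S")
        events.append((stamp, float(m["uptime"]), m["iface"], m["driver"]))
    return events


def gaps_in_hours(events):
    """Hours elapsed between each pair of consecutive watchdog events."""
    stamps = [event[0] for event in events]
    return [(b - a).total_seconds() / 3600 for a, b in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    # Usage (hypothetical): python3 watchdog_gaps.py /var/log/syslog 2020
    path, year = sys.argv[1], int(sys.argv[2])
    with open(path, errors="replace") as handle:
        found = find_watchdog_events(handle, year)
    for stamp, uptime, iface, driver in found:
        print(f"{stamp}  uptime={uptime:.0f}s  {iface} ({driver})")
    print("gaps (hours):", [round(gap, 1) for gap in gaps_in_hours(found)])
```

Roughly constant gaps, or uptimes that are always similar when the watchdog fires, would hint at a time-dependent trigger; widely varying values would point elsewhere.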
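My understanding
As I understand it, the problem is related to eth0 and its linked modules, which would explain why the Pi cannot be reached remotely while its services keep running. Beyond that, I'm not sure what steps to take to fix it, and any help would be greatly appreciated. Let me know if I should attach more logs.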
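Since the Pi keeps running while the network is down, one way to gather more logs is a small local monitor that records the state of eth0 at the moment connectivity is lost. The following is only a sketch under stated assumptions: the gateway address `192.168.1.1`, the 60-second interval, the failure threshold, and all function names are hypothetical. It relies on the Linux `ping` options `-c` and `-W` and on the sysfs file `/sys/class/net/<iface>/operstate`. It only logs; it does not attempt to repair anything.

```python
import subprocess
import time
from datetime import datetime
from pathlib import Path


def ping_once(host, timeout_s=2):
    """Return True if a single ICMP echo to `host` succeeds (Linux `ping`)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def read_operstate(iface="eth0"):
    """Read the kernel's view of the link state, e.g. 'up' or 'down'."""
    path = Path("/sys/class/net") / iface / "operstate"
    try:
        return path.read_text().strip()
    except OSError:
        return "unknown"


def trailing_failures(results):
    """Count how many of the most recent checks failed in a row."""
    count = 0
    for ok in reversed(results):
        if ok:
            break
        count += 1
    return count


def should_escalate(results, threshold=3):
    """True once the last `threshold` checks have all failed."""
    return trailing_failures(results) >= threshold


if __name__ == "__main__":
    GATEWAY = "192.168.1.1"  # hypothetical: replace with the real gateway
    history = []
    while True:
        history.append(ping_once(GATEWAY))
        history = history[-10:]  # keep only the most recent checks
        if should_escalate(history):
            # Printed locally so it survives the network outage,
            # e.g. when stdout is redirected to a file.
            print(f"{datetime.now().isoformat()} gateway unreachable, "
                  f"eth0 operstate={read_operstate('eth0')}", flush=True)
        time.sleep(60)
```

Comparing the recorded `operstate` with the time of the NETDEV WATCHDOG message may show whether the link itself drops or whether only the transmit queue stalls.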
Thank you very much for reading my post, let's get this solved!
_cilusse
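Answer 1
Answer 2
The problem seems to have disappeared after an update! Glad it's working. If anyone runs into the same problem, keep your system updated over the next few weeks and it will most likely be resolved.
All the best,
...EDIT: the problem is back, and they are working on a fix.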