System hangs every week or two

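I am running Ubuntu 20.04, and my system hangs about once every two weeks. It crashed today, and I saved the entire /var/log directory.

What information should I provide to help troubleshoot this crash?

Is there a summary of the kinds of problems that cause a Linux system to hang?

I have not touched the BIOS.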

sudo snap list:
anna_user2@anna-XPS-8930:~$ sudo snap list
Name                  Version                     Rev    Tracking         Publisher     Notes
atom                  1.57.0                      282    latest/stable    snapcrafters  classic
bare                  1.0                         5      latest/stable    canonical✓    base
canonical-livepatch   10.1.2                      126    latest/stable    canonical✓    -
coq-prover            2021-09-0                   27     latest/stable    coq-team      -
core                  16-2.54.2                   12603  latest/stable    canonical✓    core
core18                20211215                    2284   latest/stable    canonical✓    base
core20                20220114                    1328   latest/stable    canonical✓    base
gnome-3-28-1804       3.28.0-19-g98f9e67.98f9e67  161    latest/stable    canonical✓    -
gnome-3-34-1804       0+git.3556cb3               77     latest/stable/…  canonical✓    -
gnome-3-38-2004       0+git.1f9014a               99     latest/stable    canonical✓    -
gtk-common-themes     0.1-59-g7bca6ae             1519   latest/stable/…  canonical✓    -
jq                    1.5+dfsg-1                  6      latest/stable    mvo           -
postman               7.36.5                      133    latest/stable    postman-inc✓  -
ruby                  3.1.0                       247    latest/stable    rubylang✓     classic
simplescreenrecorder  0.1                         1      latest/stable    xiaoguo       -
smplayer              21.10.0                     43     latest/stable    rvm           -
snap-store            3.38.0-66-gbd5b8f7          558    latest/stable/…  canonical✓    -
snapd                 2.54.2                      14549  latest/stable    canonical✓    snapd
anna_user2@anna-XPS-8930:~$ 

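Answer 1

The problem you are seeing can have several causes. Could you share your logs, /var/log/syslog or /var/log/kern.log, and check with dmesg -T whether the machine reported any errors while it was running?

Another cause may be your hardware. A hardware fault can lead to system crashes, sudden hangs, and similar problems.

The more log output you can provide, the easier it is for others to pinpoint the problem.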
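Since the whole /var/log directory was saved, one way to narrow down what to share is to scan the saved kern.log or syslog for lines that often accompany a hang. The following is a minimal sketch, not a definitive checklist: the function name scan_saved_log and the pattern list are assumptions made for illustration, and a clean result does not rule out a problem.

```shell
# Hypothetical helper: print lines (with line numbers) from a saved log
# that match messages commonly seen around hangs and crashes.
# The pattern list is an assumption and is far from exhaustive.
scan_saved_log() {
    # $1: path to a saved log file, e.g. a copy of /var/log/kern.log
    grep -n -i -E 'gpu hang|hardware error|kernel panic|out of memory|oom-killer|segfault|soft lockup|blocked for more than' "$1"
}
```

Usage, assuming the saved copy lives under a directory of your choosing: scan_saved_log /path/to/saved/var/log/kern.log. If the journal is persistent on your system, journalctl --list-boots followed by journalctl -k -b -1 shows the kernel messages of the boot that ended in the crash.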

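Option 1

Failed services

Some services may crash or fail repeatedly, and that could be the cause of the system crashes. You can check whether any of your services has this problem.

Example: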

systemctl list-units --type=service

With systemctl you can see all services here, whether they are active or have failed.
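To show only the failed services instead of reading through the full list, the output can be filtered. This is a small sketch under stated assumptions: the function name failed_units is hypothetical, and it expects the plain, legend-free output format in which the third column is the ACTIVE state.

```shell
# Sketch: print the names of services whose ACTIVE state is "failed".
# Expects on stdin the output of:
#   systemctl list-units --type=service --all --plain --no-legend
# whose columns are: UNIT LOAD ACTIVE SUB DESCRIPTION
failed_units() {
    awk '$3 == "failed" { print $1 }'
}
```

Usage: systemctl list-units --type=service --all --plain --no-legend | failed_units. A shorter built-in alternative is systemctl --failed. For any unit it reports, journalctl -u followed by the unit name shows that unit's log messages.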

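Option 2

Hardware failure

Some hardware may not be working properly, for example the CPU fan, the graphics card, or the RAM. You can check the dmesg log, where you can see whether there are any hardware errors, as well as errors such as GPU hangs or segmentation faults.

Example: dmesg -T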

[Fri Jan 21 18:02:27 2022] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=20339577 end=20339578) time 439 us, min 2146, max 2159, scanline start 2136, end 2196
[Fri Jan 21 18:03:29 2022] [drm] Got external EDID base block and 1 extension from "edid/edid.bin" for connector "DP-1"
[Fri Jan 21 18:03:29 2022] [drm] Got external EDID base block and 1 extension from "edid/edid.bin" for connector "DP-1"
[Fri Jan 21 18:37:28 2022] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=20465675 end=20465676) time 417 us, min 2146, max 2159, scanline start 2144, end 2200
[Fri Jan 21 19:03:59 2022] [drm] Got external EDID base block and 1 extension from "edid/edid.bin" for connector "DP-1"
[Fri Jan 21 19:04:00 2022] [drm] Got external EDID base block and 1 extension from "edid/edid.bin" for connector "DP-1"
[Fri Jan 21 20:41:15 2022] i915 0000:00:02.0: GPU HANG: ecode 9:4:0xc86dffef, in vlc [1478503], hang on vcs0
[Fri Jan 21 20:41:15 2022] i915 0000:00:02.0: Resetting vcs0 for hang on vcs0
[Fri Jan 21 21:03:04 2022] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=20989841 end=20989842) time 412 us, min 2146, max 2159, scanline start 2142, end 2198
[Fri Jan 21 22:13:11 2022] i915 0000:00:02.0: Resetting vcs0 for hang on vcs0
[Fri Jan 21 22:43:59 2022] i915 0000:00:02.0: Resetting vcs0 for hang on vcs0
[Fri Jan 21 23:49:47 2022] i915 0000:00:02.0: Resetting vcs0 for hang on vcs0
[Sat Jan 22 01:00:43 2022] i915 0000:00:02.0: Resetting vcs0 for hang on vcs0
[Sat Jan 22 03:39:16 2022] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=22416118 end=22416119) time 413 us, min 2146, max 2159, scanline start 2117, end 2173
[Sat Jan 22 06:05:57 2022] i915 0000:00:02.0: Resetting vcs0 for hang on vcs0
[Sat Jan 22 06:51:33 2022] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=23108341 end=23108342) time 408 us, min 2146, max 2159, scanline start 2141, end 2196
[Sat Jan 22 08:01:12 2022] i915 0000:00:02.0: Resetting vcs0 for hang on vcs0
[Sat Jan 22 08:36:01 2022] [drm:intel_pipe_update_end [i915]] *ERROR* Atomic update failure on pipe A (start=23484460 end=23484461) time 433 us, min 2146, max 2159, scanline start 2102, end 2160
[Sat Jan 22 10:06:46 2022] i915 0000:00:02.0: Resetting vcs0 for hang on vcs0
[Sat Jan 22 10:10:56 2022] i915 0000:00:02.0: Resetting vcs0 for hang on vcs0
[Sat Jan 22 11:44:18 2022] i915 0000:00:02.0: Resetting vcs0 for hang on vcs0
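To judge how often messages like the ones in this sample occur, they can be counted by type. The sketch below assumes the message wording matches the sample above; the function name count_i915_events is hypothetical.

```shell
# Sketch: count the three kinds of i915 messages seen in the sample.
# Reads `dmesg -T` output (or a saved kern.log) on stdin.
count_i915_events() {
    awk '
        /GPU HANG/              { hang++ }
        /Resetting .* for hang/ { reset++ }
        /Atomic update failure/ { atomic++ }
        END {
            printf "GPU HANG: %d\n", hang
            printf "engine resets: %d\n", reset
            printf "atomic update failures: %d\n", atomic
        }
    '
}
```

Usage: dmesg -T | count_i915_events (dmesg may need sudo, depending on the system configuration). The GPU HANG line in the sample also names the process involved, vlc in that case, which can hint at which application triggers the hang.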

Or hardware errors such as:

[Mon Feb 14 13:41:45 2022] evm: security.ima
[Mon Feb 14 13:41:45 2022] evm: security.capability
[Mon Feb 14 13:41:45 2022] evm: HMAC attrs: 0x1
[Mon Feb 14 13:41:45 2022] BERT: Error records from previous boot:
[Mon Feb 14 13:41:45 2022] [Hardware Error]: event severity: fatal
[Mon Feb 14 13:41:45 2022] [Hardware Error]:  Error 0, type: fatal
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   section type: unknown, 81212a96-09ed-4996-9471-8d729c8e69ed
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   section length: 0x290
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000000: 00000001 00000000 00000000 00020002  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000010: 00020002 00000001 0000031d 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000020: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000030: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000040: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000050: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000060: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000070: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000080: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000090: 0012cf23 00000000 00000002 00000001  #...............
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   000000a0: 0000031d 00000000 00040000 000ffff8  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   000000b0: 000014a8 00000880 00000880 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   000000c0: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   000000d0: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   000000e0: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   000000f0: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000100: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000110: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000120: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000130: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000140: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000150: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000160: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000170: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000180: 00000000 00000000 00000000 00000000  ................
[Mon Feb 14 13:41:45 2022] [Hardware Error]:   00000190: 00000000 00000000 00000000 00000000  ................
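BERT stands for the ACPI Boot Error Record Table, through which the firmware reports errors recorded during the previous boot. Most of such a record is a hex dump; the few lines that state the severity and section type are the ones worth quoting when asking for help. A minimal sketch for extracting them follows, assuming the message layout matches the sample above; the function name hw_error_summary is hypothetical.

```shell
# Sketch: keep only the summary lines of "[Hardware Error]" records
# and skip the hex dump. Reads `dmesg -T` output on stdin.
hw_error_summary() {
    grep -F '[Hardware Error]' | grep -E 'event severity|Error [0-9]+, type|section type'
}
```

Usage: dmesg -T | hw_error_summary. An empty result only means that no such record was logged in the current boot; it does not prove the hardware is healthy.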

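Option 3

Reinstall the operating system

You can try reinstalling the operating system and check whether you run into the same problem. If you do, it is probably a hardware problem.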

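Answer 2

I used a command (a systemd one, I think) to find out why the system took 10 minutes to reboot after a crash. It pointed to the AI coding assistant "kite". I removed kite, and my system not only boots quickly but also no longer crashes.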
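The answer does not name the command. One plausible candidate, and this is only an assumption, is systemd-analyze blame, which lists units ordered by how long they took to start. The sketch below filters that output for units that took a minute or more; the function name slow_units is hypothetical, and it assumes each line consists of a duration followed by a unit name.

```shell
# Sketch: print units whose start-up time is a minute or more.
# Expects `systemd-analyze blame` output on stdin, where each line is a
# duration followed by a unit name, e.g. "1min 30.012s example.service".
slow_units() {
    awk '$1 ~ /^[0-9]+min$/ { print $NF }'
}
```

Usage: systemd-analyze blame | slow_units. systemd-analyze critical-chain gives a complementary view by showing the chain of units that the boot was waiting on.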
