How do I find out why Ubuntu 20.04 freezes?

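Two weeks ago I installed Ubuntu 20.04 on my AMD Ryzen 5 1600 / Nvidia GTX 1070 machine, but Ubuntu sometimes freezes completely.

The keyboard and screen stop responding entirely; the mouse sometimes keeps moving. I tried the magic SysRq key, but it did nothing. I also tried Alt+F1, but the system did not respond to that either. Basically, the only thing I can do is press the power button and restart.

I suspect Nvidia, but I don't know how to verify that.

nvidia-smi reports driver version 440.100.

I found these entries in /var/log/Xorg.1.log.old from around the time my computer crashed.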

[  1223.234] (EE) client bug: timer event2 debounce: scheduled expiry is in the past (-22ms), your system is too slow  
[  1223.234] (EE) client bug: timer event2 debounce short: scheduled expiry is in the past (-35ms), your system is too slow  
[  1488.529] (EE) client bug: timer event2 debounce: scheduled expiry is in the past (-0ms), your system is too slow  
[  1488.529] (EE) client bug: timer event2 debounce short: scheduled expiry is in the past (-13ms), your system is too slow  
[  5125.223] (EE) client bug: timer event2 debounce: scheduled expiry is in the past (-14ms), your system is too slow  
[  5125.223] (EE) client bug: timer event2 debounce short: scheduled expiry is in the past (-27ms), your system is too slow  
[  6038.321] (EE) client bug: timer event2 debounce short: scheduled expiry is in the past (-9ms), your system is too slow  
[  6206.894] (EE) client bug: timer event2 debounce: scheduled expiry is in the past (-3ms), your system is too slow  
[  6206.894] (EE) client bug: timer event2 debounce short: scheduled expiry is in the past (-16ms), your system is too slow  
[  6409.650] (EE) client bug: timer event2 debounce: scheduled expiry is in the past (-9ms), your system is too slow  
[  6409.650] (EE) client bug: timer event2 debounce short: scheduled expiry is in the past (-22ms), your system is too slow  
[ 10930.426] (EE) client bug: timer event2 debounce: scheduled expiry is in the past (-7ms), your system is too slow  
[ 10930.426] (EE) client bug: timer event2 debounce short: scheduled expiry is in the past (-20ms), your system is too slow  

Output of free -h:

              total        used        free      shared  buff/cache   available
Mem:           15Gi       2.5Gi        11Gi       393Mi       1.9Gi        12Gi
Swap:         2.0Gi          0B       2.0Gi

Output of sysctl vm.swappiness:

vm.swappiness = 60

Output of sudo lshw -C memory:

  *-firmware                
       description: BIOS
       vendor: American Megatrends Inc.
       physical id: 0
       version: 1.L0
       date: 12/28/2018
       size: 64KiB
       capacity: 16MiB
       capabilities: pci upgrade shadowing cdboot bootselect socketedrom edd int13floppy1200 int13floppy720 int13floppy2880 int5printscreen int9keyboard int14serial int17printer acpi usb biosbootspecification uefi
  *-memory
       description: System Memory
       physical id: f
       slot: System board or motherboard
       size: 16GiB
     *-bank:0
          description: 2933 MHz (0.3 ns) [empty]
          product: Unknown
          vendor: Unknown
          physical id: 0
          serial: Unknown
          slot: DIMM 0
          clock: 2933MHz (0.3ns)
     *-bank:1
          description: DIMM DDR4 Synchronous Unbuffered (Unregistered) 2933 MHz (0.3 ns)
          product: CMK16GX4M2B3200C16
          vendor: Unknown
          physical id: 1
          serial: 00000000
          slot: DIMM 1
          size: 8GiB
          width: 64 bits
          clock: 2933MHz (0.3ns)
     *-bank:2
          description: 2933 MHz (0.3 ns) [empty]
          product: Unknown
          vendor: Unknown
          physical id: 2
          serial: Unknown
          slot: DIMM 0
          clock: 2933MHz (0.3ns)
     *-bank:3
          description: DIMM DDR4 Synchronous Unbuffered (Unregistered) 2933 MHz (0.3 ns)
          product: CMK16GX4M2B3200C16
          vendor: Unknown
          physical id: 3
          serial: 00000000
          slot: DIMM 1
          size: 8GiB
          width: 64 bits
          clock: 2933MHz (0.3ns)
  *-cache:0
       description: L1 cache
       physical id: 11
       slot: L1 - Cache
       size: 576KiB
       capacity: 576KiB
       clock: 1GHz (1.0ns)
       capabilities: pipeline-burst internal write-back unified
       configuration: level=1
  *-cache:1
       description: L2 cache
       physical id: 12
       slot: L2 - Cache
       size: 3MiB
       capacity: 3MiB
       clock: 1GHz (1.0ns)
       capabilities: pipeline-burst internal write-back unified
       configuration: level=2
  *-cache:2
       description: L3 cache
       physical id: 13
       slot: L3 - Cache
       size: 16MiB
       capacity: 16MiB
       clock: 1GHz (1.0ns)
       capabilities: pipeline-burst internal write-back unified
       configuration: level=3

Output of grep -i swap /etc/fstab:

/swapfile                                 none            swap    sw              0       0

Output of sudo dmidecode -s bios-version:

1.L0

Added a screenshot of Software & Updates:

[Screenshot: Software & Updates]

Update, August 6:

Crash file, and the list of GNOME Shell extensions from the terminal.

Answer 1

BIOS

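MSI B350 Tomahawk

Your BIOS version is 1.L0, dated December 28, 2018.

A newer BIOS is available for download. Its numbering/naming convention differs from the one your current BIOS uses, which is unusual. Contact MSI support and ask about it.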

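Note: Confirm that I have the correct web page for your motherboard model.

Note: Do not download, use, or install the latest BETA version.

Note: Make a backup before updating the BIOS.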
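
To re-check the BIOS version and date before and after an update without root, the same data that dmidecode reports can usually be read from sysfs. This is a minimal sketch, assuming the standard Linux path /sys/class/dmi/id; the function name bios_info and its dmi_dir parameter are made up for illustration.

```shell
#!/bin/sh
# Print BIOS vendor, version, and date from sysfs (no root needed).
# dmi_dir is a hypothetical parameter; it defaults to the usual Linux path.
bios_info() {
    dmi_dir="${1:-/sys/class/dmi/id}"
    for field in bios_vendor bios_version bios_date; do
        # Skip fields this kernel/firmware does not expose.
        if [ -r "$dmi_dir/$field" ]; then
            printf '%s: %s\n' "$field" "$(cat "$dmi_dir/$field")"
        fi
    done
}

# Usage:
#   bios_info
```

On the asker's machine this should report version 1.L0 until the update has been applied.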


Swap

Let's increase your /swapfile from 2G to 4G.

Note: Misusing the dd command can cause data loss. Copying and pasting the commands is recommended.

sudo swapoff -a           # turn off swap
sudo rm -i /swapfile      # remove old /swapfile

sudo dd if=/dev/zero of=/swapfile bs=1M count=4096

sudo chmod 600 /swapfile  # set proper file protections
sudo mkswap /swapfile     # init /swapfile
sudo swapon /swapfile     # turn on swap
free -h                   # confirm 16G RAM and 4G swap
reboot                    # reboot and verify operation

Add this line to /etc/fstab if it is not already there (the grep output in the question shows that it already is)...

/swapfile    none    swap    sw      0   0
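
After the reboot, one way to confirm the new swap size without reading free -h by eye is to sum the Size column of /proc/swaps, which the kernel reports in KiB. The sketch below assumes that file format; swap_total_mib and its swaps_file parameter are hypothetical names.

```shell
#!/bin/sh
# Sum the Size column (KiB) of a /proc/swaps-style file and print the
# total in MiB, rounded to the nearest MiB.
# swaps_file is a hypothetical parameter; it defaults to /proc/swaps.
swap_total_mib() {
    swaps_file="${1:-/proc/swaps}"
    # NR > 1 skips the header line; column 3 is the size in KiB.
    awk 'NR > 1 { kib += $3 } END { printf "%d\n", (kib + 512) / 1024 }' "$swaps_file"
}

# Usage:
#   swap_total_mib    # should print about 4096 after the steps above
```

The rounding is there because the kernel reports slightly less than the file size; the swap header takes a small part of the file.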

Nvidia

You have Nvidia driver version 440.100.

Software & Updates shows this as the current version. However, a newer version, 450.57, is available for download.

Note: Make a backup before updating the Nvidia driver.
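
To check from a terminal whether the installed driver is older than 450.57, the version strings can be compared with GNU sort -V. This is only a sketch: version_older is a hypothetical helper, and the nvidia-smi query in the usage comment needs the driver to be loaded.

```shell
#!/bin/sh
# Succeed (exit status 0) when version $1 is strictly older than version $2.
# Relies on GNU sort -V (version sort), which Ubuntu ships in coreutils.
version_older() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Usage (requires a working Nvidia driver):
#   installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
#   version_older "$installed" 450.57 && echo "driver $installed is older than 450.57"
```

A plain string or floating-point comparison would be unreliable here, because 440.100 and 450.57 have minor parts of different lengths.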

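Update #1:

Since you have had to force the computer off, let's check your file system...

  • Boot from an Ubuntu Live DVD/USB in "Try Ubuntu" mode
  • Open a terminal window with Ctrl+Alt+T
  • Type sudo fdisk -l
  • Identify the /dev/sdXX device name of the "Linux Filesystem"
  • Type sudo fsck -f /dev/sdXX, replacing sdXX with the device name you found earlier
  • Repeat the fsck command if there were errors
  • Type reboot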
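
The "identify the device" step above can be sketched as a small filter over the fdisk output. It assumes a GPT disk, where fdisk prints the type as "Linux filesystem"; on an MBR disk the type column reads just "Linux", so the pattern would need adjusting. The function name linux_fs_partitions is made up for illustration.

```shell
#!/bin/sh
# Read `fdisk -l` output on stdin and print the device names of
# partitions whose type is "Linux filesystem" (case-insensitive).
linux_fs_partitions() {
    awk '/^\/dev\// && tolower($0) ~ /linux filesystem/ { print $1 }'
}

# Usage from the live session (the partition must not be mounted):
#   sudo fdisk -l | linux_fs_partitions
#   sudo fsck -f /dev/sdXX    # sdXX: one of the names printed above
```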

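Update #2:

Install the BIOS update... but first contact MSI support to confirm which BIOS update file version you need... since their naming convention appears to have changed.

You have many GNOME Shell extensions installed, and any one of them could be causing the freezes. They are also installed in the "wrong" place, because they are installed system-wide rather than per user. You can see them in your listing of the /usr/share/gnome-shell/extensions directory; they all end in gcampax.github.com.

The safest way to remove them is to go to https://extensions.gnome.org/local/ and remove all extensions except these three...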

drwxr-xr-x 2 root root 4.0K Jun 11 08:20 'desktop-icons@csoriano'/
drwxr-xr-x 3 root root 4.0K May 12 15:17 '[email protected]'/
drwxr-xr-x 3 root root 4.0K Jun 18 09:12 '[email protected]'/

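If the system then runs well and does not freeze for a while, manually reinstall your favorites one extension at a time, rather than installing an extension bundle/zip file.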
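
To see which system-wide extensions match the gcampax.github.com suffix before removing anything, a listing like the following may help. It is a minimal sketch: list_extensions and both of its parameters are hypothetical, with defaults taken from the directory and suffix mentioned above. It only prints names and removes nothing.

```shell
#!/bin/sh
# List extension directories under ext_dir whose names end with suffix.
# Both parameters are hypothetical; the defaults come from the answer text.
list_extensions() {
    ext_dir="${1:-/usr/share/gnome-shell/extensions}"
    suffix="${2:-gcampax.github.com}"
    for path in "$ext_dir"/*"$suffix"; do
        # An unmatched glob stays literal and fails the -d test.
        [ -d "$path" ] && basename "$path"
    done
    return 0
}

# Usage:
#   list_extensions                                              # system-wide
#   list_extensions "$HOME/.local/share/gnome-shell/extensions"  # per user
```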

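Answer 2

On a similar system (Ryzen 5 1600X, Asus B350 Plus), I solved this problem by disabling "automatic C-state management" in the BIOS. The option may have a slightly different name in your BIOS.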
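
To see which idle states the kernel currently exposes, the cpuidle entries in sysfs can be listed before and after changing the BIOS option. Whether the list actually changes depends on the firmware, so treat this as a rough check only. The function name list_idle_states and its cpuidle_dir parameter are assumptions for illustration.

```shell
#!/bin/sh
# Print the idle states the kernel exposes for one CPU, one per line.
# cpuidle_dir is a hypothetical parameter; it defaults to cpu0's sysfs path.
list_idle_states() {
    cpuidle_dir="${1:-/sys/devices/system/cpu/cpu0/cpuidle}"
    for state in "$cpuidle_dir"/state*; do
        # Each stateN directory holds a "name" file such as POLL or C1.
        [ -r "$state/name" ] &&
            printf '%s: %s\n' "$(basename "$state")" "$(cat "$state/name")"
    done
    return 0
}

# Usage:
#   list_idle_states
```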
