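I'm running Ubuntu 18.04.4 with the default GNOME desktop on a MacBookPro11,5, and I keep getting freezes where I can do nothing except move the mouse pointer: there is no response to Alt+F2, Ctrl+Alt+F2, or Ctrl+Alt+F3, and I have to hard-reboot. As far as I can deduce from /var/log/syslog, the culprit is consistent: a function called ttm_page_pool_get_pages, in a call trace that ends at entry_SYSCALL_64_after_hwframe. This looks like fairly low-level memory management, well beyond my Linux expertise. Here is a typical instance (I haven't diffed the logs to check whether it is always exactly the same sequence of messages, but it always looks the same to the naked eye):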
Mar 29 10:05:01 McBk CRON[1276]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Mar 29 10:05:32 McBk kernel: [69240.405460] general protection fault: 0000 [#1] SMP PTI
Mar 29 10:05:32 McBk kernel: [69240.405464] CPU: 0 PID: 2151 Comm: Xorg Tainted: G OE 5.3.0-42-generic #34~18.04.1-Ubuntu
Mar 29 10:05:32 McBk kernel: [69240.405465] Hardware name: Apple Inc. MacBookPro11,5/Mac-06F11F11946D27C5, BIOS MBP114.88Z.0184.B00.1806051659 06/05/2018
Mar 29 10:05:32 McBk kernel: [69240.405470] RIP: 0010:clear_page_erms+0x7/0x10
Mar 29 10:05:32 McBk kernel: [69240.405472] Code: 48 89 47 18 48 89 47 20 48 89 47 28 48 89 47 30 48 89 47 38 48 8d 7f 40 75 d9 90 c3 0f 1f 80 00 00 00 00 b9 00 10 00 00 31 c0 <f3> aa c3 90 90 90 90 90 90 55 48 85 ff 48 89 e5 0f 84 fd 00 00 00
Mar 29 10:05:32 McBk kernel: [69240.405473] RSP: 0018:ffffa992c36537f8 EFLAGS: 00010246
Mar 29 10:05:32 McBk kernel: [69240.405474] RAX: 0000000000000000 RBX: ffff9a6e23b1d090 RCX: 0000000000001000
Mar 29 10:05:32 McBk kernel: [69240.405475] RDX: ffff9a6c7929c350 RSI: 0000000000000246 RDI: ffed64780a70d000
Mar 29 10:05:32 McBk kernel: [69240.405476] RBP: ffffa992c3653868 R08: 0000000000000201 R09: ffff9a6e23b1d098
Mar 29 10:05:32 McBk kernel: [69240.405476] R10: 0000000000000000 R11: ffff9a6e23b1d090 R12: 0000000000000000
Mar 29 10:05:32 McBk kernel: [69240.405477] R13: 0000000000000040 R14: 0000000000000000 R15: ffffa992c36538a8
Mar 29 10:05:32 McBk kernel: [69240.405478] FS: 00007f8558727a80(0000) GS:ffff9a6e2f000000(0000) knlGS:0000000000000000
Mar 29 10:05:32 McBk kernel: [69240.405479] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 29 10:05:32 McBk kernel: [69240.405480] CR2: 00007f8509e7f1c0 CR3: 000000040c614003 CR4: 00000000001606f0
Mar 29 10:05:32 McBk kernel: [69240.405481] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Mar 29 10:05:32 McBk kernel: [69240.405481] DR3: 0000000000000080 DR6: 00000000ffff0ff0 DR7: 0000000020000400
Mar 29 10:05:32 McBk kernel: [69240.405482] Call Trace:
Mar 29 10:05:32 McBk kernel: [69240.405489] ? ttm_page_pool_get_pages+0x1ff/0x380 [ttm]
Mar 29 10:05:32 McBk kernel: [69240.405492] ? si_mem_available+0x5b/0xe0
Mar 29 10:05:32 McBk kernel: [69240.405495] ttm_pool_populate+0xfd/0x4a0 [ttm]
Mar 29 10:05:32 McBk kernel: [69240.405497] ttm_populate_and_map_pages+0x28/0x260 [ttm]
Mar 29 10:05:32 McBk kernel: [69240.405513] radeon_ttm_tt_populate+0x87/0x150 [radeon]
Mar 29 10:05:32 McBk kernel: [69240.405515] ttm_tt_populate.part.9+0x22/0x60 [ttm]
Mar 29 10:05:32 McBk kernel: [69240.405517] ttm_tt_bind+0x51/0x60 [ttm]
Mar 29 10:05:32 McBk kernel: [69240.405520] ttm_bo_handle_move_mem+0x4f2/0x5b0 [ttm]
Mar 29 10:05:32 McBk kernel: [69240.405522] ? ttm_bo_mem_space+0x17e/0x2f0 [ttm]
Mar 29 10:05:32 McBk kernel: [69240.405525] ttm_bo_validate+0x122/0x140 [ttm]
Mar 29 10:05:32 McBk kernel: [69240.405536] ? drm_mode_set_crtcinfo+0x56/0x1b0 [drm]
Mar 29 10:05:32 McBk kernel: [69240.405539] ttm_bo_init_reserved+0x3a4/0x440 [ttm]
Mar 29 10:05:32 McBk kernel: [69240.405541] ttm_bo_init+0x6b/0x110 [ttm]
Mar 29 10:05:32 McBk kernel: [69240.405551] ? radeon_update_memory_usage.isra.7+0x50/0x50 [radeon]
Mar 29 10:05:32 McBk kernel: [69240.405560] radeon_bo_create+0x16e/0x200 [radeon]
Mar 29 10:05:32 McBk kernel: [69240.405569] ? radeon_update_memory_usage.isra.7+0x50/0x50 [radeon]
Mar 29 10:05:32 McBk kernel: [69240.405579] radeon_gem_object_create+0xb0/0x190 [radeon]
Mar 29 10:05:32 McBk kernel: [69240.405590] radeon_gem_create_ioctl+0x6b/0x100 [radeon]
Mar 29 10:05:32 McBk kernel: [69240.405596] ? drm_gem_handle_delete+0x74/0x90 [drm]
Mar 29 10:05:32 McBk kernel: [69240.405606] ? radeon_gem_pwrite_ioctl+0x30/0x30 [radeon]
Mar 29 10:05:32 McBk kernel: [69240.405612] drm_ioctl_kernel+0xb0/0x100 [drm]
Mar 29 10:05:32 McBk kernel: [69240.405618] drm_ioctl+0x389/0x450 [drm]
Mar 29 10:05:32 McBk kernel: [69240.405628] ? radeon_gem_pwrite_ioctl+0x30/0x30 [radeon]
Mar 29 10:05:32 McBk kernel: [69240.405631] ? unmap_region+0xf7/0x130
Mar 29 10:05:32 McBk kernel: [69240.405637] radeon_drm_ioctl+0x4f/0x80 [radeon]
Mar 29 10:05:32 McBk kernel: [69240.405640] do_vfs_ioctl+0xa9/0x640
Mar 29 10:05:32 McBk kernel: [69240.405642] ksys_ioctl+0x75/0x80
Mar 29 10:05:32 McBk kernel: [69240.405643] __x64_sys_ioctl+0x1a/0x20
Mar 29 10:05:32 McBk kernel: [69240.405646] do_syscall_64+0x5a/0x130
Mar 29 10:05:32 McBk kernel: [69240.405647] entry_SYSCALL_64_after_hwframe+0x44/0xa9
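Rather than comparing the freezes by eye, the call traces could be pulled out of the syslog and compared mechanically. The following is only a rough sketch of that idea, not a tested tool: the function names `extract_traces` and `count_distinct_traces` are made up for illustration, and it assumes the log lines look like the ones above (a `kernel: [timestamp]` prefix, a `Call Trace:` marker, then frames of the form `name+0xOFF/0xSIZE`). Timestamps and offsets are discarded so that only the sequence of function names is compared.

```python
import re
from collections import Counter

# Syslog prefix up to and including the kernel timestamp,
# e.g. "Mar 29 10:05:32 McBk kernel: [69240.405489] "
_KERNEL_PREFIX = re.compile(r"^.*?kernel: \[\s*\d+\.\d+\]\s*")

# A stack frame such as "? ttm_page_pool_get_pages+0x1ff/0x380 [ttm]".
# Group 1 is the "? " marker the kernel puts on frames it is unsure about,
# group 2 is the function name.
_FRAME = re.compile(r"^(\? )?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def extract_traces(lines, include_uncertain=True):
    """Return one tuple of function names per 'Call Trace:' block."""
    traces = []
    current = None  # frames of the trace being read, or None outside a trace
    for line in lines:
        prefix = _KERNEL_PREFIX.match(line)
        body = line[prefix.end():].rstrip() if prefix else ""
        if body.startswith("Call Trace:"):
            if current is not None:
                traces.append(tuple(current))
            current = []
            continue
        if current is None:
            continue
        frame = _FRAME.match(body)
        if frame is None:
            # First line that is not a stack frame ends the trace.
            traces.append(tuple(current))
            current = None
        elif include_uncertain or frame.group(1) is None:
            current.append(frame.group(2))
    if current is not None:
        traces.append(tuple(current))
    return traces


def count_distinct_traces(lines, include_uncertain=True):
    """Count how often each distinct sequence of function names occurs."""
    return Counter(extract_traces(lines, include_uncertain))
```

Feeding it the lines of /var/log/syslog (and any rotated copies) would give one counter entry per distinct trace; a single entry would mean every logged fault took the same path. Passing `include_uncertain=False` drops the frames marked with `?`, which may vary between otherwise identical crashes.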
Does anyone know how to fix or prevent this?
Added later, following a suggestion: what we would really want to see is the output of these queries immediately before a freeze, but since there is no way to capture that, I suppose this will have to do. I've been more careful about not opening too many files per application and not running many applications at once, and I haven't had a freeze since I started doing that, so the output below probably understates the resource usage at the time of the freezes, though not drastically:
dgibson@McBk:~$ free -h
              total        used        free      shared  buff/cache   available
Mem:            15G        5.4G        6.2G        682M        3.9G        9.1G
Swap:          2.0G          0B        2.0G
dgibson@McBk:~$ sudo lshw -C memory
[sudo] password for dgibson:
  *-cache:0
       description: L1 cache
       physical id: 0
       slot: L1 Cache
       size: 32KiB
       capacity: 32KiB
       capabilities: synchronous internal write-back data
       configuration: level=1
  *-cache:1
       description: L1 cache
       physical id: 1
       slot: L1 Cache
       size: 32KiB
       capacity: 32KiB
       capabilities: synchronous internal write-back instruction
       configuration: level=1
  *-cache:2
       description: L2 cache
       physical id: 2
       slot: L2 Cache
       size: 256KiB
       capacity: 256KiB
       capabilities: synchronous internal write-back unified
       configuration: level=2
  *-cache:3
       description: L3 cache
       physical id: 3
       slot: L3 Cache
       size: 6MiB
       capacity: 6MiB
       capabilities: synchronous internal write-back unified
       configuration: level=3
  *-cache:4
       description: L4 cache
       physical id: 4
       slot: Unknown
       size: 128MiB
       capacity: 128MiB
       capabilities: synchronous internal write-back unified
       configuration: level=4
  *-firmware
       description: BIOS
       vendor: Apple Inc.
       physical id: 6
       version: MBP114.88Z.0184.B00.1806051659
       date: 06/05/2018
       size: 1MiB
       capacity: 8128KiB
       capabilities: pci upgrade shadowing cdboot bootselect acpi smartbattery netboot uefi
  *-memory
       description: System Memory
       physical id: 1a
       slot: System board or motherboard
       size: 16GiB
     *-bank:0
          description: SODIMM DDR3 Synchronous 1600 MHz (0.6 ns)
          product: HMT41GS6BFR8A-PB
          vendor: Hynix Semiconductor (Hyundai Electronics)
          physical id: 0
          serial: 0x00000000
          slot: DIMM0
          size: 8GiB
          width: 64 bits
          clock: 1600MHz (0.6ns)
     *-bank:1
          description: SODIMM DDR3 Synchronous 1600 MHz (0.6 ns)
          product: HMT41GS6BFR8A-PB
          vendor: Hynix Semiconductor (Hyundai Electronics)
          physical id: 1
          serial: 0x00000000
          slot: DIMM0
          size: 8GiB
          width: 64 bits
          clock: 1600MHz (0.6ns)
dgibson@McBk:~$
I also ran the memory test from memtest86.com, and it passed all tests with zero errors.
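One possible way to get closer to the memory state right before a freeze would be to log a snapshot every few seconds and force each one to disk, so that the last line written survives a hard reboot. The sketch below is only an illustration under assumptions: the helper names, the choice of fields, and the 10-second interval are all hypothetical. It reads /proc/meminfo, which is where `free` gets its numbers on Linux, and whether the final write actually reaches the disk before a kernel fault is not guaranteed.

```python
import os
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: int}.

    Most values are in kB; a few fields (e.g. HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def format_snapshot(info, now, fields=("MemFree", "MemAvailable", "SwapFree")):
    """Build one log line: local timestamp plus the chosen fields (-1 if absent)."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    values = " ".join(f"{name}={info.get(name, -1)}kB" for name in fields)
    return f"{stamp} {values}"


def log_forever(path, interval=10.0):
    """Append a snapshot to `path` every `interval` seconds (hypothetical defaults)."""
    with open(path, "a") as out:
        while True:
            with open("/proc/meminfo") as meminfo:
                info = parse_meminfo(meminfo.read())
            out.write(format_snapshot(info, time.time()) + "\n")
            out.flush()
            os.fsync(out.fileno())  # push the line to disk before sleeping
            time.sleep(interval)
```

Calling `log_forever` with a path in the home directory and leaving it running in a terminal would build up a history; after the next freeze, the last few lines of that file would show free and available memory shortly before the hang.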