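Ever since I installed Ubuntu 11.04, my Thinkpad T520 has been freezing at random.
I asked the following question a long time ago, but it did not really help me: How do I debug when the system freezes or crashes and I cannot log in?
Here is a complete copy-paste of my xsession.errors file:
I also tried everything suggested in the following question:
I tried REISUB and the other suggestions in that question, but none of them seem to work. The only thing that works is resetting the laptop.
Any help would be greatly appreciated. If I need to provide more information or logs, please ask; I really want to solve this problem.
Update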
00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
00:16.0 Communication controller: Intel Corporation 6 Series Chipset Family MEI Controller #1 (rev 04)
00:16.3 Serial controller: Intel Corporation 6 Series Chipset Family KT Controller (rev 04)
00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 04)
00:1a.0 USB Controller: Intel Corporation 6 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 6 Series Chipset Family High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 6 Series Chipset Family PCI Express Root Port 1 (rev b4)
00:1c.1 PCI bridge: Intel Corporation 6 Series Chipset Family PCI Express Root Port 2 (rev b4)
00:1c.3 PCI bridge: Intel Corporation 6 Series Chipset Family PCI Express Root Port 4 (rev b4)
00:1c.4 PCI bridge: Intel Corporation 6 Series Chipset Family PCI Express Root Port 5 (rev b4)
00:1d.0 USB Controller: Intel Corporation 6 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation 6 Series Chipset Family LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 6 Series Chipset Family 6 port SATA AHCI Controller (rev 04)
00:1f.3 SMBus: Intel Corporation 6 Series Chipset Family SMBus Controller (rev 04)
03:00.0 Network controller: Intel Corporation Centrino Ultimate-N 6300 (rev 35)
0d:00.0 System peripheral: Ricoh Co Ltd Device e823 (rev 05)
0d:00.3 FireWire (IEEE 1394): Ricoh Co Ltd FireWire Host Controller (rev 04)
Bus 002 Device 003: ID 0bdb:1911 Ericsson Business Mobile Networks BV
Bus 002 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 006: ID 04f2:b217 Chicony Electronics Co., Ltd
Bus 001 Device 005: ID 0a5c:217f Broadcom Corp. Bluetooth Controller
Bus 001 Device 004: ID 147e:2016 Upek Biometric Touchchip/Touchstrip Fingerprint Sensor
Bus 001 Device 003: ID 045e:0737 Microsoft Corp.
Bus 001 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Answer 1
I ran into the same problem, and looking at /var/log/syslog makes the cause easy to find. Essentially, the GPU hangs and that causes compiz to segfault:
Sep 9 10:29:46 helix kernel: [ 7946.237954] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer elapsed... GPU hung
Sep 9 10:29:46 helix kernel: [ 7946.250096] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request returns -11 (awaiting 3077849 at 3077840, next 3077850)
Sep 9 10:30:10 helix kernel: [ 7970.376485] compiz[1571]: segfault at 0 ip 00007f4da365b7d1 sp 00007fff1dbd5690 error 6 in i965_dri.so[7f4da35ea000+ac000]
Sep 9 10:30:15 helix kernel: [ 7975.150824] compiz[10649]: segfault at 0 ip 00007f059c445be8 sp 00007fff629e2d90 error 6 in i965_dri.so[7f059c3d4000+ac000]
Sep 9 10:30:20 helix kernel: [ 7979.892104] compiz[10671]: segfault at 0 ip 00007f1b2cd1cbe8 sp 00007fff9ef21f40 error 6 in i965_dri.so[7f1b2ccab000+ac000]
Sep 9 10:30:24 helix kernel: [ 7984.489864] compiz[10691]: segfault at 0 ip 00007f05d48debe8 sp 00007fffee43a810 error 6 in i965_dri.so[7f05d486d000+ac000]
Sep 9 10:30:29 helix kernel: [ 7989.095058] compiz[10710]: segfault at 0 ip 00007f74d0326be8 sp 00007fff09f4a480 error 6 in i965_dri.so[7f74d02b5000+ac000]
Sep 9 10:30:33 helix kernel: [ 7993.793423] compiz[10730]: segfault at 0 ip 00007fe855c9fbe8 sp 00007fff23af8570 error 6 in i965_dri.so[7fe855c2e000+ac000]
Sep 9 10:30:38 helix kernel: [ 7998.316195] compiz[10750]: segfault at 0 ip 00007fa4facb3be8 sp 00007fffe0b08c10 error 6 in i965_dri.so[7fa4fac42000+ac000]
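To check whether your own log shows the same pattern, something like the following might help. It is only a sketch: the function name gpu_hang_summary is made up for illustration, and the match strings are simply taken from the log lines above, so other kernel versions may word these messages differently.

```shell
# Count GPU hang reports and compiz segfaults in a syslog-style file.
# Usage: gpu_hang_summary [logfile]   (defaults to /var/log/syslog)
gpu_hang_summary() {
    logfile="${1:-/var/log/syslog}"
    # grep -c prints the number of matching lines; "|| true" keeps the
    # function usable under "set -e" when there are no matches.
    hangs=$(grep -c 'GPU hung' "$logfile" || true)
    segv=$(grep -c 'compiz\[[0-9]*\]: segfault' "$logfile" || true)
    echo "gpu_hangs=$hangs compiz_segfaults=$segv"
}
```

Run it as gpu_hang_summary with no arguments to scan /var/log/syslog, or pass a rotated log such as /var/log/syslog.1.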
You can see that the kernel uses the i915 driver for this chipset by default:
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09) (prog-if 00 [VGA controller])
Subsystem: Lenovo Device 21cf
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 43
Region 0: Memory at f0000000 (64-bit, non-prefetchable) [size=4M]
Region 2: Memory at e0000000 (64-bit, prefetchable) [size=256M]
Region 4: I/O ports at 6000 [size=64]
Expansion ROM at <unassigned> [disabled]
Capabilities: <access denied>
Kernel driver in use: i915
Kernel modules: i915
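If you want to pull just the driver name out of that output, a small filter along these lines should do it. This is a sketch, not anything official: vga_driver is a hypothetical helper name, and it assumes the lspci -v layout shown above (device header at the start of the line, details indented).

```shell
# Print the kernel driver bound to each VGA controller, reading
# "lspci -v" style output from stdin.
vga_driver() {
    awk '
        # A line starting with a hex digit begins a new device entry.
        /^[0-9a-f]/ { in_vga = ($0 ~ /VGA compatible controller/) }
        # Inside a VGA entry, print only the driver name.
        in_vga && /Kernel driver in use:/ {
            sub(/.*Kernel driver in use: */, "")
            print
        }
    '
}
```

On a live system you would use it as: lspci -v | vga_driver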
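This is a brand-new machine with a fresh install of 11.04, so it has nothing to do with an upgrade or anything like that.
In summary, I suggest the following. Install the Intel driver packages (I am almost certain you already have these):
apt-get install xserver-xorg-video-intel libdrm-intel1
Then install the matching debug packages:
apt-get install libdrm-intel1-dbg xserver-xorg-video-intel-dbg
Then boot the kernel with DRM debugging enabled (drm.debug=0x06) and mount debugfs: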
sudo mount -t debugfs debugfs /sys/kernel/debug
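After rebooting, it can be worth confirming that both settings actually took effect. The helpers below are a sketch with made-up names (drm_debug_value, debugfs_mounted); they only read /proc/cmdline and /proc/mounts. On Ubuntu the parameter is usually made persistent by adding it to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub and running update-grub, but that is my assumption about your setup and not something the steps above depend on.

```shell
# Print the value of drm.debug from a kernel command line, if present.
# Reads /proc/cmdline by default; another file can be passed instead.
drm_debug_value() {
    cmdline_file="${1:-/proc/cmdline}"
    # Put each parameter on its own line, then keep only drm.debug=...
    tr ' ' '\n' < "$cmdline_file" | sed -n 's/^drm\.debug=//p'
}

# Succeed (exit status 0) if a debugfs filesystem appears in the
# mounts table (/proc/mounts by default).
debugfs_mounted() {
    mounts_file="${1:-/proc/mounts}"
    # The third field of each mounts line is the filesystem type.
    awk '$3 == "debugfs" { found = 1 } END { exit !found }' "$mounts_file"
}
```

For example: drm_debug_value should print 0x06, and debugfs_mounted && echo "debugfs is mounted" should print its message.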
In addition, you can use ulimit to configure core dumps on the system:
ulimit -c unlimited
ulimit -s unlimited
(etc.)
Verify the changes with ulimit -a
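Keep in mind that ulimit only changes the limits of the current shell and the processes started from it. A quick check like the one below (core_limit_ok is just an illustrative name) tells you whether the shell you are in would allow core files at all.

```shell
# Succeed if the current shell allows core files of non-zero size.
core_limit_ok() {
    limit=$(ulimit -c)
    # "ulimit -c" prints either "unlimited" or a number.
    [ "$limit" = "unlimited" ] || [ "$limit" -gt 0 ]
}
```

For example: core_limit_ok || echo "core dumps are disabled in this shell"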
When the problem happens again, you can run /usr/bin/intel_gpu_dump after the GPU hangs to get more details about the GPU state.
You may also find more information in /sys/kernel/debug/dri/0/i915_error_state after the incident.
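Since that file lives in debugfs and will not survive a reboot, it makes sense to copy it somewhere as soon as you regain control of the machine. A minimal sketch, assuming you run it as root (debugfs is normally readable only by root); save_error_state and the output file name are my own invention:

```shell
# Copy the GPU error state to a timestamped file for a bug report.
# Usage: save_error_state [source] [destdir]
save_error_state() {
    src="${1:-/sys/kernel/debug/dri/0/i915_error_state}"
    destdir="${2:-.}"
    dest="$destdir/i915_error_state.$(date +%Y%m%d-%H%M%S).txt"
    # Print the name of the saved file only if the copy succeeded.
    cat "$src" > "$dest" && echo "$dest"
}
```

Called without arguments, it writes the copy into the current directory and prints the file name, which you can then attach to a bug report.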
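You can also extract stack information from the generated core file (usually found under /).
All in all, this looks like a bug to me. You can take this information, together with a report of your software stack, and file a formal bug report.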