CPU hardware errors in Ubuntu 17.04

Can someone explain these error messages I'm seeing in dmesg? I'm new to Ubuntu and the Linux world.

[ 7.802351] CPU4: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 7.802352] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 7.802353] CPU5: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 7.802354] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 7.802354] CPU4: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 7.802356] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 7.802356] mce: [Hardware Error]: Machine check events logged
[ 7.802362] mce: [Hardware Error]: CPU 4: Machine Check: 0 Bank 128: 00000000880a0003
[ 7.802363] mce: [Hardware Error]: TSC 99561677c
[ 7.802385] mce: [Hardware Error]: PROCESSOR 0:506e3 TIME 1501537538 SOCKET 0 APIC 1 microcode ba
[ 7.802387] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 128: 00000000880a0003
[ 7.802387] mce: [Hardware Error]: TSC 995616be4
[ 7.802388] mce: [Hardware Error]: PROCESSOR 0:506e3 TIME 1501537538 SOCKET 0 APIC 0 microcode ba
[ 7.802389] CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 7.802390] CPU6: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 7.802391] CPU3: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 7.802392] CPU7: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 7.826359] CPU4: Core temperature/speed normal
[ 7.826359] CPU0: Core temperature/speed normal
[ 7.826360] CPU2: Package temperature/speed normal
[ 7.826361] CPU6: Package temperature/speed normal
[ 7.826361] CPU0: Package temperature/speed normal
[ 7.826362] CPU4: Package temperature/speed normal
[ 7.826363] mce: [Hardware Error]: Machine check events logged
[ 7.826367] mce: [Hardware Error]: CPU 4: Machine Check: 0 Bank 128: 00000000880b0002
[ 7.826368] mce: [Hardware Error]: TSC 99916f004
[ 7.826369] mce: [Hardware Error]: PROCESSOR 0:506e3 TIME 1501537538 SOCKET 0 APIC 1 microcode ba
[ 7.826369] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 128: 00000000880b0002
[ 7.826370] mce: [Hardware Error]: TSC 99916f2ca
[ 7.826370] mce: [Hardware Error]: PROCESSOR 0:506e3 TIME 1501537538 SOCKET 0 APIC 0 microcode ba
[ 7.826400] CPU1: Package temperature/speed normal
[ 7.826401] CPU5: Package temperature/speed normal
[ 7.826402] CPU3: Package temperature/speed normal
[ 7.826402] CPU7: Package temperature/speed normal
[ 467.922330] CPU4: Core temperature above threshold, cpu clock throttled (total events = 73)
[ 467.922331] CPU0: Core temperature above threshold, cpu clock throttled (total events = 73)
[ 467.922332] CPU7: Package temperature above threshold, cpu clock throttled (total events = 86)
[ 467.922333] CPU3: Package temperature above threshold, cpu clock throttled 

I'm running Ubuntu 17.04 with the 4.10.0-29-generic kernel.

Answer 1

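Your CPU is overheating, throttling its clock, and logging Machine Check Events (MCEs). The log shows it recovering each time ("temperature/speed normal"), but sustained overheating can lead to instability or crashes. If you see no other temperature-related events in the system logs, the most likely cause is that the CPU cooler, fan, heat pipe, or thermal paste isn't doing its job.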
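
To see what the sensors report right now, a minimal sketch like the one below reads the kernel's thermal zones from sysfs. It assumes your kernel exposes zones under /sys/class/thermal, where each temp file holds millidegrees Celsius. The function name show_temps is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print each thermal zone's type and temperature.
# Assumes zones live under /sys/class/thermal (pass another directory
# as the first argument to override).
show_temps() {
    base="${1:-/sys/class/thermal}"
    for zone in "$base"/thermal_zone*; do
        # Skip zones that are missing or whose temperature is unreadable.
        [ -r "$zone/temp" ] || continue
        milli=$(cat "$zone/temp" 2>/dev/null) || continue
        type=$(cat "$zone/type" 2>/dev/null || echo unknown)
        # The kernel reports millidegrees Celsius; convert to whole degrees.
        echo "$type: $((milli / 1000)) C"
    done
}
```

After defining the function, run show_temps with no arguments to print one line per zone. Which zone names appear depends on the hardware and the drivers that are loaded.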

  • Check the system logs by running this command in a terminal:

    grep -i -e temp -e therm /var/log/syslog*
    
  • If the machine is dirty or full of dust, that can cause it to overheat. Clean it out.

  • If your machine has an Intel processor, make sure the intel-microcode package is installed.

    sudo apt-get update
    sudo apt-get install intel-microcode
    reboot
    
  • Install thermald to try to keep temperatures under control.

    sudo apt-get update
    sudo apt-get install thermald
    reboot
    
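  • Check your BIOS version. Enter the BIOS setup at power-on and note the version number. Then go to the manufacturer's website, find your computer's make/model, open the support/downloads section, and see whether a newer BIOS is available.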

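  • Finally, if this is an older machine, there's a good chance the thermal compound between the processor and the heat pipe/fan cooler needs to be reapplied. This requires some technical experience.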
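
To judge whether any of these steps helped, one rough approach is to count the throttle messages per CPU before and after the change. The sketch below assumes the kernel log lines look like the ones in the question. The function name count_throttle_lines is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print how many
# "temperature above threshold" messages each CPU produced.
count_throttle_lines() {
    awk '/temperature above threshold/ {
             # The CPU label is the first field that looks like "CPUn:".
             for (i = 1; i <= NF; i++) {
                 if ($i ~ /^CPU[0-9]+:$/) {
                     cpu = $i
                     sub(/:$/, "", cpu)
                     n[cpu]++
                     break
                 }
             }
         }
         END { for (cpu in n) print cpu, n[cpu] }' | sort
}
```

A typical use would be dmesg | count_throttle_lines (dmesg may need sudo on some systems). Fewer lines after the same uptime and workload suggest the cooling has improved.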
