What does this /dead.letter file talking about a SMART warning mean?

I just found this file, dead.letter, in my root directory. It is about two months old:

$ ll /dead.letter 
-rw------- 1 root root      638 Sep 23 02:44 /dead.letter

Its contents are as follows:

Date: Fri, 23 Sep 2016 02:44:47 +0200
To: root
Subject: SMART error (FailedOpenDevice) detected on host:
 BC-AlkaliMetal
User-Agent: s-nail v14.8.6

This message was generated by the smartd daemon running on:

   host name:  BC-AlkaliMetal
   DNS domain: [Empty]

The following warning/error was logged by the smartd daemon:

Device: /dev/sda [SAT], unable to open device

Device info:
WDC WD10JPVX-22JC3T0, S/N:WD-WXH1E65DXFLK, WWN:5-0014ee-65bab5da7, FW:01.01A01, 1.00 TB

For details see host's SYSLOG.

You can also use the smartctl utility for further investigation.
Another message will be sent in 24 hours if the problem persists.

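Because the file is almost two months old, I unfortunately can no longer tell under what circumstances it was created. However, my laptop is less than a year old, the SMART status currently reported by gnome-disks is fine, and a short self-test also completed successfully.

So what does all this mean, why do I have this file, and do I need to worry about the error/warning it mentions?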

Answer 1

  • I just came across one today and was curious what was inside dead.letter (was it written by a hacker? :D). The content is similar:

    Date: Thu, 08 Dec 2016 00:48:26 +0100
    To: root
    Subject: SMART error (FailedOpenDevice) detected on host:
     user.dz-blueskies
    User-Agent: s-nail v14.8.6
    
    This message was generated by the smartd daemon running on:
    
       host name:  user.dz-blueskies
       DNS domain: [Empty]
    
    The following warning/error was logged by the smartd daemon:
    
    Device: /dev/sdb [SAT], unable to open device
    
    Device info:
    WDC WD20NMVW-11W68S0, S/N:WD-WX51A82P0486, WWN:5-0014ee-25cb067e3, FW:01.01A01, 
    2.00 TB
    
    For details see host's SYSLOG.
    
    You can also use the smartctl utility for further investigation.
    Another message will be sent in 24 hours if the problem persists.
    

    The User-Agent, s-nail, is a mail tool. Checking with man s-nail | grep -n dead confirms what Rinzwind said:

    2334:     DEAD    The name of the file to use for saving aborted messages if save is set; this defaults to dead.letter in the user's HOME directory.
    2507:               DEAD=+dead.mbox
    

    smartd is configured to send email to the root user. From /etc/smartd.conf:

    DEVICESCAN -d removable -n standby -m root -M exec /usr/share/smartmontools/smartd-runner
    

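    The error means that smartd could not find an accessible device node /dev/sdb (in my case, an external USB drive). This probably happened after the drive was not ejected cleanly. I could not test that right away, because I would have to wait 30 minutes for the next smartd scan.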

    $ grep smartd  /var/log/syslog
    Dec  8 00:48:26 user.dz-blueskies smartd[1086]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 117 to 109
    Dec  8 00:48:26 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 00:48:26 user.dz-blueskies smartd[1086]: Sending warning via /usr/share/smartmontools/smartd-runner to root ...
    Dec  8 00:48:26 user.dz-blueskies smartd[1086]: Warning via /usr/share/smartmontools/smartd-runner to root produced unexpected output (118 bytes) to STDOUT/STDERR:
    Dec  8 00:48:26 user.dz-blueskies smartd[1086]: /etc/smartmontools/run.d/10mail:
    Dec  8 00:48:26 user.dz-blueskies smartd[1086]: Cannot start "/usr/sbin/sendmail": executable not found (adjust *sendmail* variable)
    Dec  8 00:48:26 user.dz-blueskies smartd[1086]: Warning via /usr/share/smartmontools/smartd-runner to root: successful
    Dec  8 01:18:25 user.dz-blueskies smartd[1086]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 109 to 107
    Dec  8 01:18:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 01:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 107 to 106
    Dec  8 01:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 02:18:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 02:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 03:18:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 03:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 04:18:25 user.dz-blueskies smartd[1086]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 106 to 107
    Dec  8 04:18:26 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 04:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 107 to 106
    Dec  8 04:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 05:18:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 05:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 06:18:25 user.dz-blueskies smartd[1086]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 106 to 108
    Dec  8 06:18:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 06:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 108 to 109
    Dec  8 06:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 07:18:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 07:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 08:18:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 08:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 09:18:26 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 09:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 109 to 110
    Dec  8 09:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 10:18:25 user.dz-blueskies smartd[1086]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 110 to 108
    Dec  8 10:18:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 10:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 108 to 109
    Dec  8 10:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 11:18:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 11:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 109 to 108
    Dec  8 11:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 12:18:25 user.dz-blueskies smartd[1086]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 108 to 109
    Dec  8 12:18:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 12:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 109 to 108
    Dec  8 12:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 13:18:25 user.dz-blueskies smartd[1086]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 108 to 109
    Dec  8 13:18:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 13:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 14:18:26 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open() failed: No such device
    Dec  8 14:48:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], open device worked again, warning condition reset after 1 email
    Dec  8 14:48:26 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 120 to 128
    Dec  8 15:18:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], SMART Prefailure Attribute: 3 Spin_Up_Time changed from 211 to 210
    Dec  8 15:18:25 user.dz-blueskies smartd[1086]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 128 to 121
    

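    By the way, these temperature readings do not look right as degrees Celsius; they would be more plausible in Fahrenheit. A likely explanation is that smartd logs the normalized attribute value by default rather than the raw temperature.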
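
One plausible reading of the excerpts above, offered as an inference rather than something confirmed here: smartd tried to mail the warning, s-nail could not start /usr/sbin/sendmail, and the unsent message was saved as dead.letter. A minimal shell sketch for checking both conditions on your own machine follows. device_state and mailer_state are hypothetical helper names, and /dev/sdb is only the device from this example.

```shell
#!/bin/sh
# device_state and mailer_state are hypothetical helpers for illustration;
# they are not part of smartmontools or s-nail.

# Print "present" if the path is a block device node, otherwise "missing".
device_state() {
    if [ -b "$1" ]; then echo present; else echo missing; fi
}

# Print "found" if the program is on PATH (or is an executable path),
# otherwise "not found".
mailer_state() {
    if command -v "$1" >/dev/null 2>&1; then echo found; else echo "not found"; fi
}

# /dev/sdb is the device from the log excerpt; adjust it to your own system.
echo "/dev/sdb: $(device_state /dev/sdb)"
echo "sendmail: $(mailer_state /usr/sbin/sendmail)"
```

If the device is reported missing while the drive is unplugged, and sendmail is not found, that would match the two messages in the syslog excerpt.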

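  • Even after a safe eject I get the same syslog messages: smartd scans the devices at startup and then keeps monitoring them after they have been ejected.

    Your message is an old one, from 23 September 2016, almost 3 months ago. No logs are left, the problem has not recurred, and there is no pending mail. I think we need a way to reproduce it, which can be done by reducing the smartd interval in its systemd service file.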

    $ sudo vim /lib/systemd/system/smartd.service
    ExecStart=/usr/sbin/smartd -n -i 10 $smartd_opts
    

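    -i 10 sets the interval to 10 seconds. Note, however, that smartd treats this as the interval for polling the devices, not for reporting (I saw gaps of 11 seconds to 5 minutes between messages).

    Restart the service: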

    sudo systemctl daemon-reload
    sudo systemctl restart smartd
    

    To watch the messages in a terminal, use the following command:

    tail -f /var/log/syslog | grep smartd
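
To see how far apart the messages for one device really are after changing the interval, the gaps can be computed from the syslog timestamps. This is only a sketch: message_gaps is a hypothetical helper name, and it assumes the classic syslog line layout shown above, with all lines from the same day.

```shell
#!/bin/sh
# message_gaps is a hypothetical helper, not part of smartmontools.
# It reads syslog-style lines on stdin and prints the number of seconds
# between consecutive lines that mention the given device.
# Assumption: all timestamps fall on the same day.
message_gaps() {
    awk -v dev="$1" '
        index($0, "Device: " dev " ") {
            split($3, t, ":")                      # $3 is the HH:MM:SS field
            now = t[1] * 3600 + t[2] * 60 + t[3]
            if (seen) print now - prev
            prev = now
            seen = 1
        }'
}

# Example use on a real system:
# grep smartd /var/log/syslog | message_gaps /dev/sdb
```

With the default 30-minute interval the printed gaps should be close to 1800. After switching to -i 10 they should drop, though not necessarily to exactly 10.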
    
