My daily driver (Debian Bookworm RC3 + KDE Plasma) is configured to send me emails with error notifications.
Today, I received the following email:
This message was generated by the smartd daemon running on:
host name: desk
DNS domain: local.lan
The following warning/error was logged by the smartd daemon:
Device: /dev/nvme0, number of Error Log entries increased from 1754 to 1758
Device info:
KBG30ZMV256G TOSHIBA, S/N:X8OPD1PGP12P, FW:ADHA0101
For details see host's SYSLOG.
You can also use the smartctl utility for further investigation.
The original message about this issue was sent at Wed May 17 16:09:04 2023 EDT
Another message will be sent in 24 hours if the problem persists.
Running sudo journalctl -t smartd shows:
May 20 15:19:47 desk smartd[550]: smartd 7.3 2022-02-28 r5338 [x86_64-linux-6.1.0-9-amd64] (local build)
May 20 15:19:47 desk smartd[550]: Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
May 20 15:19:47 desk smartd[550]: Opened configuration file /etc/smartd.conf
May 20 15:19:47 desk smartd[550]: Drive: DEVICESCAN, implied '-a' Directive on line 21 of file /etc/smartd.conf
May 20 15:19:47 desk smartd[550]: Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
May 20 15:19:47 desk smartd[550]: Device: /dev/sda, type changed from 'scsi' to 'sat'
May 20 15:19:47 desk smartd[550]: Device: /dev/sda [SAT], opened
May 20 15:19:47 desk smartd[550]: Device: /dev/sda [SAT], CT4000MX500SSD1, S/N:2304E6A3D318, WWN:5-00a075-1e6a3d318, FW:M3CR045, 4.00 TB
May 20 15:19:47 desk smartd[550]: Device: /dev/sda [SAT], not found in smartd database 7.3/5319.
May 20 15:19:47 desk smartd[550]: Device: /dev/sda [SAT], is SMART capable. Adding to "monitor" list.
May 20 15:19:47 desk smartd[550]: Device: /dev/sda [SAT], state read from /var/lib/smartmontools/smartd.CT4000MX500SSD1-2304E6A3D318.ata.state
May 20 15:19:47 desk smartd[550]: Device: /dev/nvme0, opened
May 20 15:19:47 desk smartd[550]: Device: /dev/nvme0, KBG30ZMV256G TOSHIBA, S/N:X8OPD1PGP12P, FW:ADHA0101
May 20 15:19:47 desk smartd[550]: Device: /dev/nvme0, is SMART capable. Adding to "monitor" list.
May 20 15:19:47 desk smartd[550]: Device: /dev/nvme0, state read from /var/lib/smartmontools/smartd.KBG30ZMV256G_TOSHIBA-X8OPD1PGP12P.nvme.state
May 20 15:19:47 desk smartd[550]: Monitoring 1 ATA/SATA, 0 SCSI/SAS and 1 NVMe devices
May 20 15:19:48 desk smartd[550]: Device: /dev/nvme0, number of Error Log entries increased from 1754 to 1758
May 20 15:19:48 desk smartd[550]: Sending warning via /usr/share/smartmontools/smartd-runner to root ...
May 20 15:19:48 desk smartd[550]: Warning via /usr/share/smartmontools/smartd-runner to root: successful
May 20 15:19:48 desk smartd[550]: Device: /dev/sda [SAT], state written to /var/lib/smartmontools/smartd.CT4000MX500SSD1-2304E6A3D318.ata.state
May 20 15:19:48 desk smartd[550]: Device: /dev/nvme0, state written to /var/lib/smartmontools/smartd.KBG30ZMV256G_TOSHIBA-X8OPD1PGP12P.nvme.state
May 20 15:49:48 desk smartd[550]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 73 to 74
May 20 22:49:48 desk smartd[550]: Device: /dev/nvme0, number of Error Log entries increased from 1758 to 1760
When I run sudo smartctl -i -a /dev/nvme0, it shows the error count, but I can't figure out how to view the log messages associated with the increasing count:
smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.1.0-9-amd64] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Number: KBG30ZMV256G TOSHIBA
Serial Number: X8OPD1PGP12P
Firmware Version: ADHA0101
PCI Vendor/Subsystem ID: 0x1179
IEEE OUI Identifier: 0x00080d
Controller ID: 0
NVMe Version: 1.2.1
Number of Namespaces: 1
Namespace 1 Size/Capacity: 256,060,514,304 [256 GB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: 00080d 04004ad9aa
Local Time is: Sat May 20 23:09:32 2023 EDT
Firmware Updates (0x12): 1 Slot, no Reset required
Optional Admin Commands (0x0017): Security Format Frmw_DL Self_Test
Optional NVM Commands (0x0017): Comp Wr_Unc DS_Mngmt Sav/Sel_Feat
Log Page Attributes (0x02): Cmd_Eff_Lg
Maximum Data Transfer Size: 512 Pages
Warning Comp. Temp. Threshold: 82 Celsius
Critical Comp. Temp. Threshold: 85 Celsius
Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     3.30W       -        -    0  0  0  0        0       0
 1 +     2.70W       -        -    1  1  1  1        0       0
 2 +     2.30W       -        -    2  2  2  2        0       0
 3 -   0.0500W       -        -    4  4  4  4     8000   32000
 4 -   0.0050W       -        -    4  4  4  4     8000   40000
Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 -    4096       0         0
 1 +     512       0         3
=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 32 Celsius
Available Spare: 100%
Available Spare Threshold: 10%
Percentage Used: 30%
Data Units Read: 23,188,612 [11.8 TB]
Data Units Written: 39,727,036 [20.3 TB]
Host Read Commands: 222,771,983
Host Write Commands: 498,052,687
Controller Busy Time: 7,440
Power Cycles: 291
Power On Hours: 20,378
Unsafe Shutdowns: 615
Media and Data Integrity Errors: 0
Error Information Log Entries: 1,760
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 1: 32 Celsius
Error Information (NVMe Log 0x01, 16 of 64 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
  0       1760     0  0x501a  0xc005  0x028            -     1     -
  1       1759     0  0xb012  0xc005  0x028            -     1     -
  2       1758     0  0x5010  0xc005  0x028            -     0     -
How can I find out what the errors are?
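One possible starting point is the Status column of the Error Information table above. The sketch below is only a guess at how that value could be split up. It assumes that smartctl 7.3 prints the raw 16-bit status word from the NVMe error log entry, with bit 0 being the phase tag and the remaining bits laid out as in the NVMe specification's completion status field. The function name decode_status and the small lookup table are made up for illustration and cover only a few generic status codes.

```python
# Hypothetical helper: split a raw NVMe status word into its fields.
# Assumption: bit 0 is the phase tag, bits 8:1 the status code (SC),
# bits 11:9 the status code type (SCT), bit 14 "More", bit 15 "Do Not Retry".

# Partial table of generic (SCT = 0) status codes, for illustration only.
GENERIC_STATUS_CODES = {
    0x00: "Successful Completion",
    0x01: "Invalid Command Opcode",
    0x02: "Invalid Field in Command",
}


def decode_status(raw: int) -> dict:
    """Return the individual fields of a 16-bit status word."""
    status = raw >> 1  # drop the assumed phase tag in bit 0
    sc = status & 0xFF
    sct = (status >> 8) & 0x7
    fields = {
        "phase": raw & 0x1,
        "status_code": sc,
        "status_code_type": sct,
        "more": (status >> 13) & 0x1,
        "do_not_retry": (status >> 14) & 0x1,
    }
    # Only generic codes are named here; anything else is left undecoded.
    if sct == 0:
        fields["meaning"] = GENERIC_STATUS_CODES.get(sc, "unknown generic code")
    else:
        fields["meaning"] = "not decoded by this sketch"
    return fields


if __name__ == "__main__":
    # 0xc005 is the value shown in every row of the table above.
    for key, value in decode_status(0xC005).items():
        print(f"{key}: {value}")
```

If that layout assumption holds, 0xc005 would correspond to a generic "Invalid Field in Command" status with the Do Not Retry bit set, which would point at a command the drive rejected rather than at a media failure. That would at least be consistent with Media and Data Integrity Errors staying at 0, but it does not say which command triggered it.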