How do I view the SMART log of an NVMe disk in Linux when smartctl reports errors?

My daily driver (Debian Bookworm RC3 + KDE Plasma) is configured to email me error notifications.

Today I received the following email:

This message was generated by the smartd daemon running on:

   host name:  desk
   DNS domain: local.lan

The following warning/error was logged by the smartd daemon:

Device: /dev/nvme0, number of Error Log entries increased from 1754 to 1758

Device info:
KBG30ZMV256G TOSHIBA, S/N:X8OPD1PGP12P, FW:ADHA0101

For details see host's SYSLOG.

You can also use the smartctl utility for further investigation.
The original message about this issue was sent at Wed May 17 16:09:04 2023 EDT
Another message will be sent in 24 hours if the problem persists.

sudo journalctl -t smartd shows:

May 20 15:19:47 desk smartd[550]: smartd 7.3 2022-02-28 r5338 [x86_64-linux-6.1.0-9-amd64] (local build)
May 20 15:19:47 desk smartd[550]: Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org
May 20 15:19:47 desk smartd[550]: Opened configuration file /etc/smartd.conf
May 20 15:19:47 desk smartd[550]: Drive: DEVICESCAN, implied '-a' Directive on line 21 of file /etc/smartd.conf
May 20 15:19:47 desk smartd[550]: Configuration file /etc/smartd.conf was parsed, found DEVICESCAN, scanning devices
May 20 15:19:47 desk smartd[550]: Device: /dev/sda, type changed from 'scsi' to 'sat'
May 20 15:19:47 desk smartd[550]: Device: /dev/sda [SAT], opened
May 20 15:19:47 desk smartd[550]: Device: /dev/sda [SAT], CT4000MX500SSD1, S/N:2304E6A3D318, WWN:5-00a075-1e6a3d318, FW:M3CR045, 4.00 TB
May 20 15:19:47 desk smartd[550]: Device: /dev/sda [SAT], not found in smartd database 7.3/5319.
May 20 15:19:47 desk smartd[550]: Device: /dev/sda [SAT], is SMART capable. Adding to "monitor" list.
May 20 15:19:47 desk smartd[550]: Device: /dev/sda [SAT], state read from /var/lib/smartmontools/smartd.CT4000MX500SSD1-2304E6A3D318.ata.state
May 20 15:19:47 desk smartd[550]: Device: /dev/nvme0, opened
May 20 15:19:47 desk smartd[550]: Device: /dev/nvme0, KBG30ZMV256G TOSHIBA, S/N:X8OPD1PGP12P, FW:ADHA0101
May 20 15:19:47 desk smartd[550]: Device: /dev/nvme0, is SMART capable. Adding to "monitor" list.
May 20 15:19:47 desk smartd[550]: Device: /dev/nvme0, state read from /var/lib/smartmontools/smartd.KBG30ZMV256G_TOSHIBA-X8OPD1PGP12P.nvme.state
May 20 15:19:47 desk smartd[550]: Monitoring 1 ATA/SATA, 0 SCSI/SAS and 1 NVMe devices
May 20 15:19:48 desk smartd[550]: Device: /dev/nvme0, number of Error Log entries increased from 1754 to 1758
May 20 15:19:48 desk smartd[550]: Sending warning via /usr/share/smartmontools/smartd-runner to root ...
May 20 15:19:48 desk smartd[550]: Warning via /usr/share/smartmontools/smartd-runner to root: successful
May 20 15:19:48 desk smartd[550]: Device: /dev/sda [SAT], state written to /var/lib/smartmontools/smartd.CT4000MX500SSD1-2304E6A3D318.ata.state
May 20 15:19:48 desk smartd[550]: Device: /dev/nvme0, state written to /var/lib/smartmontools/smartd.KBG30ZMV256G_TOSHIBA-X8OPD1PGP12P.nvme.state
May 20 15:49:48 desk smartd[550]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 73 to 74
May 20 22:49:48 desk smartd[550]: Device: /dev/nvme0, number of Error Log entries increased from 1758 to 1760

When I run sudo smartctl -i -a /dev/nvme0, it shows the error count, but I don't know how to view the log messages associated with the increases:

smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.1.0-9-amd64] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       KBG30ZMV256G TOSHIBA
Serial Number:                      X8OPD1PGP12P
Firmware Version:                   ADHA0101
PCI Vendor/Subsystem ID:            0x1179
IEEE OUI Identifier:                0x00080d
Controller ID:                      0
NVMe Version:                       1.2.1
Number of Namespaces:               1
Namespace 1 Size/Capacity:          256,060,514,304 [256 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            00080d 04004ad9aa
Local Time is:                      Sat May 20 23:09:32 2023 EDT
Firmware Updates (0x12):            1 Slot, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x0017):     Comp Wr_Unc DS_Mngmt Sav/Sel_Feat
Log Page Attributes (0x02):         Cmd_Eff_Lg
Maximum Data Transfer Size:         512 Pages
Warning  Comp. Temp. Threshold:     82 Celsius
Critical Comp. Temp. Threshold:     85 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     3.30W       -        -    0  0  0  0        0       0
 1 +     2.70W       -        -    1  1  1  1        0       0
 2 +     2.30W       -        -    2  2  2  2        0       0
 3 -   0.0500W       -        -    4  4  4  4     8000   32000
 4 -   0.0050W       -        -    4  4  4  4     8000   40000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 -    4096       0         0
 1 +     512       0         3

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        32 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    30%
Data Units Read:                    23,188,612 [11.8 TB]
Data Units Written:                 39,727,036 [20.3 TB]
Host Read Commands:                 222,771,983
Host Write Commands:                498,052,687
Controller Busy Time:               7,440
Power Cycles:                       291
Power On Hours:                     20,378
Unsafe Shutdowns:                   615
Media and Data Integrity Errors:    0
Error Information Log Entries:      1,760
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               32 Celsius

Error Information (NVMe Log 0x01, 16 of 64 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
  0       1760     0  0x501a  0xc005  0x028            -     1     -
  1       1759     0  0xb012  0xc005  0x028            -     1     -
  2       1758     0  0x5010  0xc005  0x028            -     0     -

How can I find out what the errors are?

Answer 1

Try installing the nvme-cli package with

apt-get install nvme-cli

Then retrieve the error log entries with

nvme error-log /dev/nvme0
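
The Status and PELoc columns that smartctl already prints can also be decoded by hand. Below is a minimal sketch, assuming that smartctl 7.3 prints the raw 16-bit status field with the phase tag in bit 0 (bit 15 being set in 0xc005 suggests this) and that the bit layout follows the NVMe base specification. The function names decode_status and decode_peloc are made up for illustration; they are not part of smartmontools or nvme-cli.

```shell
#!/bin/sh
# Decode the Status and PELoc columns of a smartctl NVMe error log row.
# Assumption: Status is the raw 16-bit field, with the phase tag in bit 0.

decode_status() {
    raw=$(( $1 ))
    sf=$(( raw >> 1 ))            # drop the phase tag (bit 0)
    sc=$(( sf & 0xff ))           # Status Code
    sct=$(( (sf >> 8) & 0x7 ))    # Status Code Type (0 = generic command status)
    more=$(( (sf >> 13) & 1 ))    # More
    dnr=$(( (sf >> 14) & 1 ))     # Do Not Retry
    printf 'SCT=0x%x SC=0x%02x More=%d DNR=%d\n' "$sct" "$sc" "$more" "$dnr"
}

decode_peloc() {
    raw=$(( $1 ))
    # Bits 7:0 give the byte offset in the command, bits 10:8 the bit offset.
    printf 'byte=%d bit=%d\n' $(( raw & 0xff )) $(( (raw >> 8) & 0x7 ))
}

# Values taken from the smartctl output in the question.
decode_status 0xc005
decode_peloc 0x028
```

Under these assumptions the two calls print SCT=0x0 SC=0x02 More=1 DNR=1 and byte=40 bit=0. In the NVMe specification, generic status code 0x02 is "Invalid Field in Command", and byte 40 is the start of Command Dword 10. SQId 0 is the admin queue, so these entries look like rejected admin commands rather than media failures, which would be consistent with "Media and Data Integrity Errors: 0". Treat this as a reading aid and confirm it against the nvme error-log output.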
