Ext4 going wild

My main hard drive, a Barracuda 7200.12 SATA 6Gb/s 250GB ST3250312AS (2 years old), has gone wild. It started on 2014-06-28.

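The partitions sda1 (root), sda5 (home), and sda7 (var) sometimes hit this on random files. Sometimes it is my home partition, when I download something through the browser. Sometimes it is /var, where my .rrd files live: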

[82845.340334] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[82845.340346] ata1.00: irq_stat 0x40000001
[82845.340347] ata1.00: failed command: READ DMA
[82845.340351] ata1.00: cmd c8/00:40:3e:29:0f/00:00:00:00:00/ec tag 0 dma 32768 in
[82845.340351]          res 51/40:00:3e:29:0f/00:00:0c:00:00/0c Emask 0x9 (media error)
[82845.340352] ata1.00: status: { DRDY ERR }
[82845.340353] ata1.00: error: { UNC }
[82845.417764] ata1.00: configured for UDMA/133
[82845.417775] sd 0:0:0:0: [sda] Unhandled sense code
[82845.417777] sd 0:0:0:0: [sda]  
[82845.417778] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[82845.417779] sd 0:0:0:0: [sda]  
[82845.417780] Sense Key : Medium Error [current] [descriptor]
[82845.417782] Descriptor sense data with sense descriptors (in hex):
[82845.417783]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 
[82845.417789]         0c 0f 29 3e 
[82845.417791] sd 0:0:0:0: [sda]  
[82845.417792] Add. Sense: Unrecovered read error - auto reallocate failed
[82845.417793] sd 0:0:0:0: [sda] CDB: 
[82845.417794] Read(10): 28 00 0c 0f 29 3e 00 00 40 00
[82845.417799] end_request: I/O error, dev sda, sector 202320190
[82845.417809] ata1: EH complete
[82848.253943] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[82848.253946] ata1.00: irq_stat 0x40000001
[82848.253955] ata1.00: failed command: READ DMA
[82848.253958] ata1.00: cmd c8/00:08:3e:29:0f/00:00:00:00:00/ec tag 0 dma 4096 in
[82848.253958]          res 51/40:00:3e:29:0f/00:00:0c:00:00/0c Emask 0x9 (media error)
[82848.253960] ata1.00: status: { DRDY ERR }
[82848.253961] ata1.00: error: { UNC }
[82848.264595] ata1.00: configured for UDMA/133
[82848.264610] sd 0:0:0:0: [sda] Unhandled sense code
[82848.264611] sd 0:0:0:0: [sda]  
[82848.264612] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[82848.264614] sd 0:0:0:0: [sda]  
[82848.264615] Sense Key : Medium Error [current] [descriptor]
[82848.264617] Descriptor sense data with sense descriptors (in hex):
[82848.264618]         72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 
[82848.264623]         0c 0f 29 3e 
[82848.264626] sd 0:0:0:0: [sda]  
[82848.264627] Add. Sense: Unrecovered read error - auto reallocate failed
[82848.264628] sd 0:0:0:0: [sda] CDB: 
[82848.264629] Read(10): 28 00 0c 0f 29 3e 00 00 08 00
[82848.264634] end_request: I/O error, dev sda, sector 202320190
[82848.264648] ata1: EH complete
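
As a cross-check on the log above, the two Read(10) CDBs can be decoded by hand to see which sector the kernel was asking for and how much it tried to read. The following is only a small sketch: the function name `decode_read10` is made up for illustration, and the 512-byte sector size is an assumption (it is consistent with the "dma 32768 in" and "dma 4096 in" figures in the log).

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def decode_read10(cdb_hex):
    """Return (starting LBA, sector count) from a SCSI Read(10) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte Read(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")    # bytes 2-5: logical block address
    count = int.from_bytes(cdb[7:9], "big")  # bytes 7-8: transfer length in blocks
    return lba, count


# The two CDBs copied from the kernel log above.
for cdb in ("28 00 0c 0f 29 3e 00 00 40 00", "28 00 0c 0f 29 3e 00 00 08 00"):
    lba, count = decode_read10(cdb)
    print(lba, count, count * SECTOR_SIZE)
```

Both requests start at LBA 202320190, the same sector named in the "end_request: I/O error" lines; the first asks for 64 sectors (32768 bytes) and the second for 8 sectors (4096 bytes). In other words, both reads fail on the very same sector.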

I already have a full backup, so I am not worried.

I am running the latest Ubuntu with the latest kernel.

Linux Ubuntu 3.13.0-30-lowlatency #54-Ubuntu SMP PREEMPT Mon Jun 9 23:14:29 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

Google turns up many similar questions, with good answers, on many different forums.

I want to figure out whether this is a hardware failure or the result of this patch (committed 30 hours ago) to the Linux kernel.

Any suggestions?

Edit 1: I found the perfect tool:

# sdparm --page=rw --long --long --long /dev/sda
    /dev/sda: ATA       ST3250312AS       
    Direct access device specific parameters: WP=0  DPOFUA=0
Read write error recovery [rw] mode page [PS=0]:
  AWRE        1  [cha: n, def:  1]  Automatic write reallocation enabled
  ARRE        0  [cha: n, def:  0]  Automatic read reallocation enabled
  TB          0  [cha: n, def:  0]  Transfer block
  RC          0  [cha: n, def:  0]  Read continuous
        0: error recovery may cause delays
        1: transfer data without waiting for error recovery
  EER         0  [cha: n, def:  0]  Enable early recovery
        1: increase chance of mis-detection or mis-correction of error
  PER         0  [cha: n, def:  0]  Post error
        0: do not post recovered errors
        1: report recovered errors (via sense key: recovered error)
  DTE         0  [cha: n, def:  0]  Data terminate on error
        1: terminate data transfer when recovered error detected
  DCR         0  [cha: n, def:  0]  Disable correction
  RRC         0  [cha: n, def:  0]  Read retry count
  COR_S       0  [cha: n, def:  0]  Correction span (obsolete)
  HOC         0  [cha: n, def:  0]  Head offset count (obsolete)
  DSOC        0  [cha: n, def:  0]  Data strobe offset count (obsolete)
  LBPERE      0  [cha: n, def:  0]  Logical block provisioning error reporting enabled
  WRC         0  [cha: n, def:  0]  Write retry count
  RTL         0  [cha: n, def:  0]  Recovery time limit (ms)
        0: default, -1: 65.5 seconds

Edit 2: smartctl -a /dev/sda -> pastebin.com/VsFEbAAQ

Edit 3: Why I hope this is Ext4 and not hardware: please read this thread, bbs.archlinux.org/viewtopic.php?id=151341

Ted Ts'o is the same person who committed the latest patch; the patch in that thread was published around 2012-10-24.

Answer 1

To me, this looks like a plain hardware failure.

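That kernel patch is not running on your system, because your kernel was built nearly a month ago (your uname output shows a build date of 9 June 2014). In any case, it would be highly unusual for Ubuntu to pull in a kernel patch that has not yet appeared in a released kernel.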

If I saw these messages, I would replace the disk.
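
One way to look for further evidence of failing media is to check a few SMART attributes in the output of `smartctl -A /dev/sda`. The sketch below only parses text that you pass in; it does not run smartctl itself. The function names and the choice of attributes in `WATCHED` are assumptions made for illustration, not something taken from the SMART report linked in the question.

```python
# Attributes often watched for failing media. This selection is an assumption.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def parse_smart_attributes(text):
    """Map attribute name -> raw value string from the table printed by `smartctl -A`."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the raw value is the 10th column.
        if len(fields) >= 10 and fields[0].isdigit():
            attrs[fields[1]] = fields[9]
    return attrs


def suspicious(attrs):
    """Return the watched attributes whose raw value is a nonzero integer."""
    result = {}
    for name in WATCHED:
        raw = attrs.get(name, "")
        if raw.isdigit() and int(raw) > 0:
            result[name] = int(raw)
    return result
```

To use it, save the smartctl output to a file, read it into a string, and call `suspicious(parse_smart_attributes(text))`. A nonzero pending or reallocated sector count alongside UNC read errors like the ones in the question would point toward the disk rather than the filesystem, though an empty result does not prove the disk is healthy.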
