Question

I'm running LVM RAID 1 on two disks. Here is what lvs tells me about my VG:

root@picard:~# lvs -a -o +devices,lv_health_status,raid_sync_action,raid_mismatch_count 
  /run/lvm/lvmetad.socket: connect failed: No such file or directory
  WARNING: Failed to connect to lvmetad. Falling back to internal scanning.
  LV                 VG      Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices                                 Health          SyncAction Mismatches
  lv-data            vg-data rwi-aor-r- 2.70t                                    100.00           lv-data_rimage_0(0),lv-data_rimage_1(0) refresh needed  idle                0
  [lv-data_rimage_0] vg-data iwi-aor-r- 2.70t                                                     /dev/sda(0)                             refresh needed                       
  [lv-data_rimage_1] vg-data iwi-aor--- 2.70t                                                     /dev/sdb(1)                                                                  
  [lv-data_rmeta_0]  vg-data ewi-aor-r- 4.00m                                                     /dev/sda(708235)                        refresh needed                       
  [lv-data_rmeta_1]  vg-data ewi-aor--- 4.00m                                                     /dev/sdb(0)     

It looks like something is wrong with /dev/sda. The SMART log for that disk looks fine, so I'm hoping this is just transient, and I'd like to refresh/resync my RAID. Here's what I did:

root@picard:~# lvchange --refresh vg-data/lv-data
  /run/lvm/lvmetad.socket: connect failed: No such file or directory
  WARNING: Failed to connect to lvmetad. Falling back to internal scanning.

(…wait for a couple of minutes…)

root@picard:~# lvs -a -o +devices,lv_health_status,raid_sync_action,raid_mismatch_count
  /run/lvm/lvmetad.socket: connect failed: No such file or directory
  WARNING: Failed to connect to lvmetad. Falling back to internal scanning.
  LV                 VG      Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices                                 Health          SyncAction Mismatches
  lv-data            vg-data rwi-aor-r- 2.70t                                    100.00           lv-data_rimage_0(0),lv-data_rimage_1(0) refresh needed  idle                0
  [lv-data_rimage_0] vg-data iwi-aor-r- 2.70t                                                     /dev/sda(0)                             refresh needed                       
  [lv-data_rimage_1] vg-data iwi-aor--- 2.70t                                                     /dev/sdb(1)                                                                  
  [lv-data_rmeta_0]  vg-data ewi-aor-r- 4.00m                                                     /dev/sda(708235)                        refresh needed                       
  [lv-data_rmeta_1]  vg-data ewi-aor--- 4.00m                                                     /dev/sdb(0)               

So, did that do nothing at all? My dmesg suggests it did try to revive the RAID:

[150522.459416] device-mapper: raid: Faulty raid1 device #0 has readable super block.  Attempting to revive it.

OK, so maybe a scrub would help? Let's try it:

root@picard:~# lvchange --syncaction repair vg-data/lv-data
  /run/lvm/lvmetad.socket: connect failed: No such file or directory
  WARNING: Failed to connect to lvmetad. Falling back to internal scanning.
root@picard:~# lvs -a -o +devices,lv_health_status,raid_sync_action,raid_mismatch_count
  /run/lvm/lvmetad.socket: connect failed: No such file or directory
  WARNING: Failed to connect to lvmetad. Falling back to internal scanning.
  LV                 VG      Attr       LSize Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert Devices                                 Health          SyncAction Mismatches
  lv-data            vg-data rwi-aor-r- 2.70t                                    100.00           lv-data_rimage_0(0),lv-data_rimage_1(0) refresh needed  idle                0
  [lv-data_rimage_0] vg-data iwi-aor-r- 2.70t                                                     /dev/sda(0)                             refresh needed                       
  [lv-data_rimage_1] vg-data iwi-aor--- 2.70t                                                     /dev/sdb(1)                                                                  
  [lv-data_rmeta_0]  vg-data ewi-aor-r- 4.00m                                                     /dev/sda(708235)                        refresh needed                       
  [lv-data_rmeta_1]  vg-data ewi-aor--- 4.00m                                                     /dev/sdb(0)            

There are a couple of very strange things here:

  • SyncAction is idle, i.e. it looks as though the scrub finished instantly?
  • If the scrub finished and the array still needs a refresh, how can the mismatch count still be 0? Shouldn't the scrub either detect the mismatches and correct them (i.e. clear the "refresh needed" state), or raise the mismatch count to something non-zero?
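For reference, lvchange distinguishes two scrub modes, and a scrub's progress can be polled with the same lvs fields used above. A sketch, reusing the VG/LV names from this post (the sample output line at the end is illustrative):

```shell
# "check" is a read-only scrub: it only counts mismatches.
# "repair" additionally rewrites mismatched blocks (what was attempted above).
lvchange --syncaction check vg-data/lv-data

# Poll the kernel's view of the scrub; SyncAction should read "check"
# while the pass runs and fall back to "idle" when it finishes.
lvs -o lv_name,raid_sync_action,raid_mismatch_count vg-data/lv-data

# Illustrative output line while a check pass is running:
sample='  lv-data  check  0'
echo "$sample" | awk '{print $2}'
```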

dmesg says:

[150695.091180] md: requested-resync of RAID array mdX
[150695.092285] md: mdX: requested-resync done.

So it looks like the scrub didn't actually do anything.
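One way to cross-check what the kernel itself thinks, independent of lvs, is to query the dm-raid target directly. A sketch; the dm device name below is derived from this post's VG/LV names (dashes doubled by device-mapper naming), and the example status line is an assumption about what this array would report, not captured output:

```shell
# dmsetup status on a raid target prints, among other fields, one health
# character per leg: A = alive and in-sync, a = alive but not in-sync,
# D = dead/failed (per the kernel's dm-raid documentation).
dmsetup status vg--data-lv--data

# Illustrative status line for a 2-leg raid1 with leg 0 out of sync;
# field 6 holds the per-leg health characters:
status='0 5798205440 raid raid1 2 aA 5798205440/5798205440 idle 0 0 -'
echo "$status" | awk '{print $6}'
```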

Questions

  • How do I invoke an actual scrub?
  • Assuming the drive is not faulty, how do I refresh the array?
  • If the drive were faulty (i.e. the refresh immediately hit an error), how would I see that? I assume dmesg should show some I/O errors? (I don't see any…)
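On the last point: a faulty leg usually leaves traces both in the kernel log and in SMART. A hedged sketch of where one might look (smartctl comes from the smartmontools package; the grep pattern is an assumption about the typical block-layer message format, demonstrated on a sample line at the end):

```shell
# Kernel-side: block-layer I/O errors typically look like
#   blk_update_request: I/O error, dev sda, sector 123456
dmesg | grep -iE 'I/O error|blk_update'
journalctl -k | grep -i 'sda'

# Drive-side: the SMART verdict and the drive's own error log.
smartctl -H /dev/sda
smartctl -l error /dev/sda

# The pattern above, matched against a sample kernel log line:
sample='[150700.1] blk_update_request: I/O error, dev sda, sector 123456'
echo "$sample" | grep -oiE 'I/O error'
```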

System information

I'm running Armbian, based on Ubuntu 16.04.4 LTS. LVM version:

root@picard:~# lvm version
  LVM version:     2.02.133(2) (2015-10-30)
  Library version: 1.02.110 (2015-10-30)
  Driver version:  4.37.0
