Can I replace a disk in a btrfs raid1 without rebooting?

I'm trying to understand how btrfs raid1 mode behaves when you pull a disk out and put it back (an odroid hc4 is an example device).

This is the state before my test:

Label: none  uuid: f85fb0ab-e643-4266-9f61-0f6e4980b871
    Total devices 2 FS bytes used 2.35GiB
    devid    1 size 111.79GiB used 9.03GiB path /dev/sdb
    devid    2 size 1.82TiB used 9.03GiB path /dev/sda

Then I detached the /dev/sda disk, and dmesg showed:

[158763.932162] ata1: SATA link down (SStatus 0 SControl 300)
[158769.474056] ata1: SATA link down (SStatus 0 SControl 300)
[158774.849753] ata1: SATA link down (SStatus 0 SControl 300)
[158774.849775] ata1.00: disabled
[158774.849814] ata1.00: detaching (SCSI 0:0:0:0)
[158774.851067] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[158774.851161] sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=0x04 driverbyte=0x00
[158774.851168] sd 0:0:0:0: [sda] Stopping disk
[158774.851190] sd 0:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=0x04 driverbyte=0x00
[158784.777808] BTRFS error (device sdb): bdev /dev/sda errs: wr 9, rd 0, flush 1, corrupt 0, gen 0
[158784.778204] BTRFS error (device sdb): bdev /dev/sda errs: wr 10, rd 0, flush 1, corrupt 0, gen 0
[158784.778716] BTRFS error (device sdb): bdev /dev/sda errs: wr 11, rd 0, flush 1, corrupt 0, gen 0
[158784.779232] BTRFS error (device sdb): bdev /dev/sda errs: wr 12, rd 0, flush 1, corrupt 0, gen 0
[158784.779505] BTRFS error (device sdb): bdev /dev/sda errs: wr 13, rd 0, flush 1, corrupt 0, gen 0
[158784.782423] BTRFS error (device sdb): bdev /dev/sda errs: wr 13, rd 0, flush 2, corrupt 0, gen 0
[158784.782651] BTRFS warning (device sdb): lost page write due to IO error on /dev/sda (-5)
[158784.782660] BTRFS error (device sdb): bdev /dev/sda errs: wr 14, rd 0, flush 2, corrupt 0, gen 0
[158784.782762] BTRFS warning (device sdb): lost page write due to IO error on /dev/sda (-5)
[158784.782767] BTRFS error (device sdb): bdev /dev/sda errs: wr 15, rd 0, flush 2, corrupt 0, gen 0
[158784.782864] BTRFS warning (device sdb): lost page write due to IO error on /dev/sda (-5)
[158784.782869] BTRFS error (device sdb): bdev /dev/sda errs: wr 16, rd 0, flush 2, corrupt 0, gen 0
[158784.784112] BTRFS error (device sdb): error writing primary super block to device 2
[158788.810744] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[158788.814057] ata1.00: ATA-10: CT2000BX500SSD1, M6CR030, max UDMA/133
[158788.814064] ata1.00: 3907029168 sectors, multi 1: LBA48 NCQ (depth 32), AA
[158788.824662] ata1.00: configured for UDMA/133
[158788.824934] scsi 0:0:0:0: Direct-Access     ATA      CT2000BX500SSD1  030  PQ: 0 ANSI: 5
[158788.825550] sd 0:0:0:0: Attached scsi generic sg0 type 0
[158788.825726] sd 0:0:0:0: [sdc] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[158788.825760] sd 0:0:0:0: [sdc] Write Protect is off
[158788.825766] sd 0:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[158788.825812] sd 0:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[158788.863902] sd 0:0:0:0: [sdc] Attached SCSI disk
[158788.884759] BTRFS warning: duplicate device /dev/sdc devid 2 generation 48028 scanned by systemd-udevd (5758)
[158789.894806] BTRFS error (device sdb): bdev /dev/sda errs: wr 17, rd 0, flush 2, corrupt 0, gen 0
[158820.615643] BTRFS error (device sdb): bdev /dev/sda errs: wr 18, rd 0, flush 2, corrupt 0, gen 0
[158820.616144] BTRFS error (device sdb): bdev /dev/sda errs: wr 19, rd 0, flush 2, corrupt 0, gen 0
[158820.616585] BTRFS error (device sdb): bdev /dev/sda errs: wr 20, rd 0, flush 2, corrupt 0, gen 0
[158820.617080] BTRFS error (device sdb): bdev /dev/sda errs: wr 21, rd 0, flush 2, corrupt 0, gen 0
[158820.617353] BTRFS error (device sdb): bdev /dev/sda errs: wr 22, rd 0, flush 2, corrupt 0, gen 0
[158820.617558] BTRFS error (device sdb): bdev /dev/sda errs: wr 23, rd 0, flush 2, corrupt 0, gen 0
[158820.620575] BTRFS error (device sdb): bdev /dev/sda errs: wr 23, rd 0, flush 3, corrupt 0, gen 0
[158820.620791] BTRFS warning (device sdb): lost page write due to IO error on /dev/sda (-5)
[158820.620799] BTRFS error (device sdb): bdev /dev/sda errs: wr 24, rd 0, flush 3, corrupt 0, gen 0
[158820.620896] BTRFS warning (device sdb): lost page write due to IO error on /dev/sda (-5)
[158820.620901] BTRFS error (device sdb): bdev /dev/sda errs: wr 25, rd 0, flush 3, corrupt 0, gen 0
[158820.621128] BTRFS warning (device sdb): lost page write due to IO error on /dev/sda (-5)
[158820.621237] BTRFS error (device sdb): bdev /dev/sda errs: wr 26, rd 0, flush 3, corrupt 0, gen 0
[158820.622271] BTRFS error (device sdb): error writing primary super block to device 2
[158830.852513] BTRFS error (device sdb): bdev /dev/sda errs: wr 27, rd 0, flush 3, corrupt 0, gen 0

Then I reattached it, and this time it showed up as /dev/sdc.

If I try to replace it, I get various errors:

# btrfs replace start 2 /dev/sdc /mnt
/dev/sdc appears to contain an existing filesystem (btrfs).
ERROR: use the -f option to force overwrite of /dev/sdc
# btrfs replace start 2 /dev/sdc /mnt -f
ERROR: /dev/sdc is mounted

But if I reboot the device, everything goes back to normal.

Answer 1

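I can't answer the "why", but I think it has to do with btrfs still recognizing the disk as the same device you are trying to replace, and refusing to do so.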

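The only way I could get around this was to wipe the filesystem signature from the drive before the replace. I also couldn't get the drive to re-enumerate correctly on hot swap, so I had to reboot to get it enumerated where I wanted it.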

$ sudo btrfs replace start -f 1 /dev/sdc /mnt/tmpbtrfs
ERROR: /dev/sdc is mounted
$ sudo wipefs -a /dev/sdc
/dev/sdc: 8 bytes were erased at offset 0x00010040 (btrfs): 5f 42 48 52 66 53 5f 4d
$ sudo btrfs replace start -f 1 /dev/sdc /mnt/tmpbtrfs
Started on 12.Aug 10:31:24, finished on 12.Aug 10:47:27, 0 write errs, 0 uncorr. read errs
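
Note that the transcript above replaces devid 1, while in the question the detached disk was devid 2, so the number depends on your own array. A minimal sketch for looking it up, assuming the `btrfs filesystem show` output format quoted in the question (`devid_for_path` is a hypothetical helper name, not part of btrfs-progs):

```shell
# Print the devid that `btrfs filesystem show` output (read from stdin)
# lists for the given device path. This reports what btrfs has recorded,
# which may be a stale path after a hot swap.
devid_for_path() {
    awk -v p="$1" '$1 == "devid" && $NF == p { print $2 }'
}

# Example use (the mount point /mnt is an assumption):
#   sudo btrfs filesystem show /mnt | devid_for_path /dev/sda
```

With the listing from the question, looking up /dev/sda prints 2, which is the devid to pass to `btrfs replace start`.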

Since I had a hard time finding information on how to replace a healthy disk in a btrfs raid1, I figured I'd write it down.

These are the steps I took to do an in-place upgrade of a working drive in a btrfs raid1 array.

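  1. Balance the array so all chunks have the correct profile
  2. Add ,noauto to the options of the btrfs line in /etc/fstab
  3. Shut down
  4. Remove the old drive and insert the new one
  5. Mount the array degraded at a temporary location
  6. Triple-check where your new drive was enumerated
  7. Run btrfs replace with the new drive into the broken "slot" (the devid of the missing btrfs device)
  8. Rebalance the array to raid1 (metadata and data)
  9. Reboot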
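
Steps 5 to 8 could be sketched roughly as the script below. This is a dry-run sketch under stated assumptions, not the exact commands I ran: the surviving device /dev/sdb, the new disk /dev/sdX, the missing devid 2 and the mount point /mnt/tmpbtrfs are all placeholders, and `run` and `replace_steps` are hypothetical helper names. By default it only prints the commands; set DRY_RUN=0 to actually execute them as root.

```shell
#!/bin/sh
# Placeholders (assumptions): adjust every value to your own system.
SURVIVOR=${SURVIVOR:-/dev/sdb}      # disk that stayed in the array
NEW_DISK=${NEW_DISK:-/dev/sdX}      # where the new drive enumerated
MISSING_DEVID=${MISSING_DEVID:-2}   # devid of the removed drive
MNT=${MNT:-/mnt/tmpbtrfs}           # temporary mount point
DRY_RUN=${DRY_RUN:-1}               # 1 = print commands only

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

replace_steps() {
    # Step 5: mount the array degraded at a temporary location
    run mount -o degraded "$SURVIVOR" "$MNT"
    # Step 6: check where the new drive enumerated and which devid is gone
    run lsblk -o NAME,SIZE,MODEL,SERIAL
    run btrfs filesystem show "$MNT"
    # Step 7: replace the missing devid with the new drive
    run btrfs replace start -f "$MISSING_DEVID" "$NEW_DISK" "$MNT"
    run btrfs replace status "$MNT"
    # Step 8: convert any chunks that are not raid1 back to raid1
    run btrfs balance start -dconvert=raid1,soft -mconvert=raid1,soft "$MNT"
}

replace_steps
```

The `soft` filter makes the balance skip chunks that already have the raid1 profile, so step 8 only touches whatever was written while the array was degraded.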
