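I set up Btrfs across 3 disks, with both data and metadata in RAID1. Now I have a checksum error that cannot be repaired.
The computed checksum is identical on both copies and differs from the expected checksum by a single flipped bit. I therefore suspect that a bit flipped before the checksum was written to disk (the machine has no ECC RAM). I have a copy of the original file on another computer, made before the file was written to this filesystem, but as shown below I cannot read the data back because of the I/O error, so I cannot compare the two.
How should I proceed to fix this error?
Some details: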
$ uname -a
Linux stan 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
$ btrfs --version
btrfs-progs v4.15.1
$ sudo btrfs fi usage /media/btrfs/
Overall:
Device size: 7.28TiB
Device allocated: 3.91TiB
Device unallocated: 3.36TiB
Device missing: 0.00B
Used: 3.83TiB
Free (estimated): 1.72TiB (min: 1.72TiB)
Data ratio: 2.00
Metadata ratio: 2.00
Global reserve: 512.00MiB (used: 0.00B)
Data,RAID1: Size:1.95TiB, Used:1.91TiB
/dev/sdb 1.95TiB
/dev/sdc 998.00GiB
/dev/sdd 1001.00GiB
Metadata,RAID1: Size:4.00GiB, Used:2.63GiB
/dev/sdb 4.00GiB
/dev/sdc 3.00GiB
/dev/sdd 1.00GiB
System,RAID1: Size:64.00MiB, Used:304.00KiB
/dev/sdb 64.00MiB
/dev/sdc 64.00MiB
Unallocated:
/dev/sdb 1.68TiB
/dev/sdc 861.95GiB
/dev/sdd 861.02GiB
Scrub:
$ sudo btrfs scrub status /media/btrfs/
scrub status for xxxxxx
scrub started at Mon Aug 24 11:23:27 2020 and finished after 03:41:54
total bytes scrubbed: 3.81TiB with 2 errors
error details: csum=2
corrected errors: 0, uncorrectable errors: 2, unverified errors: 0
dmesg errors after the scrub:
$ dmesg
...
[196755.786038] BTRFS warning (device sdb): checksum error at logical 3099310968832 on dev /dev/sdb, physical 1300730499072, root 5223, inode 6521311, offset 7614464, length 4096, links 1 (path: users/joachim/Bilder/Canon/270CANON/IMG_7003.CR2)
[196755.786168] BTRFS warning (device sdb): checksum error at logical 3099310968832 on dev /dev/sdb, physical 1300730499072, root 5303, inode 6521311, offset 7614464, length 4096, links 1 (path: users/joachim/Bilder/Canon/270CANON/IMG_7003.CR2)
[196755.786245] BTRFS warning (device sdb): checksum error at logical 3099310968832 on dev /dev/sdb, physical 1300730499072, root 5302, inode 6521311, offset 7614464, length 4096, links 1 (path: users/joachim/Bilder/Canon/270CANON/IMG_7003.CR2)
...
[196755.788274] BTRFS error (device sdb): bdev /dev/sdb errs: wr 0, rd 0, flush 0, corrupt 2, gen 0
[196755.814044] BTRFS error (device sdb): unable to fixup (regular) error at logical 3099310968832 on dev /dev/sdb
Resolving the block with inspect-internal:
$ sudo btrfs inspect-internal logical-resolve -v 3099310968832 /media/btrfs/
ioctl ret=0, total_size=4096, bytes_left=3456, bytes_missing=0, cnt=78, missed=0
ioctl ret=0, bytes_left=4023, bytes_missing=0, cnt=1, missed=0
/media/btrfs//snapshots/stansafe.20200601T032501+0200/users/joachim/Bilder/Canon/270CANON/IMG_7003.CR2
ioctl ret=0, bytes_left=4023, bytes_missing=0, cnt=1, missed=0
/media/btrfs//snapshots/stansafe.20200910T032501+0200/users/joachim/Bilder/Canon/270CANON/IMG_7003.CR2
ioctl ret=0, bytes_left=4023, bytes_missing=0, cnt=1, missed=0
/media/btrfs//snapshots/stansafe.20200909T032502+0200/users/joachim/Bilder/Canon/270CANON/IMG_7003.CR2
...
Trying to verify the file:
$ sha256sum /media/btrfs//stansafe/users/joachim/Bilder/Canon/270CANON/IMG_7003.CR2
sha256sum: /media/btrfs//stansafe/users/joachim/Bilder/Canon/270CANON/IMG_7003.CR2: Input/output error
$ dmesg
...
[1642985.509498] BTRFS warning (device sdb): csum failed root 259 ino 6521311 off 7614464 csum 0x151ad4ce expected csum 0x150ad4ce mirror 1
[1642985.509942] BTRFS warning (device sdb): csum failed root 259 ino 6521311 off 7614464 csum 0x151ad4ce expected csum 0x150ad4ce mirror 2
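The single-bit claim can be checked directly from the two values in the dmesg lines above. The following is a minimal sketch in POSIX shell arithmetic; the variable names are illustrative only. It XORs the computed and expected checksums and counts the bits that differ.

```shell
# Checksums copied from the dmesg lines above.
computed=0x151ad4ce   # csum of the data as read (same on mirror 1 and mirror 2)
expected=0x150ad4ce   # csum the filesystem expected for this block

# XOR leaves a 1 bit wherever the two values disagree.
diff=$(( $computed ^ $expected ))

# Count set bits by repeatedly clearing the lowest set bit.
bits=0
d=$diff
while [ "$d" -ne 0 ]; do
    d=$(( d & (d - 1) ))
    bits=$(( bits + 1 ))
done

printf 'xor = 0x%08x, differing bits = %d\n' "$diff" "$bits"
# prints: xor = 0x00100000, differing bits = 1
```

The XOR result has exactly one bit set, which matches the observation that the two checksums differ by a single flipped bit.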
Answer 1
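Linux kernel 5.11 introduced a mount option that ignores data checksums, which makes it possible to copy out data whose checksum is wrong. The rescue options are only accepted on a read-only mount, hence the ro:
mount -o ro,rescue=ignoredatacsums /dev/sdX /mnt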
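If the file is only minimally corrupted and you have some kind of parity data for it, for example par2, this lets you recover the file completely.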
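Once the file has been copied out of a rescue mount, the copy kept on the other computer (mentioned in the question) makes a direct comparison possible. The sketch below is one way to do that and is not part of the original answer: `differing_blocks` is a hypothetical helper name, and both paths in the usage comment are placeholders. It lists the starting offsets of the 4096-byte blocks in which two files differ.

```shell
# Hypothetical helper: print the byte offset of every 4096-byte block
# in which the two given files differ (one offset per line, ascending).
differing_blocks() {
    # cmp -l prints one line per differing byte: "<1-based offset> <octal> <octal>".
    # Round each offset down to the start of its 4096-byte block and de-duplicate.
    cmp -l "$1" "$2" |
        awk '{ printf "%d\n", int(($1 - 1) / 4096) * 4096 }' |
        sort -nu
}

# Usage (placeholder paths, not run here):
#   differing_blocks /mnt/path/to/IMG_7003.CR2 /path/to/known-good/IMG_7003.CR2
```

If the data itself is corrupt, offset 7614464 from the dmesg output would be expected in the list. If nothing is printed, the data matches the known-good copy and only the stored checksum is wrong, which would fit the bit-flip theory from the question.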
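Answer 2
If you want to store 3 copies of every file across 3 disks or 3 partitions, use raid1c3.
That way, even if you lose 2 disks, your data still survives on the remaining one.
raid1 stores only 2 copies, no matter how many disks you use.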
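For an existing filesystem, a profile change is normally done with a balance and its convert filters. The following is a hedged sketch, not a tested recipe for the setup in the question: `convert_to_raid1c3` is a hypothetical wrapper name, the mount point is passed in by the caller, and raid1c3 needs kernel 5.5 or newer with a correspondingly recent btrfs-progs, so the 4.15 kernel shown in the question would have to be upgraded first.

```shell
# Hypothetical wrapper: convert data (-d) and metadata (-m) of a mounted
# Btrfs filesystem to the raid1c3 profile via a balance.
# Assumes kernel >= 5.5, a recent btrfs-progs, and enough free space on
# at least three devices to hold a third copy of everything.
convert_to_raid1c3() {
    mountpoint=$1
    sudo btrfs balance start -dconvert=raid1c3 -mconvert=raid1c3 "$mountpoint"
}

# Usage (not run here):
#   convert_to_raid1c3 /media/btrfs/
```

The balance rewrites every chunk, so on a filesystem of this size it can take many hours.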