I'm running a BTRFS RAID5 array with three disks on an Ubuntu 18.04.2 server.
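One of the drives started throwing errors and the filesystem kept getting remounted read-only. I tried btrfs scrub, but without luck: the filesystem got remounted RO again. I also tried zero-log.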
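So I wanted to replace the failing drive. I read that it is possible to btrfs add a new drive and then delete the other one from the array, instead of using btrfs replace, which apparently has some issues.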
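For reference, the add-then-delete sequence I mean looks roughly like the sketch below. The device names and the mount point are placeholders, not my real ones, and the function only prints the commands rather than running them.

```shell
#!/bin/sh
# Print (do not run) the add-then-delete sequence for swapping a drive.
# All three arguments are hypothetical placeholders.
plan_swap() {
    new_dev=$1
    old_dev=$2
    mnt=$3
    # Step 1: add the new drive to the mounted filesystem.
    echo "btrfs device add $new_dev $mnt"
    # Step 2: remove the old drive; btrfs migrates its data off first.
    echo "btrfs device delete $old_dev $mnt"
}
```

Calling `plan_swap /dev/sdNEW /dev/sdOLD /mnt/array` prints the two commands in the order they would be run.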
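So I tried, but got errors. Every time I ran btrfs device delete <device>, the filesystem got remounted RO. The same happened when I tried to remove the drive with btrfs device delete missing, without the drive even being connected. No change. RO.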
I did mount the filesystem with -o usebackuproot and -o degraded.
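I ran chunk-recover, but the system shut down. I tried it several times, because by now I could no longer mount the array at all. I did manage to complete one chunk-recover run without the failing drive attached, but the recovery itself failed, and now I can't mount the RAID at all.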
On mount, the output states can't read superblock on <device>, yet btrfs rescue super-recover outputs All supers are valid, no need to recover.
On a mount attempt, the dmesg output states:
[54605.499604] BTRFS info (device dm-7): bdev /dev/mapper/luks-* errs: wr 29664, rd 30074, flush 32, corrupt 0, gen 31
[54605.525654] BTRFS error (device dm-7): parent transid verify failed on 38977536 wanted 82072 found 82114
[54605.526827] BTRFS error (device dm-7): parent transid verify failed on 38977536 wanted 82072 found 82114
[54605.526847] BTRFS warning (device dm-7): failed to read fs tree: -5
[54605.553948] BTRFS error (device dm-7): open_ctree failed
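To make the generation mismatch in these messages easier to read, here is a minimal sketch that extracts the block number and the wanted/found transids from such lines and prints the difference. The function name transid_gaps is made up for illustration; it only parses text and does not touch the filesystem.

```shell
#!/bin/sh
# Read log lines on stdin and, for every "parent transid verify failed"
# line, print: <block> <wanted> <found> <found - wanted>.
# transid_gaps is a hypothetical helper name, not a btrfs tool.
transid_gaps() {
    awk '/parent transid verify failed/ {
        block = wanted = found = ""
        # Each value follows its keyword as the next whitespace-separated field.
        for (i = 1; i < NF; i++) {
            if ($i == "on")     block  = $(i + 1)
            if ($i == "wanted") wanted = $(i + 1)
            if ($i == "found")  found  = $(i + 1)
        }
        if (wanted != "" && found != "")
            print block, wanted, found, found - wanted
    }'
}
```

It can be fed from `dmesg | transid_gaps` or from saved btrfs check output. For the dmesg lines above it reports a gap of 42 generations on block 38977536.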
Restore works with the failing drive connected again, using the following command:
sudo btrfs restore -u 2 -vvv -ixm /dev/mapper/luks-* /mnt/pnt
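For anyone who wants to check what that command would pull out before writing anything, a sketch of the same invocation with -D (dry run) added. The function name and both arguments are placeholders, and it only prints the command.

```shell
#!/bin/sh
# Print a dry-run variant of the restore command above.
# restore_dry_run_cmd and its arguments are hypothetical placeholders.
restore_dry_run_cmd() {
    dev=$1
    target=$2
    # -D    dry run: list files without restoring them
    # -u 2  read from superblock mirror 2
    # -v    verbose
    # -i    ignore errors and keep going
    # -x    restore extended attributes
    # -m    restore owner, mode and times
    echo "btrfs restore -D -u 2 -v -ixm $dev $target"
}
```

Dropping -D from the printed command gives an actual restore into the target directory.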
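I have a snapshot of one specific directory, the most important one, but it lives on the failed RAID array. I have also backed everything up to another drive (I think).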
I'm close to running btrfs check --repair. Here are some trial runs I did:
$ sudo btrfs check /dev/mapper/luks-*
parent transid verify failed on 38813696 wanted 82115 found 82116
parent transid verify failed on 38813696 wanted 82115 found 82116
checksum verify failed on 38813696 found 4E67B99A wanted AA84042C
parent transid verify failed on 38813696 wanted 82115 found 82116
Ignoring transid failure
Checking filesystem on /dev/mapper/luks-*
UUID: *
Error: could not find extent items for root 258
ERROR: failed to repair root items: No such file or directory
$ sudo btrfs check --check-data-csum /dev/mapper/luks-*
parent transid verify failed on 38813696 wanted 82115 found 82116
parent transid verify failed on 38813696 wanted 82115 found 82116
checksum verify failed on 38813696 found 4E67B99A wanted AA84042C
parent transid verify failed on 38813696 wanted 82115 found 82116
Ignoring transid failure
Checking filesystem on /dev/mapper/luks-*
UUID: *
Error: could not find extent items for root 258
ERROR: failed to repair root items: No such file or directory
$ sudo btrfs check --init-extent-tree /dev/mapper/luks-*
Checking filesystem on /dev/mapper/luks-*
UUID: *
Creating a new extent tree
Failed to find [38879232, 168, 16384]
btrfs unable to find ref byte nr 38895616 parent 0 root 1 owner 1 offset 0
Failed to find [38879232, 168, 16384]
btrfs unable to find ref byte nr 38912000 parent 0 root 1 owner 0 offset 1
parent transid verify failed on 38944768 wanted 82115 found 82116
Ignoring transid failure
checking extents
parent transid verify failed on 38977536 wanted 82072 found 82114
Ignoring transid failure
leaf parent key incorrect 38977536
bad block 38977536
ERROR: errors found in extent allocation tree or chunk allocation
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
root 5 missing its root dir, recreating
Failed to find [39043072, 168, 16384]
btrfs unable to find ref byte nr 1792131072 parent 0 root 4 owner 1 offset 0
Failed to find [39043072, 168, 16384]
btrfs unable to find ref byte nr 39878656 parent 0 root 4 owner 0 offset 1
Failed to find [22020096, 168, 16384]
btrfs unable to find ref byte nr 22036480 parent 0 root 3 owner 0 offset 1
leaf free space ret -21995796, leaf data size 16283, used 22012079 nritems 50
leaf free space ret -21995796, leaf data size 16283, used 22012079 nritems 50
leaf free space incorrect 22020096 -21995796
extent-tree.c:1915: do_chunk_alloc: BUG_ON `ret` triggered, value -1
btrfs(+0x1f1e5)[0x563d259ea1e5]
btrfs(+0x1f255)[0x563d259ea255]
btrfs(+0x1f268)[0x563d259ea268]
btrfs(+0x22cea)[0x563d259edcea]
btrfs(btrfs_reserve_extent+0xf9)[0x563d259ede44]
btrfs(btrfs_alloc_free_block+0x5e)[0x563d259ee5df]
btrfs(__btrfs_cow_block+0xfe)[0x563d259e2c3c]
btrfs(btrfs_cow_block+0xc5)[0x563d259e31e1]
btrfs(btrfs_search_slot+0xfa)[0x563d259e5095]
btrfs(btrfs_insert_empty_items+0x82)[0x563d259e62cf]
btrfs(btrfs_insert_item+0x64)[0x563d259e6600]
btrfs(btrfs_insert_inode+0x37)[0x563d259f3c89]
btrfs(btrfs_make_root_dir+0xb4)[0x563d259f9de3]
btrfs(+0x15c39)[0x563d259e0c39]
btrfs(cmd_check+0x19fb)[0x563d25a1efe2]
btrfs(main+0x143)[0x563d259e1c87]
Aborted
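Any ideas what can be done? I'm going to try another chunk-recover now, with the failing drive connected, and see whether the server stays up this time. I also have the chunk-recover logs if needed.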