I wanted to replace a disk in my zpool by issuing the following command:
zpool replace -o ashift=12 pool /dev/mapper/transport /dev/mapper/data2
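ZFS went to work and resilvered the pool. During the process there were some read errors on the old disk, and once it had finished, zpool status -v looked like this: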
pool: pool
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://zfsonlinux.org/msg/ZFS-8000-8A
scan: resilvered 6,30T in 147h38m with 6929 errors on Sat Feb 11 13:31:05 2017
config:
NAME             STATE     READ WRITE CKSUM
pool             ONLINE       0     0 16,0K
  raidz1-0       ONLINE       0     0 32,0K
    data1        ONLINE       0     0     0
    replacing-1  ONLINE       0     0     0
      transport  ONLINE   14,5K     0     0
      data2      ONLINE       0     0     0
    data3        ONLINE       0     0     0
logs
  data-slog      ONLINE       0     0     0
cache
  data-cache     ONLINE       0     0     0
errors: Permanent errors have been detected in the following files:
<list of 3 files>
I expected the old disk to be detached from the pool at this point, but it was not. I tried to detach it manually:
# zpool detach pool /dev/mapper/transport
cannot detach /dev/mapper/transport: no valid replicas
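But when I export the pool, remove the old drive, and import the pool again, it seems to work fine: it starts resilvering again, but the pool is "degraded" rather than "failed":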
pool: pool
state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Sat Feb 11 17:28:50 2017
42,7G scanned out of 9,94T at 104M/s, 27h43m to go
1,68G resilvered, 0,42% done
config:
NAME                        STATE     READ WRITE CKSUM
pool                        DEGRADED     0     0     9
  raidz1-0                  DEGRADED     0     0    18
    data1                   ONLINE       0     0     0
    replacing-1             DEGRADED     0     0     0
      15119075650261564517  UNAVAIL      0     0     0  was /dev/mapper/transport
      data2                 ONLINE       0     0     0  (resilvering)
    data3                   ONLINE       0     0     0  (resilvering)
logs
  data-slog                 ONLINE       0     0     0
cache
  data-cache                ONLINE       0     0     0
errors: Permanent errors have been detected in the following files:
<list of 3 files>
Even though the old drive is apparently not needed for the pool to be fully functional, I still cannot get rid of it:
# zpool offline pool 15119075650261564517
cannot offline 15119075650261564517: no valid replicas
What is going on?
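Update: Apparently, ZoL had not given up on the failing device yet. Replacing the 3 files that had permanent errors (one of them was a zvol, which meant I had to create another one, copy the contents over with dd conv=noerror, and destroy the old one) and then letting the resilver finish finally removed the old drive.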
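For anyone in the same spot, the zvol part of that update might look roughly like the sketch below. This is not the exact set of commands that was run: the dataset names, the volume size, and the block size are placeholders, and adding `sync` to `conv=noerror` (so that unreadable blocks are zero-padded and offsets stay aligned) is an assumption on top of what the update says. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: copy a damaged zvol to a fresh one, then destroy the old one.
# All names and sizes are hypothetical placeholders, not from the post.
DRY_RUN="${DRY_RUN:-1}"              # 1 = only print the commands
OLD_VOL="${OLD_VOL:-pool/vol-old}"   # zvol listed under "Permanent errors"
NEW_VOL="${NEW_VOL:-pool/vol-new}"   # replacement zvol
VOL_SIZE="${VOL_SIZE:-100G}"         # assumed; should match the old volsize

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

replace_zvol() {
    # Create the new zvol with the same size as the old one.
    run zfs create -V "$VOL_SIZE" "$NEW_VOL"
    # Copy the contents; noerror keeps going past read errors, and
    # sync pads each failed block with zeros so offsets are preserved.
    # A smaller bs loses less data around each unreadable region.
    run dd if="/dev/zvol/$OLD_VOL" of="/dev/zvol/$NEW_VOL" bs=128k conv=noerror,sync
    # Destroying the old zvol removes it from the permanent error list.
    run zfs destroy "$OLD_VOL"
}

replace_zvol
```

Anything that refers to the old zvol by name has to be pointed at the new one afterwards.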
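I am still interested in what ZoL was thinking. I mean, everything that did not cause read or checksum errors had been copied to the new device, and the sectors that did cause errors had already been marked as permanent failures. So why hold on to an old device that ZoL obviously did not intend to get any more information from?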
Answer 1
Similar situation here. A partial solution, for reference only:
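- Power off
- Physically disconnect all other disks that are part of ZFS, leaving only the target disk connected
- Power on
- Create a temporary zpool on the target disk
- Power off
- Reconnect all disks
- Power on
- Import the old zpool; the target disk will no longer be pulled in, because it now belongs to another zpool
- Or import the temporary zpool with the target disk first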
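The two ZFS steps in the list above could be sketched like this. The device path `/dev/sdX` and the pool name `temp` are placeholders I made up, and `-f` is an assumption needed to overwrite the existing label. Note that `zpool create` destroys whatever is on the target disk. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch: put the target disk into a throwaway pool so the old pool
# no longer claims it. Device and temp pool names are hypothetical.
DRY_RUN="${DRY_RUN:-1}"                  # 1 = only print the commands
TARGET_DISK="${TARGET_DISK:-/dev/sdX}"   # placeholder for the target disk
TEMP_POOL="${TEMP_POOL:-temp}"           # placeholder temporary pool name
OLD_POOL="${OLD_POOL:-pool}"             # the original pool

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run while ONLY the target disk is connected.
# WARNING: this overwrites the ZFS label and data on the target disk.
create_temp_pool() {
    run zpool create -f "$TEMP_POOL" "$TARGET_DISK"
}

# Run after all disks have been reconnected.
import_old_pool() {
    run zpool import "$OLD_POOL"
    run zpool status -v "$OLD_POOL"
}

create_temp_pool
import_old_pool
```

In a real run the two functions are called in separate boots, with the power-off and recabling steps in between.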