(Note: I will edit this post with the zpool status output once I am back at the server, to make things clearer.)
I have a raidz2 ZFS pool of 6x 2TB disks, hosted on Solaris 11.
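The original M1015 (flashed to IT mode) that I installed when I built this server two years ago has failed and is no longer recognized on the PCI-E bus. Yesterday I replaced it with another M1015 (also flashed to IT mode), and Solaris found all the disks again.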
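However, the ZFS pool was put into SUSPENDED mode (probably because the previous M1015 died while running and all the disks disappeared at once). I see a resilver in progress (??) on 2 disks, and all disks are listed as UNAVAIL.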
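I have no doubt there are some errors in the pool. But I have cleared the faults (fmadm repaired, zpool clear), hoping the pool could be remounted in a degraded state. After a reboot, however, the pool first shows as DEGRADED (some disks UNAVAIL, some DEGRADED), then immediately transitions to SUSPENDED with all disks UNAVAIL, and starts resilvering.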
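The resilver starts at roughly 100MBps and then quickly drops to 50kbps or less. That works out to an estimated resilver time of several hundred hours. More importantly, iostat shows no transactions occurring on any disk in the pool. All the activity seems to happen in a burst shortly after a reboot, since I see about +1GB of scan progress after each successive reboot.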
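As a rough cross-check of that estimate, here is a minimal sketch of the arithmetic using the figures from the zpool status output further down (8.71T total, 14.7G scanned, 127K/s). It assumes the T/G/K suffixes are binary units and that the rate stays constant, which it clearly does not here. `resilver_eta_hours` is a hypothetical helper name, not a ZFS tool.

```shell
#!/bin/sh
# Rough resilver ETA in hours: (total - scanned) / rate.
# Assumption: zpool's T, G and K suffixes are powers of 1024.
# resilver_eta_hours is a made-up helper for illustration only.
resilver_eta_hours() {
    # $1 = total size in TiB, $2 = already scanned in GiB, $3 = rate in KiB/s
    awk -v total_t="$1" -v scanned_g="$2" -v rate_k="$3" 'BEGIN {
        remaining_bytes = total_t * 1024^4 - scanned_g * 1024^3
        seconds = remaining_bytes / (rate_k * 1024)
        printf "%.0f\n", seconds / 3600
    }'
}

# Figures taken from the "scan:" lines of the tank pool.
resilver_eta_hours 8.71 14.7 127
```

Because the scan only advances in a short burst after each reboot, any number derived from the instantaneous rate is a loose bound at best.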
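While the pool is suspended, I cannot offline any disk or export the pool. (Also, I am not sure why it goes into the suspended state when the 'fmadm faulty' entries are all reported as repaired.)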
Where did I go wrong when replacing the SAS controller? How do I recover?
$ zpool status
pool: rpool
state: ONLINE
scan: none requested
config:
NAME        STATE    READ WRITE CKSUM
rpool       ONLINE      0     0     0
  c8t0d0s1  ONLINE      0     0     0
errors: No known data errors
pool: tank
state: SUSPENDED
status: One or more devices is currently being resilvered. The pool will
continue to function in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Thu Apr 16 19:37:54 2015
14.7G scanned out of 8.71T at 127K/s, (scan is slow, no estimated time)
1.23G resilvered, 0.17% done
config:
NAME                       STATE    READ WRITE CKSUM
tank                       UNAVAIL     0     0     0
  raidz2-0                 UNAVAIL     0     0     0
    c0t5000C5005E169C55d0  UNAVAIL     0     0     0
    c0t5000C5005C08BE07d0  UNAVAIL     0     0     0
    c0t5000C5005C07780Ad0  UNAVAIL     0     0     0
    c0t5000C5005E21AE92d0  UNAVAIL     0     0     0  (resilvering)
    c0t5000C5005E0C5056d0  UNAVAIL     0     0     0
    c0t5000C5005C04F982d0  UNAVAIL     0     0     0  (resilvering)
device details:
c0t5000C5005E169C55d0 UNAVAIL experienced I/O failures
status: FMA has faulted this device.
action: Run 'fmadm faulty' for more information. Clear the errors
using 'fmadm repaired'.
c0t5000C5005C08BE07d0 UNAVAIL experienced I/O failures
status: FMA has degraded this device.
action: Run 'fmadm faulty' for more information. Clear the errors
using 'fmadm repaired'.
see: http://support.oracle.com/msg/ZFS-8000-GH for recovery
c0t5000C5005C07780Ad0 UNAVAIL experienced I/O failures
status: FMA has faulted this device.
action: Run 'fmadm faulty' for more information. Clear the errors
using 'fmadm repaired'.
c0t5000C5005E21AE92d0 UNAVAIL experienced I/O failures
status: FMA has degraded this device.
action: Run 'fmadm faulty' for more information. Clear the errors
using 'fmadm repaired'.
c0t5000C5005E0C5056d0 UNAVAIL experienced I/O failures
status: FMA has faulted this device.
action: Run 'fmadm faulty' for more information. Clear the errors
using 'fmadm repaired'.
c0t5000C5005C04F982d0 UNAVAIL experienced I/O failures
status: FMA has degraded this device.
action: Run 'fmadm faulty' for more information. Clear the errors
using 'fmadm repaired'.
see: http://support.oracle.com/msg/ZFS-8000-LR for recovery
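The device details above are easier to compare side by side: three disks are reported as faulted and three as degraded. A minimal sketch for extracting that pairing from the text, assuming the line layout shown above (`fma_states` is a hypothetical helper name):

```shell
#!/bin/sh
# Pair each device in the "device details" section of `zpool status`
# with the FMA state reported on its following "status:" line.
# fma_states is a made-up helper; it only parses text on stdin.
fma_states() {
    awk '
        $2 == "UNAVAIL" && /experienced I\/O failures/ { dev = $1 }
        /FMA has (faulted|degraded) this device/ && dev != "" {
            print dev, $4   # $4 is "faulted" or "degraded"
            dev = ""
        }'
}

# Usage on a live system (not run here):
#   zpool status tank | fma_states
```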
$ iostat -en
---- errors ---
s/w h/w trn tot device
  0   0   0   0 c8t0d0
  0  11   0  11 c7t0d0
  0   0   0   0 c0t5000C5005E0C5056d0
  0   0   0   0 c0t5000C5005E169C55d0
  0   0   0   0 c0t5000C5005C08BE07d0
  0   0   0   0 c0t5000C5005E21AE92d0
  0   0   0   0 c0t5000C5005C07780Ad0
  0   0   0   0 c0t5000C5005C04F982d0
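To make the non-zero rows stand out, here is a small sketch that filters `iostat -en` output down to devices with a non-zero total error count. It assumes the column order shown above (s/w, h/w, trn, tot, device); `devices_with_errors` is a hypothetical helper name.

```shell
#!/bin/sh
# Print only devices whose "tot" error column is greater than zero.
# Header lines are skipped because their 4th field is not a number.
# devices_with_errors is a made-up helper; it only parses text on stdin.
devices_with_errors() {
    awk '$4 ~ /^[0-9]+$/ && $4 + 0 > 0 { print $5, "tot=" $4, "h/w=" $2 }'
}

# Usage on a live system (not run here):
#   iostat -en | devices_with_errors
```

On the output above, the only device with errors is c7t0d0, which is not a member of either pool listed; none of the six tank disks show any errors at this level.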