ZFS (Solaris 11) - SAS controller died and was replaced - pool is SUSPENDED

(Please note: I will amend this post with zpool status output when I get back to the server, to make things clearer.)

I have a 6x2TB-disk raidz2 ZFS pool hosted on Solaris 11.

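The original M1015 (flashed to IT mode) that I installed when I built this server two years ago failed and is no longer detected on the PCI-E bus. Yesterday I replaced it with another M1015 (also flashed to IT mode), and Solaris found all the disks again.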

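However, the ZFS pool has been put into SUSPENDED state (probably because the previous M1015 died while the system was running and all the disks vanished at once). I see a resilver in progress on 2 of the disks (??), and every disk is listed as UNAVAIL.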

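I have no doubt there are some errors in the pool. But I cleared the faults (fmadm repaired, zpool clear) in the hope that the pool could be remounted in a degraded state. After a reboot, however, the pool first shows as DEGRADED (some disks UNAVAIL, some DEGRADED), then immediately transitions to SUSPENDED with all disks shown as UNAVAIL, and a resilver starts.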

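The resilver starts at roughly 100MBps and then quickly drops to 50kbps or less, which puts the estimated resilver time at several hundred hours. More importantly, iostat shows no transactions taking place on any disk in the pool. All of the activity seems to happen in a burst shortly after a reboot, because I see roughly +1GB of scan progress after each successive reboot.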
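For reference, here is a rough sketch of how a resilver estimate can be worked out from the scan line of the zpool status output in this post (14.7G scanned out of 8.71T at 127K/s). The helper names `to_bytes` and `eta_hours` are hypothetical, and the sketch assumes that zpool's K/G/T suffixes are powers of 1024 and that the rate stays constant, which a stalled scan will not do.

```python
# Rough resilver ETA from the figures on the `zpool status` scan line.
# Assumption: zpool's K/M/G/T suffixes are binary (powers of 1024).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def to_bytes(value: str) -> float:
    """Convert a zpool-style size such as '8.71T' to a byte count."""
    return float(value[:-1]) * UNITS[value[-1]]


def eta_hours(scanned: str, total: str, rate_per_s: str) -> float:
    """Hours remaining if the scan continued at a constant rate."""
    remaining = to_bytes(total) - to_bytes(scanned)
    return remaining / to_bytes(rate_per_s) / 3600


# Figures copied from the scan line in the status output of this post.
hours = eta_hours("14.7G", "8.71T", "127K")
print(f"{hours:,.0f} hours (~{hours / 24:,.0f} days)")
```

The rate that zpool status prints appears to be an average since the scan started, so this is only an order-of-magnitude figure.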

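While the pool is suspended I cannot offline any disk or export the pool. (I am also not sure why it goes into the suspended state at all when the 'fmadm faulty' entries are all reported as repaired.)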

What did I do wrong when replacing the SAS controller? How can I recover?


$ zpool status
  pool: rpool
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c8t0d0s1  ONLINE       0     0     0

errors: No known data errors

  pool: tank
 state: SUSPENDED
status: One or more devices is currently being resilvered.  The pool will
        continue to function in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Thu Apr 16 19:37:54 2015
    14.7G scanned out of 8.71T at 127K/s, (scan is slow, no estimated time)
    1.23G resilvered, 0.17% done
config:

        NAME                       STATE     READ WRITE CKSUM
        tank                       UNAVAIL      0     0     0
          raidz2-0                 UNAVAIL      0     0     0
            c0t5000C5005E169C55d0  UNAVAIL      0     0     0
            c0t5000C5005C08BE07d0  UNAVAIL      0     0     0
            c0t5000C5005C07780Ad0  UNAVAIL      0     0     0
            c0t5000C5005E21AE92d0  UNAVAIL      0     0     0  (resilvering)
            c0t5000C5005E0C5056d0  UNAVAIL      0     0     0
            c0t5000C5005C04F982d0  UNAVAIL      0     0     0  (resilvering)

device details:

        c0t5000C5005E169C55d0    UNAVAIL          experienced I/O failures
        status: FMA has faulted this device.
        action: Run 'fmadm faulty' for more information. Clear the errors
                using 'fmadm repaired'.

        c0t5000C5005C08BE07d0    UNAVAIL          experienced I/O failures
        status: FMA has degraded this device.
        action: Run 'fmadm faulty' for more information. Clear the errors
                using 'fmadm repaired'.
           see: http://support.oracle.com/msg/ZFS-8000-GH for recovery

        c0t5000C5005C07780Ad0    UNAVAIL          experienced I/O failures
        status: FMA has faulted this device.
        action: Run 'fmadm faulty' for more information. Clear the errors
                using 'fmadm repaired'.

        c0t5000C5005E21AE92d0    UNAVAIL          experienced I/O failures
        status: FMA has degraded this device.
        action: Run 'fmadm faulty' for more information. Clear the errors
                using 'fmadm repaired'.

        c0t5000C5005E0C5056d0    UNAVAIL          experienced I/O failures
        status: FMA has faulted this device.
        action: Run 'fmadm faulty' for more information. Clear the errors
                using 'fmadm repaired'.

        c0t5000C5005C04F982d0    UNAVAIL          experienced I/O failures
        status: FMA has degraded this device.
        action: Run 'fmadm faulty' for more information. Clear the errors
                using 'fmadm repaired'.
           see: http://support.oracle.com/msg/ZFS-8000-LR for recovery
$ iostat -en
  ---- errors ---
  s/w h/w trn tot device
    0   0   0   0 c8t0d0
    0  11   0  11 c7t0d0
    0   0   0   0 c0t5000C5005E0C5056d0
    0   0   0   0 c0t5000C5005E169C55d0
    0   0   0   0 c0t5000C5005C08BE07d0
    0   0   0   0 c0t5000C5005E21AE92d0
    0   0   0   0 c0t5000C5005C07780Ad0
    0   0   0   0 c0t5000C5005C04F982d0
