After successfully re-assembling the RAID, how do I mount it and recover the data?

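We have a Synology NAS with 2 disks, one of which crashed, although SMART passes and the data is intact. Note that the disks were not in a RAID with each other, but years ago we chose SHR, and Synology apparently treats a single disk as a RAID.

We went ahead and pulled the crashed disk and replaced it, and now we want to recover the data. We put the disk in a Debian 10 box.

Following many guides, we managed to re-create a one-disk RAID, but we still cannot mount it because Debian does not recognise the filesystem. When we first attached the disk to a Windows PC, we saw that Synology had split it into three partitions: the first is ext4, holding metadata I assume, since we could see the folder structure; the second and third, which contain all the data, show up as RAW.

Steps taken on Debian: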

lsblk

sdc                            8:32   0   2.7T  0 disk  
├─sdc1                         8:33   0   2.4G  0 part  
├─sdc2                         8:34   0     2G  0 part  
├─sdc3                         8:35   0   2.7T  0 part

cat /proc/mdstat

Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md127 : inactive sdc3[0](S)
      2925444560 blocks super 1.2
       
unused devices: <none>

I guess there is no point in trying to mount it unless we make the array active:

mdadm -Asf
   mdadm: No arrays found in config file or automatically
mdadm -A /dev/md127
mdadm: /dev/md127 not identified in config file.

Something is missing from /etc/mdadm/mdadm.conf:

mdadm --detail --scan 
INACTIVE-ARRAY /dev/md127 metadata=1.2 name=DiskStation:2 UUID=51fe32e5:d1a74bf2:7c07fbd0:ce554944

mdadm --examine --scan >> /etc/mdadm/mdadm.conf
ARRAY /dev/md/2  metadata=1.2 UUID=51fe32e5:d1a74bf2:7c07fbd0:ce554944 name=DiskStation:2

I don't know why the second command outputs /dev/md/2, which does not exist, but I went into mdadm.conf and replaced it with md127. So the config is now:

# This configuration was auto-generated on Sat, 02 Nov 2019 20:42:28 +0200 by mkconf
ARRAY /dev/md127  metadata=1.2 UUID=51fe32e5:d1a74bf2:7c07fbd0:ce554944 name=DiskStation:2

I rebooted the machine and then ran the following:

mdadm --stop /dev/md127 
mdadm: stopped /dev/md127

Otherwise, apparently, any attempt to build or assemble the RAID complains that sdc3 is busy. Then I ran:

mdadm --build /dev/md127 --force --level=1 --raid-devices=1 /dev/sdc
mdadm: array /dev/md127 built and started.
cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
md127 : active raid1 sdc[0]
      2930266584 blocks super non-persistent [1/1] [U]
      
unused devices: <none>

That is progress, but it still does not recognise the filesystem, so the disk cannot be mounted. The new lsblk:

sdc                            8:32   0   2.7T  0 disk  
├─sdc1                         8:33   0   2.4G  0 part  
├─sdc2                         8:34   0     2G  0 part  
├─sdc3                         8:35   0   2.7T  0 part  
└─md127                        9:127  0   2.7T  0 raid1 
  ├─md127p1                  259:0    0   2.4G  0 part  
  ├─md127p2                  259:1    0     2G  0 part  
  └─md127p3                  259:2    0   2.7T  0 part

Now I try to mount it on a directory:

mount /dev/md127 /mnt/oldsynology/ -o ro
mount: /mnt/oldsynology: wrong fs type, bad option, bad superblock on /dev/md127, missing codepage or helper program, or other error.
mount /dev/md127p3 /mnt/oldsynology/ -o ro
mount: /mnt/oldsynology: unknown filesystem type 'linux_raid_member'.
mount /dev/sdc /mnt/oldsynology/ -o ro
mount: /mnt/oldsynology: /dev/sdc already mounted or mount point busy.
mount /dev/sdc3 /mnt/oldsynology/ -o ro
mount: /mnt/oldsynology: unknown filesystem type 'linux_raid_member'.

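I also tried the mount commands above with mount -t ext4, with the same result.

Even so, when I put the drive back into the Synology, smartctl could still read its data without any problem.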
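The "unknown filesystem type 'linux_raid_member'" message means the device carries an md superblock rather than a filesystem, so no -t option will help until the layer underneath is opened. The following is only a rough sketch of how one might decide the next step from the signature that blkid reports; the function name next_step and the wording of its hints are made up for illustration.

```shell
#!/bin/sh
# Map a blkid TYPE value to a suggested next step.
# next_step is a hypothetical helper, not part of any tool.
next_step() {
    case "$1" in
        linux_raid_member)
            echo "md member: inspect with mdadm --examine, then assemble" ;;
        LVM2_member)
            echo "LVM physical volume: activate with vgchange -ay, then list with lvs" ;;
        ext2|ext3|ext4|btrfs)
            echo "filesystem: mount read-only" ;;
        "")
            echo "no signature found" ;;
        *)
            echo "other signature: $1" ;;
    esac
}

# Usage (as root, on a real device):
#   next_step "$(blkid -o value -s TYPE /dev/sdc3)"
```

Run against each layer in turn (partition, md device, logical volume), this keeps peeling until a real filesystem type appears.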

smartctl -a /dev/sdc3
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.3.13-1-pve] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD30EFRX-68EUZN0
Serial Number:    WD-WCC4N2ELDDTF
LU WWN Device Id: 5 0014ee 20cc90c0c
Firmware Version: 82.00A82
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Tue Jun 30 23:53:12 2020 EEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                    was never started.
                    Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:        (38880) seconds.
Offline data collection
capabilities:            (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:    (   2) minutes.
Extended self-test routine
recommended polling time:    ( 390) minutes.
Conveyance self-test routine
recommended polling time:    (   5) minutes.
SCT capabilities:          (0x703d) SCT Status supported.
                    SCT Error Recovery Control supported.
                    SCT Feature Control supported.
                    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       37389
  3 Spin_Up_Time            0x0027   170   170   021    Pre-fail  Always       -       6466
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       33
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       16
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   050   050   000    Old_age   Always       -       36983
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       32
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       22
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       670
194 Temperature_Celsius     0x0022   105   104   000    Old_age   Always       -       45
196 Reallocated_Event_Count 0x0032   186   186   000    Old_age   Always       -       14
197 Current_Pending_Sector  0x0032   199   199   000    Old_age   Always       -       529
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       1

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Interrupted (host reset)      90%     36677         -
# 2  Short offline       Interrupted (host reset)      10%     36671         -
# 3  Extended offline    Interrupted (host reset)      10%     36626         -
# 4  Extended offline    Completed: read failure       10%     35995         30655688
# 5  Extended offline    Interrupted (host reset)      10%     35289         -
# 6  Extended offline    Interrupted (host reset)      10%     34262         -
# 7  Extended offline    Interrupted (host reset)      10%     33403         -
# 8  Extended offline    Interrupted (host reset)      90%     33174         -
# 9  Extended offline    Interrupted (host reset)      10%     33097         -
#10  Extended offline    Interrupted (host reset)      10%     33000         -
#11  Extended offline    Interrupted (host reset)      10%     32792         -
#12  Extended offline    Interrupted (host reset)      10%     32593         -
#13  Extended offline    Completed: read failure       10%     32350         30655688
#14  Extended offline    Completed: read failure       10%     28883         30655688
#15  Extended offline    Interrupted (host reset)      60%     27995         -
#16  Extended offline    Completed without error       00%     27835         -
#17  Extended offline    Completed without error       00%     27676         -
#18  Extended offline    Completed without error       00%     27508         -
#19  Extended offline    Completed without error       00%     27334         -
#20  Extended offline    Completed without error       00%     27167         -
#21  Extended offline    Completed without error       00%     26999         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
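Although the overall verdict above is PASSED, the attribute table is less reassuring: Current_Pending_Sector has a raw value of 529, Reallocated_Sector_Ct is 16, and three extended self-tests ended in a read failure at LBA 30655688. A minimal sketch for pulling such rows out of smartctl -A output follows; the choice of attribute IDs 5, 197 and 198 and the function name flag_smart_attrs are assumptions of this sketch, not a smartmontools feature.

```shell
#!/bin/sh
# Read `smartctl -A` style rows on stdin and print the watched attributes
# whose raw value (10th column) is greater than zero.
# Watched IDs: 5 (reallocated), 197 (pending), 198 (offline uncorrectable).
flag_smart_attrs() {
    awk '$1 == 5 || $1 == 197 || $1 == 198 {
        if ($10 + 0 > 0) printf "%s raw=%s\n", $2, $10
    }'
}

# Usage (as root, on a real device):
#   smartctl -A /dev/sdc | flag_smart_attrs
```

Fed the table above, it prints the Reallocated_Sector_Ct and Current_Pending_Sector rows, which suggests the drive should be read as little as possible before it has been copied.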

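I hope someone can solve this puzzle. So far I have not found a solution in any similar thread; every case seems to involve slightly different variables.

Answer 1

This:

mount /dev/md127p3 /mnt/oldsynology/ -o ro
mount: /mnt/oldsynology: unknown filesystem type 'linux_raid_member'.

suggests that /dev/md127p3 itself contains another array, which you can inspect with mdadm -E /dev/md127p3

That said, before fiddling with SHR on a Linux machine, I would simply try cloning the disk and plugging the clone into the original DiskStation.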
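The reason is probably that the array created with --build sits on the whole disk /dev/sdc and has no superblock of its own (mdstat reports "super non-persistent"), so it merely mirrors the raw disk, partition table included, and its third partition is still the untouched md member. A minimal sketch of examining and assembling the real member instead, under several assumptions: the member is /dev/sdc3 as in the question, /dev/md2 is an arbitrary array name, and an LVM layer may or may not sit on top (SHR volumes often have one). Commands are only printed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: each command is printed, not executed.
# Set DRY_RUN=0 (as root) to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# member: partition carrying the md superblock (sdc3 in the question)
# array:  md device to assemble into (hypothetical name)
inspect_and_assemble() {
    member=${1:-/dev/sdc3}
    array=${2:-/dev/md2}
    run mdadm --stop /dev/md127      # release the array built on the whole disk
    run mdadm --examine "$member"    # show the on-disk superblock (level, UUID, state)
    run mdadm --assemble --run --readonly "$array" "$member"
    run vgscan                       # look for an LVM volume group on the array
    run vgchange -ay                 # activate any volume group that was found
    run lvs                          # list logical volumes that could be mounted
}

# Usage:
#   inspect_and_assemble                      # print the plan
#   DRY_RUN=0 inspect_and_assemble /dev/sdc3  # run it for real, as root
```

If lvs lists a logical volume, that device is the one to mount with -o ro; if it lists nothing, the filesystem may sit directly on the assembled md device. On a read-only array an ext4 mount may additionally need the noload option so that the journal is not replayed.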
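One possible way to make that clone is GNU ddrescue, which keeps a map file of unreadable areas and so copes with the read errors shown in the SMART log. This is a sketch only: /dev/sdX stands for a hypothetical target disk at least as large as the source, rescue.map is an arbitrary file name, and the retry count of 3 is a guess. Commands are printed rather than executed unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: each command is printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# clone_disk SOURCE TARGET [MAPFILE]
# Two-pass GNU ddrescue clone; refuses missing or identical devices.
clone_disk() {
    src=$1
    dst=$2
    map=${3:-rescue.map}
    if [ -z "$src" ] || [ -z "$dst" ] || [ "$src" = "$dst" ]; then
        echo "usage: clone_disk SOURCE TARGET [MAPFILE]" >&2
        return 1
    fi
    # Pass 1: copy everything that reads cleanly, skip the slow scraping phase.
    run ddrescue -f -n "$src" "$dst" "$map"
    # Pass 2: revisit the bad areas with direct access and up to 3 retry passes.
    run ddrescue -f -d -r3 "$src" "$dst" "$map"
}

# Usage:
#   clone_disk /dev/sdc /dev/sdX            # print the plan
#   DRY_RUN=0 clone_disk /dev/sdc /dev/sdX  # run it for real, as root
```

The target is overwritten completely, so the device names deserve a second look before switching off the dry run.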
