Disk replacement in an encrypted LUKS LVM without data loss

This is the setup of my off-site rsync backup server.

Ubuntu 20.10, with 9 hard drives.

Disks /dev/sd[a-h] belong to the backup volume group.

The system lives on /dev/sdi.

The server is:

  • powered on via a network-controlled switch (otherwise disconnected from mains power)
  • configured for Wake-on-LAN
  • configured with dropbear, so the cryptfs can be unlocked over the network and the system can boot (see the sketch below)
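For reference, a minimal sketch of such a dropbear unlock setup; the package name and paths are the ones used on Ubuntu 20.x and are assumptions here, as the original post doesn't show this step:

# install dropbear support for the initramfs (assumed package/paths, Ubuntu 20.x)
apt install dropbear-initramfs
# authorize an SSH key for the early-boot environment
cat ~/.ssh/id_ed25519.pub >> /etc/dropbear-initramfs/authorized_keys
update-initramfs -u
# at boot, from another machine:
ssh root@SERVER_IP    # in the dropbear shell, run: cryptroot-unlock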

Initial LVM-on-LUKS setup:

cryptsetup luksFormat --hash=sha512 --key-size=512 --cipher=aes-xts-plain64 --verify-passphrase /dev/sda
cryptsetup luksFormat --hash=sha512 --key-size=512 --cipher=aes-xts-plain64 --verify-passphrase /dev/sdb
cryptsetup luksFormat --hash=sha512 --key-size=512 --cipher=aes-xts-plain64 --verify-passphrase /dev/sdc
cryptsetup luksFormat --hash=sha512 --key-size=512 --cipher=aes-xts-plain64 --verify-passphrase /dev/sdd
cryptsetup luksFormat --hash=sha512 --key-size=512 --cipher=aes-xts-plain64 --verify-passphrase /dev/sde
cryptsetup luksFormat --hash=sha512 --key-size=512 --cipher=aes-xts-plain64 --verify-passphrase /dev/sdf
cryptsetup luksFormat --hash=sha512 --key-size=512 --cipher=aes-xts-plain64 --verify-passphrase /dev/sdg
cryptsetup luksFormat --hash=sha512 --key-size=512 --cipher=aes-xts-plain64 --verify-passphrase /dev/sdh

cryptsetup luksOpen /dev/sda luks_sda
cryptsetup luksOpen /dev/sdb luks_sdb
cryptsetup luksOpen /dev/sdc luks_sdc
cryptsetup luksOpen /dev/sdd luks_sdd
cryptsetup luksOpen /dev/sde luks_sde
cryptsetup luksOpen /dev/sdf luks_sdf
cryptsetup luksOpen /dev/sdg luks_sdg
cryptsetup luksOpen /dev/sdh luks_sdh
pvcreate /dev/mapper/luks_sda
pvcreate /dev/mapper/luks_sdb
pvcreate /dev/mapper/luks_sdc
pvcreate /dev/mapper/luks_sdd
pvcreate /dev/mapper/luks_sde
pvcreate /dev/mapper/luks_sdf
pvcreate /dev/mapper/luks_sdg
pvcreate /dev/mapper/luks_sdh
vgcreate tiburon_backup_vg /dev/mapper/luks_sda

Added the remaining /dev/mapper/luks_sd* devices to the VG, then created the LV and its mount point (sketch below).
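Those commands aren't shown in the original notes; a minimal sketch of what that step presumably looked like (the LV layout and fstab line are assumptions, though the names match the rest of this post, and ext4 matches the later e2fsck/resize2fs usage):

vgextend tiburon_backup_vg /dev/mapper/luks_sdb   # ...repeated for luks_sdc through luks_sdh
lvcreate -l 100%FREE -n tiburon_backup tiburon_backup_vg
mkfs.ext4 /dev/mapper/tiburon_backup_vg-tiburon_backup
mkdir -p /tiburon_backup
echo '/dev/mapper/tiburon_backup_vg-tiburon_backup /tiburon_backup ext4 defaults 0 2' >> /etc/fstab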

Updated /etc/crypttab with one entry per luks_sd*:

luks_sd[a-h] /dev/sd[a-h] /etc/luks-keys/luks_sd[a-h] luks
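Note that these entries reference key files under /etc/luks-keys/, while the luksFormat commands above only set a passphrase, so each key file has to be created and enrolled in a spare key slot first. A minimal sketch for one disk (key size and file name are assumptions chosen to match the crypttab entries):

# create a random key file and add it to a free LUKS key slot
mkdir -p /etc/luks-keys
dd if=/dev/urandom of=/etc/luks-keys/luks_sda bs=512 count=1
chmod 0400 /etc/luks-keys/luks_sda
cryptsetup luksAddKey /dev/sda /etc/luks-keys/luks_sda   # prompts for the existing passphrase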

Then updated the initramfs:

update-initramfs -uv
reboot

Everything was fine for 7 years, until now: I need to replace /dev/sdf because its bad sector count keeps growing.

I don't know how to proceed without copying the 5TB of data elsewhere and without losing any of it.

This is what I have figured out so far (in order not to lose data):

cryptsetup status

cryptswap1
luks_sde
Tiburon2--vg-root
luks_sda
luks_sdf                 #problematic luks disk
Tiburon2--vg-swap_1
luks_sdb
luks_sdg
tiburon_backup_vg-tiburon_backup          #problematic vg-lv
luks_sdc
luks_sdh
luks_sdd
sdb5_crypt

cryptsetup status luks_sdf

/dev/mapper/luks_sdf is active and is in use.
  type:    LUKS1
  cipher:  aes-xts-plain64
  keysize: 512 bits
  key location: dm-crypt
  device:  /dev/sdf
  sector size:  512
  offset:  4096 sectors
  size:    3907025072 sectors
  mode:    read/write



umount /tiburon_backup

vgchange -a n tiburon_backup_vg

  0 logical volume(s) in volume group "tiburon_backup_vg" now active


pvmove /dev/mapper/luks_sdf

  Insufficient free space: 476931 extents needed, but only 1 available
  Unable to allocate mirror extents for tiburon_backup_vg/pvmove0.
  Failed to convert pvmove LV to mirrored.
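(pvmove needs enough free extents elsewhere in the VG to receive everything still allocated on the source PV; the two numbers can be compared with standard LVM report fields, e.g.:)

# free extents in the VG vs. extents still allocated on the outgoing PV
vgs -o vg_name,vg_free_count tiburon_backup_vg
pvs -o pv_name,pv_pe_alloc_count /dev/mapper/luks_sdf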


#Therefore:
e2fsck -f /dev/mapper/tiburon_backup_vg-tiburon_backup


#FS/VG has 8TB, and 4TB is in use, therefore shrinking it to 5TB:

resize2fs -p /dev/mapper/tiburon_backup_vg-tiburon_backup  5T

Begin pass 2 (max = 262145)
Relocating blocks             XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Begin pass 3 (max = 40960)
Scanning inode table          XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX


lvreduce -L 5T /dev/mapper/tiburon_backup_vg-tiburon_backup

  WARNING: Reducing active logical volume to 5,00 TiB.
  THIS MAY DESTROY YOUR DATA (filesystem etc.)
Do you really want to reduce tiburon_backup_vg/tiburon_backup? [y/n]: y
  Size of logical volume tiburon_backup_vg/tiburon_backup changed from <7,80 TiB (2043653 extents) to 5,00 TiB.
  Logical volume tiburon_backup_vg/tiburon_backup successfully resized.



e2fsck -f /dev/mapper/tiburon_backup_vg-tiburon_backup

e2fsck 1.45.6 (20-Mar-2020)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/tiburon_backup_vg-tiburon_backup: 11/176128 files (0.0% non-contiguous), 281453/1408000 blocks



Now pvscan shows /dev/mapper/luks_sdf as empty:

PV /dev/mapper/luks_sdf VG tiburon_backup_vg lvm2 [<1,82 TiB / 1,82 TiB free]

So if I now run:

pvmove /dev/mapper/luks_sdf

it should mirror this PV's remaining extents onto the free space elsewhere in the VG, right? (Or not?)

After that, I plan to do:

vgchange -a n tiburon_backup_vg

cryptsetup close luks_sdf

vgreduce tiburon_backup_vg /dev/mapper/luks_sdf

pvremove /dev/sdf


#remove luks_sdf from /etc/crypttab

Will this work? Or is there a better way to remove a failing disk from an LVM VG on top of LUKS?

Many thanks for any ideas.

Answer 1

Your planned sequence of operations needs some minor adjustments.

Yes, pvmove will do just that, if there are still allocated extents remaining. And if /dev/mapper/luks_sdf is in fact already completely free of LVM data, it does no harm.

If successful, pvdisplay /dev/mapper/luks_sdf should then display the exact same value in the Total PE and Free PE fields, and Allocated PE should be 0.
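(The same check in compact form, using standard LVM report fields:)

# total vs. allocated extents; the PV is empty when the allocated count is 0
pvs -o pv_name,pv_pe_count,pv_pe_alloc_count /dev/mapper/luks_sdf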

At that point you don't have to do vgchange -a n tiburon_backup_vg; just a vgreduce tiburon_backup_vg /dev/mapper/luks_sdf to remove it from the VG (as it no longer holds any LVM data, you can do this online).

Since your LVM is on top of LUKS, it would be important to do pvremove /dev/mapper/luks_sdf before cryptsetup close luks_sdf: after closing, the system will only see the encrypted contents of /dev/sdf, and if you attempt pvremove /dev/sdf it will tell you there is no LVM header to remove (as it will only see meaningless encrypted data).

That said, in this situation pvremove is not strictly necessary: once the disk has been removed from the VG, LVM will no longer require its presence and won't mind even if you hot-unplug it. (Don't hot-unplug it if your hardware doesn't actually support hot-plugging.)

Before closing it, remember to remove or comment out the /dev/sdf entry in /etc/crypttab and update the initramfs, otherwise the system will drop you into emergency mode at the next boot, because it will try to activate LUKS on /dev/sdf and either no longer find the disk, or find a new disk in its place without an existing LUKS header.
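Concretely, that clean-up could look like this (the sed invocation is just one way to comment out the entry):

# comment out the failing disk's crypttab entry, then rebuild the initramfs
sed -i '/luks_sdf/s/^/#/' /etc/crypttab
update-initramfs -u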

Answer 2

I hope this test will be useful to someone in the future.

root@Tiburon3:~# pvscan
  PV /dev/mapper/luks_sda     VG bck_vg          lvm2 [<232.87 GiB / 0    free]
  PV /dev/mapper/luks_sdb     VG bck_vg          lvm2 [<931.50 GiB / 0    free]
  PV /dev/mapper/luks_sdc     VG bck_vg          lvm2 [<1.82 TiB / 0    free]
  PV /dev/mapper/luks_sdd     VG bck_vg          lvm2 [149.03 GiB / 0    free]
  PV /dev/mapper/luks_sde     VG bck_vg          lvm2 [<1.82 TiB / 0    free]
  PV /dev/mapper/luks_sdf     VG bck_vg          lvm2 [<1.82 TiB / 0    free]
  PV /dev/mapper/luks_sdg     VG bck_vg          lvm2 [<931.50 GiB / 0    free]
  PV /dev/mapper/luks_sdh     VG bck_vg          lvm2 [149.03 GiB / 79.83 GiB free]
  PV /dev/mapper/sdb5_crypt   VG Tiburon3-vg     lvm2 [<148.53 GiB / <111.28 GiB free]
  Total: 9 [7.94 TiB] / in use: 9 [7.94 TiB] / in no VG: 0 [0   ]

root@Tiburon3:~# df -hP /tiburon_backup/
Filesystem                     Size  Used Avail Use% Mounted on
/dev/mapper/bck_vg-lv_tib_bck  7.7T   93M  7.3T   1% /tiburon_backup

#Fill the FS with dummy data so that allocations reach the end of the VG:

cd /tiburon_backup/

FROMHERE=848
for ((i=FROMHERE; i>=1; i--))
do
    # allocate blocks without writing any data, so this is fast
    fallocate -l 10GB gentoo_root$i.img
done


root@Tiburon3:/tiburon_backup# df -hP /tiburon_backup/
Filesystem                     Size  Used Avail Use% Mounted on
/dev/mapper/bck_vg-lv_tib_bck  7.7T  7.3T   20G 100% /tiburon_backup

fallocate -l 10G gentoo_root000.img

root@Tiburon3:/tiburon_backup# pvscan
  PV /dev/mapper/luks_sda     VG bck_vg          lvm2 [<232.87 GiB / 0    free]
  PV /dev/mapper/luks_sdb     VG bck_vg          lvm2 [<931.50 GiB / 0    free]
  PV /dev/mapper/luks_sdc     VG bck_vg          lvm2 [<1.82 TiB / 0    free]
  PV /dev/mapper/luks_sdd     VG bck_vg          lvm2 [149.03 GiB / 0    free]
  PV /dev/mapper/luks_sde     VG bck_vg          lvm2 [<1.82 TiB / 0    free]
  PV /dev/mapper/luks_sdf     VG bck_vg          lvm2 [<1.82 TiB / 0    free]
  PV /dev/mapper/luks_sdg     VG bck_vg          lvm2 [<931.50 GiB / 0    free]
  PV /dev/mapper/luks_sdh     VG bck_vg          lvm2 [149.03 GiB / 79.83 GiB free]

  rm -f gentoo_root[1-9]*


root@Tiburon3:/tiburon_backup# df -hP .
  Filesystem                     Size  Used Avail Use% Mounted on
  /dev/mapper/bck_vg-lv_tib_bck  7.7T   10G  7.3T   1% /tiburon_backup


root@Tiburon3:/tiburon_backup# du -sh *
  10G   gentoo_root000.img

root@Tiburon3:/# btrfs filesystem resize -1T /tiburon_backup
Resize '/tiburon_backup' of '-1T'

Note that it took considerably longer than resizing the same filesystem when it was empty.

I wonder whether some --progress or --verbose flag could be used to get more output.

root@Tiburon3:/tiburon_backup# df -hP .
Filesystem                     Size  Used Avail Use% Mounted on
/dev/mapper/bck_vg-lv_tib_bck  6.8T   12G  6.8T   1% /tiburon_backup


umount /tiburon_backup


root@Tiburon3:/# pvscan
  PV /dev/mapper/luks_sda     VG bck_vg          lvm2 [<232.87 GiB / <232.87 GiB free]
  PV /dev/mapper/luks_sdb     VG bck_vg          lvm2 [<931.50 GiB / <931.50 GiB free]
  PV /dev/mapper/luks_sdc     VG bck_vg          lvm2 [<1.82 TiB / 0    free]
  PV /dev/mapper/luks_sdd     VG bck_vg          lvm2 [149.03 GiB / 149.03 GiB free]
  PV /dev/mapper/luks_sde     VG bck_vg          lvm2 [<1.82 TiB / 0    free]
  PV /dev/mapper/luks_sdf     VG bck_vg          lvm2 [<1.82 TiB / 469.00 GiB free]
  PV /dev/mapper/luks_sdg     VG bck_vg          lvm2 [<931.50 GiB / <931.50 GiB free]

  PV /dev/mapper/luks_sdh     VG bck_vg          lvm2 [149.03 GiB / 149.03 GiB free] #Let's remove this one for testing purposes: it's the smallest

  PV /dev/mapper/sdb5_crypt   VG Tiburon3-vg     lvm2 [<148.53 GiB / <111.28 GiB free]
  Total: 9 [7.94 TiB] / in use: 9 [7.94 TiB] / in no VG: 0 [0   ]


root@Tiburon3:/# pvdisplay /dev/mapper/luks_sdh
  --- Physical volume ---
  PV Name               /dev/mapper/luks_sdh
  VG Name               bck_vg
  PV Size               149.03 GiB / not usable <3.84 MiB
  Allocatable           yes
  PE Size               4.00 MiB
  Total PE              38152
  Free PE               20437

  Allocated PE          17715   #All right! This is the scenario I wanted to test: data migration before removing the disk.

  PV UUID               VRhdHD-5aam-9Wha-Qzwg-f8Iz-hosM-Wz0Q3q



root@Tiburon3:/# pvmove /dev/mapper/luks_sdh
    No extents available for allocation.

#To be on the safe side, instead of reducing the LV by the full 1TB I'll reduce it by 800GB, which should be enough to move the remaining allocated PEs off luks_sdh...

root@Tiburon3:/# lvreduce -L -800G /dev/mapper/bck_vg-lv_tib_bck
      WARNING: Reducing active logical volume to <6.94 TiB.
      THIS MAY DESTROY YOUR DATA (filesystem etc.)
    Do you really want to reduce bck_vg/lv_tib_bck? [y/n]: y
      Size of logical volume bck_vg/lv_tib_bck changed from <7.72 TiB (2023191 extents) to <6.94 TiB (1818391 extents).
      Logical volume bck_vg/lv_tib_bck successfully resized.


root@Tiburon3:/# pvmove /dev/mapper/luks_sdh  --alloc anywhere
  /dev/mapper/luks_sdh: Moved: 0.06%
[...]
  /dev/mapper/luks_sdh: Moved: 82.00%
[...]
  /dev/mapper/luks_sdh: Moved: 99.99%
Done!

root@Tiburon3:/# pvdisplay /dev/mapper/luks_sdh
  --- Physical volume ---
  PV Name               /dev/mapper/luks_sdh
  VG Name               bck_vg
  PV Size               149.03 GiB / not usable <3.84 MiB
  Allocatable           yes
  PE Size               4.00 MiB
  Total PE              38152
  Free PE               38152
  
  Allocated PE          0       #DATA WAS MOVED OUT OF THIS PV!
  
  PV UUID               VRhdHD-5aam-9Wha-Qzwg-f8Iz-hosM-Wz0Q3q



root@Tiburon3:/# vgreduce bck_vg /dev/mapper/luks_sdh
    Removed "/dev/mapper/luks_sdh" from volume group "bck_vg"


root@Tiburon3:/# mount -a

root@Tiburon3:/# cd /tiburon_backup/

root@Tiburon3:/tiburon_backup# du -sh *
    10G gentoo_root000.img    #DATA IS INTACT

Now:

cryptsetup close luks_sdh

And now, as telcoM suggested above, the wise thing to do is to remove (or comment out) luks_sdh from /etc/crypttab and update the initramfs:

update-initramfs -uv

Reboot to verify that it works as expected.

Now I'll go run this on my private prod env ;)

@telcoM, thank you very much for your advice!
