This is the setup of my off-site rsync backup server.
Ubuntu 20.10, with 9 hard drives.
Disks /dev/sd[a-h] belong to the backup volume group.
The OS lives on /dev/sdi.
The server is:
- powered on via a network-controlled switch (otherwise it is disconnected from mains power)
- configured for Wake-on-LAN
- configured with dropbear, so the cryptfs can be unlocked over the network and the system can finish booting
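For context, a minimal sketch of how such a dropbear remote-unlock setup typically looks on Ubuntu; the exact options, port and hostname below are assumptions, not taken from the original post:
# install dropbear support in the initramfs
apt install dropbear-initramfs
# listen on port 2222 in the initramfs; -s disables password logins, -j/-k disable port forwarding
echo 'DROPBEAR_OPTIONS="-p 2222 -s -j -k"' >> /etc/dropbear-initramfs/config
cat ~/.ssh/id_ed25519.pub >> /etc/dropbear-initramfs/authorized_keys
update-initramfs -u
# at boot time, from a remote machine, enter the LUKS passphrase:
ssh -p 2222 root@backup-server cryptroot-unlock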
Initial LVM-on-LUKS setup:
cryptsetup luksFormat --hash=sha512 --key-size=512 --cipher=aes-xts-plain64 --verify-passphrase /dev/sda
cryptsetup luksFormat --hash=sha512 --key-size=512 --cipher=aes-xts-plain64 --verify-passphrase /dev/sdb
cryptsetup luksFormat --hash=sha512 --key-size=512 --cipher=aes-xts-plain64 --verify-passphrase /dev/sdc
cryptsetup luksFormat --hash=sha512 --key-size=512 --cipher=aes-xts-plain64 --verify-passphrase /dev/sdd
cryptsetup luksFormat --hash=sha512 --key-size=512 --cipher=aes-xts-plain64 --verify-passphrase /dev/sde
cryptsetup luksFormat --hash=sha512 --key-size=512 --cipher=aes-xts-plain64 --verify-passphrase /dev/sdf
cryptsetup luksFormat --hash=sha512 --key-size=512 --cipher=aes-xts-plain64 --verify-passphrase /dev/sdg
cryptsetup luksFormat --hash=sha512 --key-size=512 --cipher=aes-xts-plain64 --verify-passphrase /dev/sdh
cryptsetup luksOpen /dev/sda luks_sda
cryptsetup luksOpen /dev/sdb luks_sdb
cryptsetup luksOpen /dev/sdc luks_sdc
cryptsetup luksOpen /dev/sdd luks_sdd
cryptsetup luksOpen /dev/sde luks_sde
cryptsetup luksOpen /dev/sdf luks_sdf
cryptsetup luksOpen /dev/sdg luks_sdg
cryptsetup luksOpen /dev/sdh luks_sdh
pvcreate /dev/mapper/luks_sda
pvcreate /dev/mapper/luks_sdb
pvcreate /dev/mapper/luks_sdc
pvcreate /dev/mapper/luks_sdd
pvcreate /dev/mapper/luks_sde
pvcreate /dev/mapper/luks_sdf
pvcreate /dev/mapper/luks_sdg
pvcreate /dev/mapper/luks_sdh
vgcreate tiburon_backup_vg /dev/mapper/luks_sda
Added the remaining /dev/mapper/luks_sd* devices to the VG, then created the LV, the filesystem and its mount point (see the sketch below).
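A hypothetical reconstruction of those steps (the LV name, the ext4 filesystem and the /tiburon_backup mount point are taken from later in the post; the exact commands are my assumption):
vgextend tiburon_backup_vg /dev/mapper/luks_sdb /dev/mapper/luks_sdc /dev/mapper/luks_sdd /dev/mapper/luks_sde /dev/mapper/luks_sdf /dev/mapper/luks_sdg /dev/mapper/luks_sdh
lvcreate -l 100%FREE -n tiburon_backup tiburon_backup_vg
mkfs.ext4 /dev/mapper/tiburon_backup_vg-tiburon_backup
mkdir -p /tiburon_backup
mount /dev/mapper/tiburon_backup_vg-tiburon_backup /tiburon_backup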
Updated /etc/crypttab for each luks_sd*:
luks_sd[a-h] /dev/sd[a-h] /etc/luks-keys/luks_sd[a-h] luks
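The keyfiles under /etc/luks-keys are not shown being created in the post; here is a sketch of how such keyfiles are typically generated and enrolled (an assumption, shown for sda only):
mkdir -p /etc/luks-keys && chmod 0700 /etc/luks-keys
# 2 KiB of random data as the key file
dd if=/dev/urandom of=/etc/luks-keys/luks_sda bs=512 count=4
chmod 0400 /etc/luks-keys/luks_sda
# enroll the key file into a free LUKS key slot (prompts for an existing passphrase)
cryptsetup luksAddKey /dev/sda /etc/luks-keys/luks_sda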
Then updated the initramfs:
update-initramfs -uv
reboot
Everything was fine for 7 years, until now: I need to replace /dev/sdf because it is accumulating more and more bad sectors.
I don't know how to proceed without copying 5 TB of data around and without losing any data.
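For reference, a quick way to confirm a disk really is failing, assuming smartmontools is installed (this check is not part of the original post):
# reallocated/pending/uncorrectable sector counters from SMART
smartctl -a /dev/sdf | grep -iE 'reallocated|pending|uncorrect'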
Here is what I have figured out so far (in order not to lose data):
cryptsetup status
cryptswap1
luks_sde
Tiburon2--vg-root
luks_sda
luks_sdf #problematic luks disk
Tiburon2--vg-swap_1
luks_sdb
luks_sdg
tiburon_backup_vg-tiburon_backup #problematic vg-lv
luks_sdc
luks_sdh
luks_sdd
sdb5_crypt
cryptsetup status luks_sdf
/dev/mapper/luks_sdf is active and is in use.
type: LUKS1
cipher: aes-xts-plain64
keysize: 512 bits
key location: dm-crypt
device: /dev/sdf
sector size: 512
offset: 4096 sectors
size: 3907025072 sectors
mode: read/write
umount /tiburon_backup
vgchange -a n tiburon_backup_vg
0 logical volume(s) in volume group "tiburon_backup_vg" now active
pvmove /dev/mapper/luks_sdf
Insufficient free space: 476931 extents needed, but only 1 available
Unable to allocate mirror extents for tiburon_backup_vg/pvmove0.
Failed to convert pvmove LV to mirrored.
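The error means the VG has no free extents left to move the data into. A way to check this up front is to look at the free extent counts per VG and per PV; a sketch, not part of the original session:
vgs -o vg_name,vg_size,vg_free,vg_free_count tiburon_backup_vg
pvs -o pv_name,pv_size,pv_free,pv_pe_count,pv_pe_alloc_count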
#Therefore:
e2fsck -f /dev/mapper/tiburon_backup_vg-tiburon_backup
#FS/VG has 8TB, and 4TB is in use, therefore shrinking it to 5TB:
resize2fs -p /dev/mapper/tiburon_backup_vg-tiburon_backup 5T
Begin pass 2 (max = 262145)
Relocating blocks             XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Begin pass 3 (max = 40960)
Scanning inode table          XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
lvreduce -L 5T /dev/mapper/tiburon_backup_vg-tiburon_backup
WARNING: Reducing active logical volume to 5,00 TiB.
THIS MAY DESTROY YOUR DATA (filesystem etc.)
Do you really want to reduce tiburon_backup_vg/tiburon_backup? [y/n]: y
Size of logical volume tiburon_backup_vg/tiburon_backup changed from <7,80 TiB (2043653 extents) to 5,00 TiB.
Logical volume tiburon_backup_vg/tiburon_backup successfully resized.
e2fsck -f /dev/mapper/tiburon_backup_vg-tiburon_backup
e2fsck 1.45.6 (20-Mar-2020)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/mapper/tiburon_backup_vg-tiburon_backup: 11/176128 files (0.0% non-contiguous), 281453/1408000 blocks
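For reference, the filesystem shrink and the LV reduction can also be done in one step: lvreduce's -r/--resizefs option runs the filesystem check and resize2fs itself. A sketch with the same target size as above (an alternative, not what was actually run):
lvreduce -r -L 5T tiburon_backup_vg/tiburon_backup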
Now pvscan shows /dev/mapper/luks_sdf as empty:
PV /dev/mapper/luks_sdf VG tiburon_backup_vg lvm2 [<1,82 TiB / 1,82 TiB free]
So if I now run:
pvmove /dev/mapper/luks_sdf
it should mirror the remaining blocks of that PV onto other free space within the VG, right? (Or not?)
After that, I plan to do:
vgchange -a n tiburon_backup_vg
cryptsetup close luks_sdf
vgreduce tiburon_backup_vg /dev/mapper/luks_sdf
pvremove /dev/sdf
#remove luks_sdf from /etc/crypttab
Will this work? Or is there a better way to remove a failing disk from a VG sitting on top of LUKS?
Many thanks for any ideas.
Answer 1
Your sequence of operations needs only minor tweaking.
Yes, pvmove will do exactly that, if there are any allocated blocks remaining. And if /dev/mapper/luks_sdf is in fact already completely free of LVM data, running it won't hurt anything.
If it succeeds, pvdisplay /dev/mapper/luks_sdf should show exactly the same value in the Total PE and Free PE fields, and Allocated PE should be 0.
At that point you don't have to run vgchange -a n tiburon_backup_vg; just do a vgreduce tiburon_backup_vg /dev/mapper/luks_sdf to remove the PV from the VG (since it no longer holds any LVM data, you can do this on-line).
Because the LVM sits on top of LUKS, it is important to do this before cryptsetup close luks_sdf: afterwards the system will only see the encrypted contents of /dev/sdf, and if you try pvremove /dev/sdf it will tell you there is no LVM header to remove (as it will only see what looks like meaningless encrypted data).
In this case pvremove does not need to be run at all: once the disk has been removed from the VG, LVM will no longer require it to be present and won't mind even if you hot-unplug it. (Don't hot-unplug it if your hardware does not actually support hot-plugging.)
Before shutting down, remember to remove or comment out the /dev/sdf entry in /etc/crypttab and update the initramfs, or the system will drop you into emergency mode at boot, because it will try to activate the LUKS on /dev/sdf and will no longer find the disk (or will find a new disk with no existing LUKS header in its place).
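Putting the order described above into commands (a sketch; device names as in the question, and the sed line is just one possible way to comment out the crypttab entry):
pvmove /dev/mapper/luks_sdf                       # only needed if Allocated PE > 0
pvdisplay /dev/mapper/luks_sdf                    # expect Total PE == Free PE, Allocated PE == 0
vgreduce tiburon_backup_vg /dev/mapper/luks_sdf   # can be done with the VG online
cryptsetup close luks_sdf
sed -i 's|^luks_sdf|#luks_sdf|' /etc/crypttab     # or edit the file by hand
update-initramfs -u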
Answer 2
I hope this test will be useful to someone in the future.
root@Tiburon3:~# pvscan
PV /dev/mapper/luks_sda VG bck_vg lvm2 [<232.87 GiB / 0 free]
PV /dev/mapper/luks_sdb VG bck_vg lvm2 [<931.50 GiB / 0 free]
PV /dev/mapper/luks_sdc VG bck_vg lvm2 [<1.82 TiB / 0 free]
PV /dev/mapper/luks_sdd VG bck_vg lvm2 [149.03 GiB / 0 free]
PV /dev/mapper/luks_sde VG bck_vg lvm2 [<1.82 TiB / 0 free]
PV /dev/mapper/luks_sdf VG bck_vg lvm2 [<1.82 TiB / 0 free]
PV /dev/mapper/luks_sdg VG bck_vg lvm2 [<931.50 GiB / 0 free]
PV /dev/mapper/luks_sdh VG bck_vg lvm2 [149.03 GiB / 79.83 GiB free]
PV /dev/mapper/sdb5_crypt VG Tiburon3-vg lvm2 [<148.53 GiB / <111.28 GiB free]
Total: 9 [7.94 TiB] / in use: 9 [7.94 TiB] / in no VG: 0 [0 ]
root@Tiburon3:~# df -hP /tiburon_backup/
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/bck_vg-lv_tib_bck 7.7T 93M 7.3T 1% /tiburon_backup
#Fill the FS with dummy data to reach the end of the VG:
cd /tiburon_backup/
FROMHERE=848
for ((i=FROMHERE; i>=1; i--))
do
fallocate -l 10GB gentoo_root$i.img
done
root@Tiburon3:/tiburon_backup# df -hP /tiburon_backup/
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/bck_vg-lv_tib_bck 7.7T 7.3T 20G 100% /tiburon_backup
fallocate -l 10G gentoo_root000.img
root@Tiburon3:/tiburon_backup# pvscan
PV /dev/mapper/luks_sda VG bck_vg lvm2 [<232.87 GiB / 0 free]
PV /dev/mapper/luks_sdb VG bck_vg lvm2 [<931.50 GiB / 0 free]
PV /dev/mapper/luks_sdc VG bck_vg lvm2 [<1.82 TiB / 0 free]
PV /dev/mapper/luks_sdd VG bck_vg lvm2 [149.03 GiB / 0 free]
PV /dev/mapper/luks_sde VG bck_vg lvm2 [<1.82 TiB / 0 free]
PV /dev/mapper/luks_sdf VG bck_vg lvm2 [<1.82 TiB / 0 free]
PV /dev/mapper/luks_sdg VG bck_vg lvm2 [<931.50 GiB / 0 free]
PV /dev/mapper/luks_sdh VG bck_vg lvm2 [149.03 GiB / 79.83 GiB free]
rm -f gentoo_root[1-9]*
root@Tiburon3:/tiburon_backup# df -hP .
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/bck_vg-lv_tib_bck 7.7T 10G 7.3T 1% /tiburon_backup
root@Tiburon3:/tiburon_backup# du -sh *
10G gentoo_root000.img
root@Tiburon3:/#
btrfs filesystem resize -1T /tiburon_backup
Output:
Resize '/tiburon_backup' of '-1T'
Note that this took considerably longer than resizing an empty filesystem.
I wonder whether some --progress or --verbose option could be used to see more output.
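One indirect way to watch the progress (a workaround I am assuming here, not something from the original post) is to poll the space allocation from another shell while the resize runs:
watch -n 5 btrfs filesystem usage /tiburon_backup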
root@Tiburon3:/tiburon_backup# df -hP .
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/bck_vg-lv_tib_bck 6.8T 12G 6.8T 1% /tiburon_backup
umount /tiburon_backup
root@Tiburon3:/# pvscan
PV /dev/mapper/luks_sda VG bck_vg lvm2 [<232.87 GiB / <232.87 GiB free]
PV /dev/mapper/luks_sdb VG bck_vg lvm2 [<931.50 GiB / <931.50 GiB free]
PV /dev/mapper/luks_sdc VG bck_vg lvm2 [<1.82 TiB / 0 free]
PV /dev/mapper/luks_sdd VG bck_vg lvm2 [149.03 GiB / 149.03 GiB free]
PV /dev/mapper/luks_sde VG bck_vg lvm2 [<1.82 TiB / 0 free]
PV /dev/mapper/luks_sdf VG bck_vg lvm2 [<1.82 TiB / 469.00 GiB free]
PV /dev/mapper/luks_sdg VG bck_vg lvm2 [<931.50 GiB / <931.50 GiB free]
PV /dev/mapper/luks_sdh VG bck_vg lvm2 [149.03 GiB / 149.03 GiB free] #Let's remove this one, for testing purposes (smallest size)
PV /dev/mapper/sdb5_crypt VG Tiburon3-vg lvm2 [<148.53 GiB / <111.28 GiB free]
Total: 9 [7.94 TiB] / in use: 9 [7.94 TiB] / in no VG: 0 [0 ]
root@Tiburon3:/# pvdisplay /dev/mapper/luks_sdh
--- Physical volume ---
PV Name /dev/mapper/luks_sdh
VG Name bck_vg
PV Size 149.03 GiB / not usable <3.84 MiB
Allocatable yes
PE Size 4.00 MiB
Total PE 38152
Free PE 20437
Allocated PE 17715 **#alright! This is the scenario I wanted to test: data migration before removing the disk.**
PV UUID VRhdHD-5aam-9Wha-Qzwg-f8Iz-hosM-Wz0Q3q
root@Tiburon3:/# pvmove /dev/mapper/luks_sdh
No extents available for allocation.
#To be on the safe side I won't reduce the LV by the full 1 TB, only by 800 GB, which should be enough to move the remaining allocated PEs off luks_sdh...
root@Tiburon3:/# lvreduce -L -800G /dev/mapper/bck_vg-lv_tib_bck
WARNING: Reducing active logical volume to <6.94 TiB.
THIS MAY DESTROY YOUR DATA (filesystem etc.)
Do you really want to reduce bck_vg/lv_tib_bck? [y/n]: y
Size of logical volume bck_vg/lv_tib_bck changed from <7.72 TiB (2023191 extents) to <6.94 TiB (1818391 extents).
Logical volume bck_vg/lv_tib_bck successfully resized.
root@Tiburon3:/# pvmove /dev/mapper/luks_sdh --alloc anywhere
/dev/mapper/luks_sdh: Moved: 0.06%
[...]
/dev/mapper/luks_sdh: Moved: 82.00%
[...]
/dev/mapper/luks_sdh: Moved: 99.99%
Done!
root@Tiburon3:/# pvdisplay /dev/mapper/luks_sdh
--- Physical volume ---
PV Name /dev/mapper/luks_sdh
VG Name bck_vg
PV Size 149.03 GiB / not usable <3.84 MiB
Allocatable yes
PE Size 4.00 MiB
Total PE 38152
Free PE 38152
Allocated PE 0 **#DATA WAS MOVED OUT OF THIS PV!**
PV UUID VRhdHD-5aam-9Wha-Qzwg-f8Iz-hosM-Wz0Q3q
root@Tiburon3:/# vgreduce bck_vg /dev/mapper/luks_sdh
Removed "/dev/mapper/luks_sdh" from volume group "bck_vg"
root@Tiburon3:/# mount -a
root@Tiburon3:/# cd /tiburon_backup/
root@Tiburon3:/tiburon_backup# du -sh *
10G gentoo_root000.img **#DATA IS INTACT**
Now:
cryptsetup close luks_sdh
And now, as telcoM suggested above, the sensible thing is to remove (or comment out) luks_sdh from /etc/crypttab and update the initramfs:
update-initramfs -uv
and reboot to test that everything works as expected.
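An optional sanity check before rebooting (a sketch, assuming Ubuntu's initramfs-tools; not part of the original answer) is to confirm that the stale entry is gone from the freshly built initramfs:
unmkinitramfs /boot/initrd.img-$(uname -r) /tmp/initrd-check
grep -r luks_sdh /tmp/initrd-check || echo "luks_sdh no longer referenced"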
Now I'm going to run this on my private prod env ;)
@telcoM, thank you very much for your advice!