QEMU-KVM, drbd and corosync - VMs do not work after a reboot

On Debian 9.6 I have a QEMU-KVM virtualization setup. After a power problem the machine shut down. After powering it back on, I cannot start any virtual machine, because the following error appears:

error: internal error: process exited while connecting to monitor: 2022-02-03T12:01:58.403986Z qemu-system-x86_64: -drive file=/dev/drbd6,format=raw,if=none,id=drive-virtio-disk0,cache=none: Could not open '/dev/drbd6': Read-only file system
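
The "Read-only file system" part of the error comes from DRBD itself: a DRBD device that is Secondary (or has no backing disk attached) rejects opens for writing, so QEMU fails before the guest even starts. As a quick check of how a guest references the device, something like the following can be used (the domain name VM is a placeholder):

# virsh dumpxml VM | grep -B 2 -A 2 drbd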

This happens with all 4 virtual machines. fdisk shows only the following:

Disk /dev/sda: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00037a37

Device     Boot    Start        End    Sectors  Size Id Type
/dev/sda1  *        2048   19531775   19529728  9.3G fd Linux raid autodetect
/dev/sda2       19531776   35155967   15624192  7.5G fd Linux raid autodetect
/dev/sda3       35155968 1939451903 1904295936  908G fd Linux raid autodetect


Disk /dev/sdb: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x000e1911

Device     Boot    Start        End    Sectors  Size Id Type
/dev/sdb1  *        2048   19531775   19529728  9.3G fd Linux raid autodetect
/dev/sdb2       19531776   35155967   15624192  7.5G fd Linux raid autodetect
/dev/sdb3       35155968 1939451903 1904295936  908G fd Linux raid autodetect


Disk /dev/md0: 9.3 GiB, 9998098432 bytes, 19527536 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/md1: 7.5 GiB, 7998525440 bytes, 15622120 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes


Disk /dev/md2: 908 GiB, 974998331392 bytes, 1904293616 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
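
Note that fdisk only shows the RAID members and the md arrays; there is no trace of the LVM layer that sits on /dev/md2, nor of the DRBD backing devices. One way to inspect the whole storage stack (standard util-linux and LVM commands):

# lsblk
# pvs

If pvs lists /dev/md2 as a physical volume of a volume group but lsblk shows no logical volumes on top of it, the volume group exists while its LVs are simply not active.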

I realize this is a drbd problem (or possibly corosync as well); I did not know this setup existed, and this is where it has to be fixed. Here is some information, identical on both machines:

# service drbd status
● drbd.service - LSB: Control DRBD resources.
   Loaded: loaded (/etc/init.d/drbd; generated; vendor preset: enabled)
   Active: active (exited) since Tue 2022-02-08 11:34:48 CET; 6min ago
     Docs: man:systemd-sysv-generator(8)
  Process: 12711 ExecStop=/etc/init.d/drbd stop (code=exited, status=0/SUCCESS)
  Process: 12793 ExecStart=/etc/init.d/drbd start (code=exited, status=0/SUCCESS)

Feb 08 11:34:47 brain systemd[1]: Starting LSB: Control DRBD resources....
Feb 08 11:34:47 brain drbd[12793]: Starting DRBD resources:[
Feb 08 11:34:47 brain drbd[12793]:      create res: r0 r1 r10 r2 r3 r4 r5 r6 r7 r8 r9
Feb 08 11:34:47 brain drbd[12793]:    prepare disk: r0 r1 r10 r2 r3 r4 r5 r6 r7 r8 r9
Feb 08 11:34:47 brain drbd[12793]:     adjust disk: r0:failed(apply-al:20) r1:failed(apply-al:20) r10:failed(apply-al:20) r2:failed(apply-al:20) r3:failed(apply-al:20) r4:failed(apply-al:20) r5:failed(apply
Feb 08 11:34:47 brain drbd[12793]:      adjust net: r0 r1 r10 r2 r3 r4 r5 r6 r7 r8 r9
Feb 08 11:34:47 brain drbd[12793]: ]
Feb 08 11:34:48 brain drbd[12793]: WARN: stdin/stdout is not a TTY; using /dev/console.
Feb 08 11:34:48 brain systemd[1]: Started LSB: Control DRBD resources..
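
The telling part is "adjust disk: r0:failed(apply-al:20)": exit status 20 from the apply-al step generally means drbdmeta could not open the backing device. The failing step can be re-run by hand for a single resource (apply-al is a regular drbdadm subcommand in drbd-utils 8.4):

# drbdadm apply-al r0

This should print the same "open(...) failed" error that appears further down with drbdadm up.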



# cat /proc/drbd
version: 8.4.7 (api:1/proto:86-101)
srcversion: F7D2F0C9036CD0E796D5958
 0: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 1: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 2: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 3: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 4: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 5: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 6: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 7: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 8: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
 9: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
10: cs:Connected ro:Secondary/Secondary ds:Diskless/Diskless C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
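
ds:Diskless/Diskless on every volume means that neither node has its backing disk attached: the /dev/drbdN devices exist, but there is no data behind them, which is why QEMU cannot open them read-write. The per-resource states can also be queried directly; the output should simply mirror the cs: and ds: fields above:

# drbdadm cstate r0
# drbdadm dstate r0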

When I try to make one of the disks primary on VE1, I get an error:

# drbdadm primary r0
0: State change failed: (-2) Need access to UpToDate data
Command 'drbdsetup-84 primary 0' terminated with exit code 17

On VE2 (the secondary), drbdadm secondary r0 works.

# drbdadm up r0
open(/dev/vg0/lv-sheep) failed: No such file or directory
Command 'drbdmeta 0 v08 /dev/vg0/lv-sheep internal apply-al' terminated with exit code 20

I cannot find /dev/vg0 anywhere. Everything is in /dev/drbd/vg0/by-disk/lv-sheep.
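
The /dev/vg0/<lv> device nodes only exist while the logical volumes are active, so their absence does not by itself mean the data is gone. A non-destructive way to check (lvscan marks every known LV as ACTIVE or inactive):

# vgs
# lvscan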

I do not know whether these VMs still exist, or whether I should run a command sequence like the following (but see the warning below the list):

# drbdadm create-md r0
# drbdadm up r0
# drbdadm primary r0 --force
# mkfs.ext4 /dev/drbd0
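
Be aware that this sequence is destructive: create-md wipes the existing DRBD metadata and mkfs.ext4 wipes the filesystem, so it should be a last resort. Once the backing LV is reachable again, the existing metadata can first be inspected non-destructively while the resource is down (dump-md is a standard drbdadm subcommand):

# drbdadm dump-md r0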

Does anyone have any ideas?

EDIT: additional data

# vgdisplay
  --- Volume group ---
  VG Name               vg0
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  26
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                11
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               908.04 GiB
  PE Size               4.00 MiB
  Total PE              232457
  Alloc PE / Size       177664 / 694.00 GiB
  Free  PE / Size       54793 / 214.04 GiB
  VG UUID               cHjzTE-lZxc-J6Qs-35jD-3kRn-csJx-g5MgNy
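
Note the combination of "Cur LV 11" and "Open LV 0": the volume group contains eleven logical volumes and none of them is open, which is consistent with LVs that exist but are not active. They can be listed together with their state (the fifth character of lv_attr is 'a' when a volume is active):

# lvs -o lv_name,lv_size,lv_attr vg0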

# cat /etc/drbd.conf
# You can find an example in  /usr/share/doc/drbd.../drbd.conf.example

include "drbd.d/global_common.conf";
include "drbd.d/*.res";


# cat /etc/drbd.d/r1.res
resource r1 {
        device          /dev/drbd1;
        disk            /dev/vg0/lv-viewcenter;
        meta-disk       internal;

        startup {
#               become-primary-on both;
        }

        net {
                allow-two-primaries;
                after-sb-0pri discard-zero-changes;
                after-sb-1pri discard-secondary;
                after-sb-2pri disconnect;
                cram-hmac-alg sha1;
                shared-secret "T/L0zE/i9eiPI";
        }

        syncer {
                rate 200M;
        }

        on brain {
                address         10.0.0.1:7789;
        }

        on pinky {
                address         10.0.0.2:7789;
        }
}
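
Each resource maps a /dev/drbdN device onto a backing logical volume in vg0 (here /dev/vg0/lv-viewcenter), so as long as those LVs are missing, no resource can attach its disk. The configuration as drbd actually parses it can be double-checked with:

# drbdadm dump r1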

Answer 1

Thanks to Matt Kereczman's comment, everything works now. After running the "vgdisplay" command I could see the vg0 volume group. The next command I used was "lvdisplay", and it listed all my VMs.

The next step was to run this sequence of commands:

# vgscan --mknodes
File descriptor 8 (pipe:[270576]) leaked on vgscan invocation. Parent PID 15357: bash
Reading volume groups from cache.
Found volume group "vg0" using metadata type lvm2

# vgchange -a y
File descriptor 8 (pipe:[270576]) leaked on vgchange invocation. Parent PID 15357: bash
11 logical volume(s) in volume group "vg0" now active
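
At this point the /dev/vg0/* device nodes should be back. A quick way to confirm before touching drbd (standard LVM commands):

# lvscan
# ls -l /dev/vg0/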

All the logical volumes appeared. The next step was to make the resource primary, bring it up, and start the VM:

# drbdadm primary r6
# drbdadm up r6
# virsh start VM

Everything has been running smoothly since.
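
If the underlying cause was that vg0 was not auto-activated at boot (for example because the md arrays were assembled only after LVM's activation ran), it may be worth reviewing the activation settings; the options below are standard lvm2 configuration keys, mentioned only as a starting point rather than a confirmed fix:

# grep -nE 'volume_list|auto_activation_volume_list' /etc/lvm/lvm.conf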
