LVM volume group shared between Proxmox nodes with DRBD


Context: I have a Proxmox cluster in which a volume group is shared via DRBD and used for KVM VMs.

I am having problems with DRBD: when one node loses connectivity, the replication link between the nodes breaks. The effects are as follows:

  • DRBD state on the nodes: cs:WFConnection / cs:Standalone
  • To restart DRBD and get the resynchronization working again (cs:Connected), I do the following (see the sketch after this list):

    I stop the running VMs (otherwise: ERROR: Module drbd is in use), I deactivate the VG (with vgchange -an), then I restart the DRBD service and the resync runs.
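As a minimal sketch, this is the manual sequence I mean (the VM ID 100 is only an example; the VG name drbdvg matches the vgscan output further down):

# on the node that has to rejoin
qm stop 100              # stop every VM whose disks live on the DRBD-backed VG (example ID)
vgchange -an drbdvg      # deactivate the VG so nothing keeps /dev/drbd0 open
service drbd stop        # now the resource can actually be released
service drbd start       # reconnect; the resync then runs
cat /proc/drbd           # goal: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate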

The configuration is as follows:

/etc/drbd.conf:

# You can find an example in  /usr/share/doc/drbd.../drbd.conf.example

include "drbd.d/global_common.conf";
include "drbd.d/*.res";

global_common.conf:

global { usage-count no; }
common {
    protocol C;
    startup {
        degr-wfc-timeout 120;
#        become-primary-on proxmox001;
        become-primary-on both;
    }
    disk {
    }
    net {
        allow-two-primaries;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
    syncer {
        verify-alg md5;
        rate 30M;
    }
}
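My reading of these net options (an assumption based on the DRBD 8.3 documentation, not on anything specific to this cluster): with allow-two-primaries, a split brain detected while both nodes were Primary falls under after-sb-2pri, and disconnect means no automatic recovery, both sides simply drop the link, which is consistent with the cs:Standalone state described above. Annotated:

net {
    allow-two-primaries;                 # dual-primary: both nodes may hold the resource Primary
    after-sb-0pri discard-zero-changes;  # split brain, no primaries: take the data of the side that changed it
    after-sb-1pri discard-secondary;     # split brain, one primary: the secondary's changes are discarded
    after-sb-2pri disconnect;            # split brain, two primaries: no auto-resolution, nodes go Standalone
}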

r0.res:

resource r0 {
        protocol C;
        on proxmox001 {
                device /dev/drbd0;
                disk /dev/mapper/pve-lv_data;
                address 192.168.0.1:7788;
                meta-disk internal;
        }
        on proxmox002 {
                device /dev/drbd0;
                disk /dev/mapper/pve-lv_data;
                address 192.168.0.2:7788;
                meta-disk internal;
        }
}
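The resource definition above puts DRBD on top of the existing LV pve-lv_data; the drbdvg volume group seen in the vgscan output below then apparently sits on top of /dev/drbd0. A sketch of that layering, under the assumption that drbdvg was created the usual way (the lvm.conf filter line is also an assumption, only meant to illustrate keeping LVM off the backing LV):

pvcreate /dev/drbd0          # the DRBD device becomes the physical volume
vgcreate drbdvg /dev/drbd0   # VM disks are then LVs inside drbdvg

# /etc/lvm/lvm.conf (illustrative): scan the DRBD device, not the backing LV
# filter = [ "a|^/dev/drbd0$|", "r|^/dev/mapper/pve-lv_data$|" ]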

When a physical host becomes unstable, I lose the DRBD link.

Under normal conditions, /proc/drbd shows:

GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by root@proxmox001, 2013-04-24 12:55:32
 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:421142 nr:55715 dw:8498959 dr:10994034 al:1144 bm:420 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

When things go wrong, /proc/drbd returns:

cs:Standalone or cs:WFConnection st:Secondary/Unknown

When I restart DRBD to resync, I get an UpToDate error, or:

ERROR: Module drbd is in use proxmox
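When that "in use" / "held open" error shows up, the holder is normally either an LV that is still active inside drbdvg or a KVM process that still has a disk on it. A quick check, as a sketch using the names above:

lvs drbdvg                # LVs in the DRBD-backed VG; active ones keep /dev/drbd0 open
dmsetup info -c           # the "Open" column shows which device-mapper targets are held open
fuser -v /dev/drbd0       # userspace processes holding /dev/drbd0 directly (kernel holders
                          # such as device-mapper will not appear here)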

Here are the different tests I tried, following suggestions found on Google:

root@proxmox001:~# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "pve" using metadata type lvm2
  Found volume group "drbdvg" using metadata type lvm2


root@proxmox001:~# vgchange -an /dev/drbdvg
  Can't deactivate volume group "drbdvg" with 1 open logical volume(s)


root@proxmox001:~# /sbin/vgchange -a y
  4 logical volume(s) in volume group "pve" now active
  2 logical volume(s) in volume group "drbdvg" now active

root@proxmox001:~# cat /proc/drbd
version: 8.3.13 (api:88/proto:86-96)
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by root@proxmox001, 2013-04-24 12:55:32
 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:0 nr:0 dw:8523047 dr:11025118 al:1147 bm:421 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8080

root@proxmox001:~# drbdadm connect all

root@proxmox001:~# drbdadm verify r0
0: State change failed: (-15) Need a connection to start verify or resync
Command 'drbdsetup 0 verify' terminated with exit code 11

root@proxmox001:~# cat /proc/drbd
version: 8.3.13 (api:88/proto:86-96)
GIT-hash: 83ca112086600faacab2f157bc5a9324f7bd7f77 build by root@proxmox001, 2013-04-24 12:55:32
 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:0 nr:0 dw:8523371 dr:11025534 al:1147 bm:421 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8288

root@proxmox001:~# drbdadm secondary all
0: State change failed: (-12) Device is held open by someone
Command 'drbdsetup 0 secondary' terminated with exit code 11

root@proxmox001:~# drbdadm up r0
0: Failure: (124) Device is attached to a disk (use detach first)
Command 'drbdsetup 0 disk /dev/mapper/pve-lv_data /dev/mapper/pve-lv_data internal --set-defaults --create-device' terminated with exit code 10

root@proxmox001:~# service drbd stop
Stopping all DRBD resources:/dev/drbd0: State change failed: (-12) Device is held open by someone
ERROR: Module drbd is in use
.
root@proxmox001:~# drbdadm detach r0
0: State change failed: (-2) Need access to UpToDate data
Command 'drbdsetup 0 detach' terminated with exit code 17

root@proxmox001:~# vgchange -an /dev/drbdvg
  0 logical volume(s) in volume group "drbdvg" now active

root@proxmox001:~# service drbd stop
Stopping all DRBD resources:.

root@proxmox001:~# service drbd start
Starting DRBD resources:[ d(r0) s(r0) n(r0) ].
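Condensed, the order that finally let DRBD stop and start again in the transcript above is: release the VG first, then cycle the service, then check the connection. As a sketch (same names as above, run on the affected node):

vgchange -an drbdvg      # fails while an LV is still open, see the check above
service drbd stop
service drbd start
cat /proc/drbd           # if it stays cs:WFConnection or cs:Standalone,
drbdadm connect r0       # the peer side may also need an explicit connect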

But I always end up with a production outage…

Do you have a solution to work around this problem?

Thanks a lot!

EDIT:

If I replace the disk block in the global_common.conf configuration with:

disk {
    fencing resource-only;
}

then, if I have running VMs on only one of the hosts, I just need to restart DRBD to re-establish the replication link.

However, if I have running VMs on both hosts, I end up in the same situation as described at the top of this thread.
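For what it is worth, my assumption (based on the DRBD documentation, not on this cluster) is that fencing resource-only is normally paired with a fence-peer handler in a handlers block, otherwise there is nothing for the policy to call when a node becomes a disconnected Primary. The script path below is only a placeholder, not a recommendation:

disk {
    fencing resource-only;
}
handlers {
    # placeholder script: whatever fencing mechanism fits the cluster stack in use
    fence-peer "/usr/local/sbin/my-fence-peer.sh";
}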

Thanks
