Unable to get DRBD + TLS working

Debian Bookworm 12.5, DRBD, and the ktls-utils package from Proxmox. Everything worked fine until I enabled TLS.

ii  drbd-dkms                        9.2.8-2                        all
ii  drbd-reactor                     1.4.0-1                        amd64
ii  drbd-utils                       9.27.0-1                       amd64
ii  ktls-utils                       0.10-6                         amd64

DRBD runs on all 3 nodes; everything is in sync and working fine.

r0 role:Primary
  disk:UpToDate
  slave role:Secondary
    peer-disk:UpToDate
  tiebreaker role:Secondary
    peer-disk:UpToDate

Then I added net { tls yes; } on all of these nodes.

This is what I got:

dmesg

[  147.759827] drbd r0 slave: conn( Connecting -> NetworkFailure ) [disconnected]
[  147.759848] drbd r0 slave: Terminating sender thread
[  147.759853] drbd r0 slave: Starting sender thread (from drbd_r_r0 [755])
[  147.771131] drbd r0 slave: Connection closed
[  147.771135] drbd r0 slave: helper command: /sbin/drbdadm disconnected
[  147.771587] drbd r0 slave: helper command: /sbin/drbdadm disconnected exit code 0
[  147.771592] drbd r0 slave: conn( NetworkFailure -> Unconnected ) [disconnected]
[  147.977813] drbd r0 tcp:tiebreaker: dtt_send_page: size=80 len=80 sent=-95
[  147.977817] drbd r0 tiebreaker: conn( Connecting -> NetworkFailure ) [disconnected]
[  147.977824] drbd r0 tiebreaker: Terminating sender thread
[  147.977829] drbd r0 tiebreaker: Starting sender thread (from drbd_r_r0 [759])
[  147.987330] drbd r0 tiebreaker: Connection closed
[  147.987335] drbd r0 tiebreaker: helper command: /sbin/drbdadm disconnected
[  147.987744] drbd r0 tiebreaker: helper command: /sbin/drbdadm disconnected exit code 0
[  147.987749] drbd r0 tiebreaker: conn( NetworkFailure -> Unconnected ) [disconnected]
[  148.783089] drbd r0 slave: conn( Unconnected -> Connecting ) [connecting]

ngrep -d any port 7788

interface: any
filter: ( port 7788 ) and (ip || ip6)
####
T 192.168.0.X:52605 -> 192.168.0.Y:7788 [AP] #4
  ....Wf4Q.(.....\.D....x:.0.Y.k 2[..... <SKIPPED MANY LINES>
#####
T 192.168.0.Y:52571 -> 192.168.0.X:7788 [AP] #9
  <SKIPPED MANY LINES>

tcpdump port 7788

09:11:21.927899 IP 192.168.0.Z.39381 > 192.168.0.Y.7788: Flags [.], ack 2313, win 501, options [nop,nop,TS val 2232236805 ecr 466841460], length 0

journalctl -f -u tlshd

Apr 10 10:18:12 master tlshd[8385]: Handshake with slave ... was successful
Apr 10 10:18:12 master tlshd[8386]: Handshake with tiebreaker ... was successful
^^ fine on all nodes.

In verbose mode there is also the following:

DBG<1>././lib/cache_mngt.c:302  nl_cache_mngt_unregister: Unregistered cache operations genl/family

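I generated the certificates as described in "Encrypted Replication with DRBD" (LINBIT, https://linbit.com/blog/encrypted-replication-with-drbd/). I only fixed the CN to match the hostname and also added the IP just in case (I tried various versions, no luck). The certificates were generated on macOS with the openssl utility.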

-addext "subjectAltName = DNS:$domain, IP:$ip" \
-addext "extendedKeyUsage = serverAuth, clientAuth" \

This is what I got:

r0 role:Secondary
  disk:UpToDate quorum:no
  slave connection:Connecting
  tiebreaker connection:NetworkFailure

r0 role:Secondary
  disk:UpToDate quorum:no
  slave connection:Unconnected
  tiebreaker connection:Unconnected
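When comparing status output across several nodes, it can help to extract just the peers that are not connected. The following is a minimal sketch that assumes the text format shown above, and assumes that a connected peer prints a role: field and no connection: field, as in the healthy output earlier in this post. The function name unconnected_peers is hypothetical.

```python
def unconnected_peers(status_text):
    """Map peer name -> connection state for peers that report a 'connection:' field."""
    peers = {}
    for raw in status_text.splitlines():
        fields = raw.split()
        if len(fields) < 2:
            continue
        # The first token is the peer (or resource) name, the rest are key:value pairs.
        for field in fields[1:]:
            key, _, value = field.partition(":")
            if key == "connection":
                peers[fields[0]] = value
    return peers


status = """\
r0 role:Secondary
  disk:UpToDate quorum:no
  slave connection:Connecting
  tiebreaker connection:NetworkFailure
"""
print(unconnected_peers(status))
```

An empty result would mean no peer reported a connection: field, which under the assumption above corresponds to the fully connected state.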

I disabled cram-hmac-alg, data-integrity-alg and shared-secret just in case, leaving only "tls yes" to keep the net section clean, but no luck. It is a simple resource configuration with 3 nodes, connection-mesh and tls yes.

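Did I forget to add something to make it all work? I am running this in a local virtual environment, so there are no network problems. Something is wrong with the configuration, and I don't know what.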

r0.res

resource r0 {
        options {
                auto-promote no;
                on-suspended-primary-outdated force-secondary;
                quorum majority;
                on-no-quorum io-error;
                quorum-minimum-redundancy 2;
        }

        device /dev/drbd0;
        meta-disk internal;
        disk /dev/sdb1;

        protocol B;

        startup {
                wfc-timeout 15;
                degr-wfc-timeout 60;
        }
        net {
                tls yes;
        }
        on master {
                address 192.168.0.X:7788;
                node-id 0;
        }
        on slave {
                address 192.168.0.Y:7788;
                node-id 1;
        }
        on tiebreaker {
                address 192.168.0.Z:7788;
                node-id 2;
        }
        connection-mesh {
                hosts master slave tiebreaker;
        }
}

More tlshd logs:

Apr 11 09:37:19 master tlshd[1008]: Parsing a valid netlink message
Apr 11 09:37:19 master tlshd[1008]: No peer identities found
Apr 11 09:37:19 master tlshd[1008]: No certificates found
Apr 11 09:37:19 master tlshd[1008]: DBG<2>       ././lib/msg.c:572  nlmsg_free: msg 0x5573416ff750: Freed
Apr 11 09:37:19 master tlshd[1008]: DBG<2>       ././lib/msg.c:572  nlmsg_free: msg 0x5573416ff610: Freed
Apr 11 09:37:19 master tlshd[1008]: System config file: /etc/gnutls/config
Apr 11 09:37:19 master tlshd[1008]: Server x.509 truststore is /etc/tlshd.d/myCA.crt
Apr 11 09:37:19 master tlshd[1008]: System trust: Loaded 1 certificate(s).
Apr 11 09:37:19 master tlshd[1008]: Retrieved x.509 certificate from /etc/tlshd.d/master.crt
Apr 11 09:37:19 master tlshd[1008]: Retrieved private key from /etc/tlshd.d/master.key
Apr 11 09:37:19 master tlshd[1008]: gnutls(2): checking 13.02 (GNUTLS_AES_256_GCM_SHA384) for compatibility
Apr 11 09:37:19 master tlshd[1008]: gnutls(2): Selected (RSA) cert
Apr 11 09:37:19 master tlshd[1008]: gnutls(2): EXT[0x557341710040]: server generated SECP384R1 shared key
Apr 11 09:37:19 master tlshd[1009]: Server's trusted authorities:
Apr 11 09:37:19 master tlshd[1009]:    [0]: CN=MyCN
Apr 11 09:37:19 master tlshd[1009]: The certificate is trusted.
Apr 11 09:37:19 master tlshd[1009]: The peer offered 1 certificate(s).
Apr 11 09:37:19 master tlshd[1009]: Session description: (TLS1.3)-(ECDHE-SECP384R1)-(RSA-PSS-RSAE-SHA384)-(AES-256-GCM)
Apr 11 09:37:19 master tlshd[1009]: DBG<2>       ././lib/msg.c:277  __nlmsg_alloc: msg 0x5573416ff610: Allocated new message, maxlen=4096
Apr 11 09:37:19 master tlshd[1009]: DBG<2>       ././lib/msg.c:517  nlmsg_put: msg 0x5573416ff610: Added netlink header type=16, flags=0, pid=0, seq=0
Apr 11 09:37:19 master tlshd[1009]: DBG<2>       ././lib/msg.c:424  nlmsg_reserve: msg 0x5573416ff610: Reserved 4 (4) bytes, pad=4, nlmsg_len=20
Apr 11 09:37:19 master tlshd[1009]: DBG<2> ././lib/genl/genl.c:357  genlmsg_put: msg 0x5573416ff610: Added generic netlink header cmd=3 version=1
Apr 11 09:37:19 master tlshd[1009]: DBG<2>      ././lib/attr.c:470  nla_reserve: msg 0x5573416ff610: attr <0x557341703164> 2: Reserved 16 (10) bytes at offset +4 nlmsg_len=36
Apr 11 09:37:19 master tlshd[1009]: DBG<2>      ././lib/attr.c:507  nla_put: msg 0x5573416ff610: attr <0x557341703164> 2: Wrote 10 bytes at offset +4
Apr 11 09:37:19 master tlshd[1009]: DBG<2>       ././lib/msg.c:277  __nlmsg_alloc: msg 0x5573416ff750: Allocated new message, maxlen=164
Apr 11 09:37:19 master tlshd[1009]: DBG<2>       ././lib/msg.c:572  nlmsg_free: msg 0x5573416ff750: Freed
Apr 11 09:37:19 master tlshd[1009]: DBG<2>       ././lib/msg.c:277  __nlmsg_alloc: msg 0x55734170cac0: Allocated new message, maxlen=36
Apr 11 09:37:19 master tlshd[1009]: DBG<2>       ././lib/msg.c:572  nlmsg_free: msg 0x55734170cac0: Freed
Apr 11 09:37:19 master tlshd[1009]: DBG<2>       ././lib/msg.c:572  nlmsg_free: msg 0x5573416ff610: Freed
Apr 11 09:37:19 master tlshd[1009]: DBG<2>       ././lib/msg.c:277  __nlmsg_alloc: msg 0x5573416ff610: Allocated new message, maxlen=4096
Apr 11 09:37:19 master tlshd[1009]: DBG<2>       ././lib/msg.c:517  nlmsg_put: msg 0x5573416ff610: Added netlink header type=32, flags=0, pid=0, seq=0
Apr 11 09:37:19 master tlshd[1009]: DBG<2>       ././lib/msg.c:424  nlmsg_reserve: msg 0x5573416ff610: Reserved 4 (4) bytes, pad=4, nlmsg_len=20
Apr 11 09:37:19 master tlshd[1009]: DBG<2> ././lib/genl/genl.c:357  genlmsg_put: msg 0x5573416ff610: Added generic netlink header cmd=3 version=0
Apr 11 09:37:19 master tlshd[1009]: DBG<2>      ././lib/attr.c:470  nla_reserve: msg 0x5573416ff610: attr <0x557341703164> 1: Reserved 8 (4) bytes at offset +4 nlmsg_len=28
Apr 11 09:37:19 master tlshd[1009]: DBG<2>      ././lib/attr.c:507  nla_put: msg 0x5573416ff610: attr <0x557341703164> 1: Wrote 4 bytes at offset +4
Apr 11 09:37:19 master tlshd[1009]: DBG<2>      ././lib/attr.c:470  nla_reserve: msg 0x5573416ff610: attr <0x55734170316c> 2: Reserved 8 (4) bytes at offset +12 nlmsg_len=36
Apr 11 09:37:19 master tlshd[1009]: DBG<2>      ././lib/attr.c:507  nla_put: msg 0x5573416ff610: attr <0x55734170316c> 2: Wrote 4 bytes at offset +12
Apr 11 09:37:19 master tlshd[1009]: DBG<2>      ././lib/attr.c:470  nla_reserve: msg 0x5573416ff610: attr <0x557341703174> 3: Reserved 8 (4) bytes at offset +20 nlmsg_len=44
Apr 11 09:37:19 master tlshd[1009]: DBG<2>      ././lib/attr.c:507  nla_put: msg 0x5573416ff610: attr <0x557341703174> 3: Wrote 4 bytes at offset +20
Apr 11 09:37:19 master tlshd[1009]: DBG<2>       ././lib/msg.c:572  nlmsg_free: msg 0x5573416ff610: Freed
Apr 11 09:37:19 master tlshd[1009]: Handshake with slave (192.168.0.Y) was successful
Apr 11 09:37:19 master tlshd[1009]: DBG<1>././lib/cache_mngt.c:302  nl_cache_mngt_unregister: Unregistered cache operations genl/family
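Most of the log above is netlink library debug output (the DBG<n> lines) and gnutls traces, which makes the few tlshd messages that matter hard to spot. The sketch below filters a handful of lines copied from the log down to those messages. The function name handshake_summary is hypothetical, and the filter rules are only an assumption about which lines are noise.

```python
import re

# Netlink library debug lines look like "DBG<2>   ././lib/msg.c:572  nlmsg_free: ..."
NETLINK_DEBUG = re.compile(r"\bDBG<\d+>")


def handshake_summary(journal_lines):
    """Drop DBG<n> lines and gnutls(n) traces, keep the remaining tlshd messages."""
    kept = []
    for line in journal_lines:
        if NETLINK_DEBUG.search(line) or "gnutls(" in line:
            continue
        # Keep only the text after the "tlshd[pid]: " prefix when it is present.
        _, sep, message = line.partition("]: ")
        kept.append(message if sep else line)
    return kept


sample = [
    "Apr 11 09:37:19 master tlshd[1008]: No peer identities found",
    "Apr 11 09:37:19 master tlshd[1008]: No certificates found",
    "Apr 11 09:37:19 master tlshd[1008]: DBG<2>       ././lib/msg.c:572  nlmsg_free: msg 0x5573416ff750: Freed",
    "Apr 11 09:37:19 master tlshd[1008]: gnutls(2): Selected (RSA) cert",
    "Apr 11 09:37:19 master tlshd[1009]: The certificate is trusted.",
    "Apr 11 09:37:19 master tlshd[1009]: Handshake with slave (192.168.0.Y) was successful",
]
for message in handshake_summary(sample):
    print(message)
```

Reduced this way, the log shows "No peer identities found" and "No certificates found" followed by a trusted certificate and a successful handshake, so the tlshd side alone does not explain the NetworkFailure that DRBD reports.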

Answer 1

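Well, the tlshd log messages were very deceptive and led me to think that something was wrong with DRBD, but it was actually the kernel version. I installed 6.6.13 from the Debian Backports repository and it works fine now.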
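For anyone who wants to check a node quickly, here is a minimal sketch that compares the running kernel release against 6.6.13. That threshold is an assumption taken only from the fact that 6.6.13 worked here; it is not a documented minimum for DRBD TLS. The function names are hypothetical, and the release strings in the comments only illustrate the format.

```python
import platform
import re

# Assumption: 6.6.13 is used as the threshold only because it is the version
# reported as working in this answer, not because it is a documented minimum.
KNOWN_GOOD = (6, 6, 13)


def parse_kernel_release(release):
    """Turn a release string such as '6.1.0-18-amd64' into a tuple like (6, 1, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not match:
        raise ValueError("unrecognized kernel release: %r" % release)
    # A missing patch level (e.g. '6.8') is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def at_least_known_good(release, minimum=KNOWN_GOOD):
    """True if the release is at or above the version reported as working."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    running = platform.release()  # same string that `uname -r` prints
    print(running, "ok" if at_least_known_good(running) else "older than 6.6.13")
```

A result of "older than 6.6.13" does not prove that TLS will fail on that kernel; it only says the kernel is older than the one that worked in this case.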
