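I have built failover setups like this several times in small Proxmox clusters (Proxmox is Debian with additional packages). Since I could not find good documentation, I posted this question, and here is my own answer :-)
Idea: set up separate storage and services networks, each with failover in case one of the switches fails or is under maintenance. On the services network, we also want to segregate traffic further using VLANs.
- Use one bond in active-backup mode per network (bond0, bond1)
- Each bond has a primary interface that carries the traffic in normal operation (iface A for bond0, iface B for bond1)
- In a failover scenario the other network is used; because the storage and services switches are connected to each other, ARP packets will still find the desired endpoint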
|---------------[ storage switch ]
|                 x          x          x          x
|                 |          |          |          |
failover          |          |          |          |
link              x          x          x          x
|              iface A    iface A    iface A    iface A
|             [ Node 1 ] [ Node 2 ] [ Node 3 ] [ Node X ]
|              iface B    iface B    iface B    iface B
|                 x          x          x          x
|                 |          |          |          |
|                 |          |          |          |
|                 x          x          x          x
|---------------[ services switch ]
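- Now the interesting part: how do you create two bonds in parallel on the same interfaces? Possible solutions:
- Use VLANs on iface A and iface B, and bond the VLANs together
- Use traffic shaping (tc)
I tried to get both solutions running, but only the first one worked: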
Create VLANs on both interfaces
- iface A.100
- iface A.101
- iface B.100
- iface B.101
Create bonds over the VLANs
- bond0
  - slave iface A.100
  - slave iface B.100
- bond1
  - slave iface A.101
  - slave iface B.101
Create VLANs on the bond - you now have Q-in-Q
- bond1.5000
- bond1.XXX
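The naming scheme above can be sketched in a few lines of Python. This is only an illustration: the function name `plan_bonds` and the VLAN-to-bond mapping passed to it are assumptions made for the example, not part of any Proxmox or Debian tool.

```python
def plan_bonds(legs, vlan_to_bond):
    """Return {bond name: [one VLAN sub-interface per physical leg]}.

    legs         -- physical interfaces, e.g. ["ifaceA", "ifaceB"]
    vlan_to_bond -- VLAN ID -> bond name, e.g. {100: "bond0", 101: "bond1"}
    """
    return {
        bond: [f"{leg}.{vlan_id}" for leg in legs]
        for vlan_id, bond in vlan_to_bond.items()
    }


if __name__ == "__main__":
    # Hypothetical input mirroring the lists above.
    plan = plan_bonds(["ifaceA", "ifaceB"], {100: "bond0", 101: "bond1"})
    for bond, slaves in plan.items():
        print(bond, "<-", ", ".join(slaves))
```

Each bond ends up with one VLAN sub-interface on every physical leg, which is what lets both bonds share the same two NICs.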
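My challenge was working out where to put the bond-XXX parameters: they must go into the stanza of the first interface that is part of the bond (ifaceA.100 in my case), which is where miimon, updelay and downdelay are all set. Now check the result with cat /proc/net/bonding/bond0: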
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: ifaceA.100 (primary_reselect always)
Currently Active Slave: ifaceA.100
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 200
Down Delay (ms): 200
Slave Interface: ifaceA.100
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: XX:XX:XX:XX:XX:XX
Slave queue ID: 0
Slave Interface: ifaceB.100
MII Status: up
Speed: 10000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: YY:YY:YY:YY:YY:YY
Slave queue ID: 0
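If you want to check this on several nodes, the status text is easy to parse. The following is a minimal sketch, assuming the output format shown above; `parse_bonding`, `in_normal_mode` and the shortened `SAMPLE` text are names and data invented for this example.

```python
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Primary Slave: ifaceA.100 (primary_reselect always)
Currently Active Slave: ifaceA.100
Slave Interface: ifaceA.100
MII Status: up
Slave Interface: ifaceB.100
MII Status: up
"""


def parse_bonding(text):
    """Extract the primary, the active slave and all slaves from bonding status text."""
    info = {"primary": None, "active": None, "slaves": []}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "key: value" line
        key, value = key.strip(), value.strip()
        if key == "Primary Slave":
            # Drop the trailing "(primary_reselect ...)" note.
            info["primary"] = value.split()[0] if value else None
        elif key == "Currently Active Slave":
            info["active"] = value
        elif key == "Slave Interface":
            info["slaves"].append(value)
    return info


def in_normal_mode(info):
    """True when traffic currently flows over the configured primary leg."""
    return info["primary"] is not None and info["active"] == info["primary"]


if __name__ == "__main__":
    print(parse_bonding(SAMPLE))
```

On a real node you would feed it the contents of /proc/net/bonding/bond0 instead of `SAMPLE`.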
Here is my /etc/network/interfaces file:
iface lo inet loopback
auto vmbr0
iface vmbr0 inet static
# your usual proxmox mgmt interface
address A.B.C.D
gateway A.B.C.1
bridge_ports eth0
bridge_stp off
bridge_fd 0
# Proxmox Mgmt bridge
auto ifaceA
iface ifaceA inet manual
mtu 9100
#Storage net
auto ifaceB
iface ifaceB inet manual
mtu 9100
#Service net
auto ifaceA.100
iface ifaceA.100 inet manual
bond-master bond0
bond-primary ifaceA.100
bond-miimon 100
bond-updelay 200
bond-downdelay 200
bond-mode active-backup
mtu 9048
#Primary leg of storage bond0
auto ifaceA.101
iface ifaceA.101 inet manual
bond-master bond1
bond-miimon 100
bond-updelay 200
bond-downdelay 200
bond-mode active-backup
mtu 9048
#Secondary leg of services
auto ifaceB.100
iface ifaceB.100 inet manual
bond-miimon 100
bond-updelay 200
bond-downdelay 200
bond-master bond0
bond-mode active-backup
mtu 9048
#Secondary leg of storage
auto ifaceB.101
iface ifaceB.101 inet manual
bond-master bond1
bond-primary ifaceB.101
bond-miimon 100
bond-updelay 200
bond-downdelay 200
bond-mode active-backup
mtu 9048
#Primary leg of services
auto bond0
iface bond0 inet static
address W.X.Y.Z
bond-mode active-backup
bond-primary ifaceA.100
mtu 9048
#Storage for Ceph (pveceph init --network W.X.Y.0/24)
auto bond1
iface bond1 inet static
address Q.P.O.R
bond-mode active-backup
bond-primary ifaceB.101
mtu 9048
#Services/Corosync bond (pvecm create MYCLUSTER --bindnet0_addr Q.P.O.R --ring0_addr static-hostname-for-this-node)
auto bond1.5000
iface bond1.5000 inet manual
mtu 9000
# bond1 services on VLAN 5000, has no IP bound to it
auto vmbr5000
iface vmbr5000 inet manual
bridge-ports bond1.5000
bridge-stp off
bridge-fd 0
mtu 9000
# bond1.5000 services, which can be consumed within a VM
# AND ... more of the same
auto bond1.XXX
iface bond1.XXX inet manual
mtu 9000
auto vmbrXXX
iface vmbrXXX inet manual
bridge-ports bond1.XXX
bridge-stp off
bridge-fd 0
mtu 9000
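The MTU values in this file shrink at every VLAN layer: 9100 on the physical interfaces, 9048 on the VLAN legs and the bonds, 9000 on the VLANs on top of bond1. Below is a small sketch that checks each tagged layer leaves room for one 802.1Q tag (4 bytes). The rule "child MTU + 4 <= parent MTU" is a conservative assumption used for illustration, not a documented kernel requirement, and the function names are made up.

```python
VLAN_TAG_BYTES = 4  # size of one 802.1Q tag

# (layer name, MTU) from the physical NIC down to the Q-in-Q VLAN.
# The VLAN legs and the bonds share one entry because bonding adds no tag.
LAYERS = [
    ("ifaceA / ifaceB", 9100),
    ("ifaceX.100, ifaceX.101, bond0, bond1", 9048),
    ("bond1.5000", 9000),
]


def has_tag_headroom(parent_mtu, child_mtu):
    """True if the child MTU plus one VLAN tag still fits into the parent MTU."""
    return child_mtu + VLAN_TAG_BYTES <= parent_mtu


def check_stack(layers):
    """Return (parent, child, ok) for every pair of adjacent layers."""
    return [
        (parent, child, has_tag_headroom(parent_mtu, child_mtu))
        for (parent, parent_mtu), (child, child_mtu) in zip(layers, layers[1:])
    ]


if __name__ == "__main__":
    for parent, child, ok in check_stack(LAYERS):
        print(f"{child} inside {parent}: {'ok' if ok else 'too large'}")
```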
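The second approach, traffic shaping with tc, is the one I could not get to work. What I tried:
- Enable traffic shaping
- Assign a queue ID to each leg of the bond
- Try to match the VLAN tag to override the queue, so that the traffic uses a specific leg
This was the /etc/network/interfaces file for that attempt: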
iface lo inet loopback
auto ifaceA
iface ifaceA inet manual
bond-mode active-backup
bond-master bond0
bond-primary ifaceB
bond-miimon 100
bond-updelay 200
bond-downdelay 200
mtu 9100
auto ifaceB
iface ifaceB inet manual
bond-mode active-backup
bond-master bond0
bond-primary ifaceB
bond-miimon 100
bond-updelay 200
bond-downdelay 200
mtu 9100
# Choose the second interface as the default primary
auto bond0
iface bond0 inet static
address Q.O.P.R
bond-mode active-backup
bond-primary ifaceB
bond-miimon 100
bond-updelay 200
bond-downdelay 200
mtu 9100
post-up echo "ifaceA:2" > /sys/class/net/bond0/bonding/queue_id
post-up echo "ifaceB:3" > /sys/class/net/bond0/bonding/queue_id
post-up tc qdisc add dev bond0 handle 1 root multiq
#Bond over both
auto bond0.5000
iface bond0.5000 inet static
address H.I.K.L
mtu 9000
post-up tc filter add dev bond0 basic match 'meta(vlan mask 0xfffd eq 0x1388)' action skbedit queue_mapping 2
# Should go over iface A
auto bond0.XXX
iface bond0.XXX inet static
address K.L.M.N
mtu 9000
post-up tc filter add dev bond0 basic match 'meta(vlan mask 0xffff eq 0xVLAN-ID-as-Hex)' action skbedit queue_mapping 3
# Should go over iface B
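The value 0x1388 in the first filter is simply 5000 written in hexadecimal. The sketch below shows how such a filter line could be composed for any VLAN ID. The helper names are invented, the mask is fixed to 0xffff as an assumption, and since this tc approach did not work for me, the code only illustrates how the string is built, not a working setup. Note that standard 802.1Q VLAN IDs range from 1 to 4094; 5000 appears here only because the example above uses it.

```python
def vlan_hex(vlan_id):
    """Format a VLAN ID as the hex literal used in the tc meta match."""
    return f"0x{vlan_id:x}"


def queue_filter_cmd(dev, vlan_id, queue):
    """Build a tc filter line that maps one VLAN to one queue of the bond."""
    return (
        f"tc filter add dev {dev} basic match "
        f"'meta(vlan mask 0xffff eq {vlan_hex(vlan_id)})' "
        f"action skbedit queue_mapping {queue}"
    )


if __name__ == "__main__":
    print(vlan_hex(5000))
    print(queue_filter_cmd("bond0", 5000, 2))
```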