Recovering from a failed etcd2 / CoreOS cluster

Answer 1

The configuration file is located at:

/var/lib/waagent/CustomData

You should be able to edit it with:

sudo vim /var/lib/waagent/CustomData

The changes take effect after a reboot.

Answer 2

You can try modifying the etcd service definition at /run/systemd/system/etcd.service.d/20-cloudinit.conf. You should see something like:

[Service]
Environment="ETCD_ADDR=10.1.1.1:4001"
Environment="ETCD_DISCOVERY=https://discovery.etcd.io/47fabddb4eed191a09bf5b70ba93426a"
Environment="ETCD_PEER_ADDR=10.1.1.1:7001"

Change the discovery URL to a new one, then reload systemd and restart etcd:

systemctl daemon-reload
systemctl restart etcd

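However, you will need to test whether this change survives an Azure reboot. Files under /run live on a tmpfs, so the drop-in may be regenerated from the cloud-config at boot.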
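
As a rough sketch of the edit described above, the shell helpers below rewrite the ETCD_DISCOVERY line in a drop-in and read it back. The function names replace_discovery_url and current_discovery_url are hypothetical (they are not part of etcd or CoreOS), and the new discovery URL is a placeholder you must supply yourself.

```shell
#!/bin/sh
# Hypothetical helpers, not part of etcd or CoreOS.

# replace_discovery_url FILE NEW_URL
# Rewrites the value of the ETCD_DISCOVERY line in FILE, leaving all
# other lines untouched.
replace_discovery_url() {
    conf="$1"
    new_url="$2"
    tmp="$(mktemp)" || return 1
    # '|' is the sed delimiter because the URL itself contains '/'.
    sed "s|^Environment=\"ETCD_DISCOVERY=.*\"\$|Environment=\"ETCD_DISCOVERY=${new_url}\"|" "$conf" > "$tmp" \
        && cat "$tmp" > "$conf"   # write back in place, keeping FILE's permissions
    status=$?
    rm -f "$tmp"
    return "$status"
}

# current_discovery_url FILE
# Prints the discovery URL currently set in FILE (prints nothing if absent).
current_discovery_url() {
    sed -n 's|^Environment="ETCD_DISCOVERY=\(.*\)"$|\1|p' "$1"
}

# Example use (run as root; the URL is a placeholder):
#   replace_discovery_url /run/systemd/system/etcd.service.d/20-cloudinit.conf \
#       "https://discovery.etcd.io/<new-token>"
#   systemctl daemon-reload
#   systemctl restart etcd
```

Running current_discovery_url against the same drop-in after a reboot is one way to check whether the edit persisted or was overwritten.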

Answer 3

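If you remove two nodes from a three-node cluster, you lose quorum; a 3-node cluster can tolerate the loss of only one node. More on fault tolerance from the CoreOS documentation: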

Fault Tolerance Table

It is recommended to have an odd number of members in a cluster. Having an odd cluster size doesn't change the number needed for majority, but you gain a higher tolerance for failure by adding the extra member. You can see this in practice when comparing even and odd sized clusters:

Cluster Size    Majority    Failure Tolerance
1               1           0
3               2           1
4               3           1
5               3           2
6               4           2
7               4           3
8               5           3
9               5           4

https://coreos.com/etcd/docs/latest/admin_guide.html
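
The numbers in the table follow from simple integer arithmetic: a majority is floor(N/2) + 1 members, and the failure tolerance is whatever remains. The sketch below illustrates this; the function names majority, failure_tolerance and print_table are made up for this example and are not etcd commands.

```shell
#!/bin/sh
# Illustrative helpers only; these are not etcd commands.

# majority N: smallest number of members that forms a quorum, floor(N/2) + 1.
majority() {
    echo $(( $1 / 2 + 1 ))
}

# failure_tolerance N: members that can fail while a majority still remains.
failure_tolerance() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

# print_table N...: one tab-separated row per cluster size
# (size, majority, failure tolerance).
print_table() {
    for n in "$@"; do
        printf '%s\t%s\t%s\n' "$n" "$(majority "$n")" "$(failure_tolerance "$n")"
    done
}
```

For example, print_table 3 4 shows that both a 3-member and a 4-member cluster tolerate exactly one failure, which is why the extra even member buys nothing.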
