Master failure after adding a second master

Running under VirtualBox. 5 machines: 2 worker nodes (though I never even got that far), 1x load balancer (Ubuntu running HAProxy) at 192.168.20.10, configured as follows:

frontend kubernetes-frontend
    bind 0.0.0.0:6443
    mode tcp
    option tcplog
    default_backend kubernetes-backend

backend kubernetes-backend
    mode tcp
    option tcplog
    option tcp-check
    balance roundrobin
    default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 weight 100
    server kubernetes-master-1 192.168.20.21:6443 check
    server kubernetes-master-2 192.168.20.22:6443 check
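Before running kubeadm init, it can help to confirm the load-balancer path actually works. A quick sketch (assuming netcat is installed on the machines):

```shell
# From any machine on the 192.168.20.0/24 network.
# The frontend should accept connections even before any backend is up,
# because HAProxy itself binds 0.0.0.0:6443:
nc -zv 192.168.20.10 6443

# Each backend only answers once kube-apiserver is running on that master:
nc -zv 192.168.20.21 6443
nc -zv 192.168.20.22 6443
```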

2x master nodes, identical copies: kubeadm v1.19.4, Docker 19.03, CRI-O 1.17, Kubernetes v1.19.4.

kubernetes-master-1 192.168.20.21

kubernetes-master-2 192.168.20.22

Running the init command

sudo kubeadm init --control-plane-endpoint="192.168.20.10:6443" --upload-certs \
    --apiserver-advertise-address=192.168.20.21 --pod-network-cidr=10.100.0.0/16
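For reference, the same flags can also be expressed as a kubeadm configuration file (a sketch using the v1beta2 config API that kubeadm v1.19 accepts, passed with kubeadm init --config kubeadm.yaml --upload-certs):

```yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.20.21   # this master's own address
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.19.4
controlPlaneEndpoint: "192.168.20.10:6443"   # the load balancer
networking:
  podSubnet: 10.100.0.0/16
```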

It succeeds:

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join 192.168.20.10:6443 --token c2p4af.9s3aapujrfjkjlho \
    --discovery-token-ca-cert-hash sha256:ff3fc8d5e1a7ee16e2d48362cef4e3fa53df4c8fd672e69c8fe2c9e5826ab0c9 \
    --control-plane --certificate-key 57d92a387afbd601fba5da9e310523fa5ac8dfcdf0fd70dd8624a9950ce06457

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.20.10:6443 --token c2p4af.9s3aapujrfjkjlho \
    --discovery-token-ca-cert-hash sha256:ff3fc8d5e1a7ee16e2d48362cef4e3fa53df4c8fd672e69c8fe2c9e5826ab0c9 
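As the output warns, the uploaded certificates are deleted after two hours. If the control-plane join happens later than that, the certificates can be re-uploaded and a fresh join command printed (a sketch; run as root on master-1):

```shell
# Re-upload the control-plane certificates and print a new certificate key:
kubeadm init phase upload-certs --upload-certs

# Print a current worker join command; for a control-plane join, append
# --control-plane --certificate-key <key printed above>:
kubeadm token create --print-join-command
```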

(full output here)

So far, so good. But when I run the join command on master-2, it gets as far as

[etcd] Creating static Pod manifest for "etcd"

(full output here). It prints one more line, [etcd] Waiting for the new etcd member to join the cluster. This can take up to 40s, and then:

[kubelet-check] Initial timeout of 40s passed.

And that's it. master-1 (which responded before) now responds to

kubectl cluster-info

like this:

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Error from server: etcdserver: request timed out

The suggested command returns the following output:

kubectl cluster-info dump
Error from server (InternalError): an error on the server ("") has prevented the request from succeeding (get nodes)

And that's where it ends. I get the same result whether or not I install a pod network beforehand (I'm using Calico). The same VM images work fine with a single master: I can add worker nodes and run commands. But this always fails, no matter which guide I follow. I've checked etcd on master-1: it is (or was) running before the join is executed on master-2, and it is listening on the correct address (192.168.20.21), not localhost.
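One way to verify the etcd state described above (a sketch; it assumes the etcd static pod is named etcd-kubernetes-master-1 and that the default kubeadm certificate paths are in use):

```shell
# List etcd members from inside the etcd pod on master-1:
kubectl -n kube-system exec etcd-kubernetes-master-1 -- etcdctl \
  --endpoints=https://192.168.20.21:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  member list

# Confirm etcd listens on the node IP rather than localhost:
sudo ss -tlnp | grep 2379
```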

Any help is greatly appreciated! Thanks!

Answer 1

OK! So all of this was solved by adding --apiserver-advertise-address=192.168.20.22 to the join command on the second master. Good grief. So when you run the join command on a secondary master, make sure you add

--apiserver-advertise-address=

with that server's own address, not the first master's.
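Putting that together with the init output above, the full control-plane join command on kubernetes-master-2 becomes:

```shell
sudo kubeadm join 192.168.20.10:6443 --token c2p4af.9s3aapujrfjkjlho \
    --discovery-token-ca-cert-hash sha256:ff3fc8d5e1a7ee16e2d48362cef4e3fa53df4c8fd672e69c8fe2c9e5826ab0c9 \
    --control-plane --certificate-key 57d92a387afbd601fba5da9e310523fa5ac8dfcdf0fd70dd8624a9950ce06457 \
    --apiserver-advertise-address=192.168.20.22
```

Without the flag, kubeadm advertises the address of the default-route interface (in a typical VirtualBox setup, the NAT adapter), so the new etcd member announces a peer address the other master cannot reach and the join stalls.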
