Joining worker nodes to a Kubernetes cluster hangs forever

I created my Kubernetes cluster with the command

sudo kubeadm init --pod-network-cidr 192.168.0.0/16

I installed the Calico network plugin on the control plane node

kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.25.0/manifests/calico.yaml

I have two worker node servers, and I am trying to join them to my cluster with the command

sudo kubeadm join IP_OF_MY_SERVER:6443 --token ... --discovery-token-ca-cert-hash sha256:...

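But it hangs forever and nothing happens. This happens on both worker node servers. My worker nodes have full connectivity: they can reach the internet, and they can reach my control plane node by both IP and hostname. My cluster is up.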

kubectl get nodes
NAME          STATUS   ROLES           AGE     VERSION
k8s-control   Ready    control-plane   5h12m   v1.26.1

kubectl cluster-info
Kubernetes control plane is running at https://172.31.97.251:6443
CoreDNS is running at https://172.31.97.251:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

I am running the control plane and the worker nodes on Ubuntu 22.04.2 LTS (Jammy) with containerd version 1.6

containerd --version
containerd containerd.io 1.6.18 2456e983eb9e37e47538f59ea18f2043c9a73640

kubelet version 1.26.1

kubelet --version
Kubernetes v1.26.1

kubectl version 1.26

Client Version: version.Info Major:"1", Minor:"26"

My containerd is up and running under systemd

sudo systemctl status containerd
● containerd.service - containerd container runtime
     Loaded: loaded (/lib/systemd/system/containerd.service; enabled; vendor preset: enabled)
     Active: active (running) since Mon 2023-02-27 21:40:01 UTC; 52min ago

What am I missing?

Update: I checked the syslog on a worker node, and this is what I see

sudo cat /var/log/syslog
Feb 27 22:19:17 k8s-worker-1 systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Feb 27 22:19:17 k8s-worker-1 systemd[1]: kubelet.service: Failed with result 'exit-code'.
Feb 27 22:19:27 k8s-worker-1 systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 187.
Feb 27 22:19:27 k8s-worker-1 systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
Feb 27 22:19:27 k8s-worker-1 systemd[1]: Started kubelet: The Kubernetes Node Agent.
Feb 27 22:19:27 k8s-worker-1 kubelet[9361]: E0227 22:19:27.903339    9361 run.go:74] "command failed" err="failed to validate kubelet flags: the container runtime endpoint address was not specified or empty, use --container-runtime-endpoint to set"
Feb 27 22:19:27 k8s-worker-1 systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Feb 27 22:19:27 k8s-worker-1 systemd[1]: kubelet.service: Failed with result 'exit-code'.

Answer 1

It looks like you need to tell kubelet explicitly where to find the containerd socket. In /etc/systemd/system/kubelet.service, add this as an argument to the kubelet executable (usually the line ExecStart=/usr/local/bin/kubelet):

    --container-runtime-endpoint unix:///run/containerd/containerd.sock
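
To illustrate the edit described above, here is a minimal sketch that appends the flag to the ExecStart= line of a unit file. The helper name add_runtime_endpoint is hypothetical, and the sketch assumes GNU sed and a single-line ExecStart (no backslash continuations). It runs against a scratch file, so nothing on the system is changed.

```shell
#!/bin/sh
# Hypothetical helper: append the runtime-endpoint flag to the ExecStart= line.
# Assumption: ExecStart is a single line (no backslash continuations).
add_runtime_endpoint() {
  unit_file="$1"   # unit file to edit
  endpoint="$2"    # socket URI, e.g. unix:///run/containerd/containerd.sock

  # Do nothing if the flag is already present.
  if grep -q -- '--container-runtime-endpoint' "$unit_file"; then
    return 0
  fi

  # GNU sed: add the flag at the end of the ExecStart= line only.
  sed -i "/^ExecStart=/ s|\$| --container-runtime-endpoint ${endpoint}|" "$unit_file"
}

# Demonstration on a scratch file, not on the real unit.
scratch="$(mktemp)"
printf '[Service]\nExecStart=/usr/local/bin/kubelet\nRestart=always\n' > "$scratch"
add_runtime_endpoint "$scratch" "unix:///run/containerd/containerd.sock"
grep '^ExecStart=' "$scratch"
rm -f "$scratch"
```

If you try this on the real unit file, work on a copy first and compare the result with diff before installing it.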

Verify the location of containerd.sock. If it is not at /run/containerd/containerd.sock, you can find it by looking in /etc/containerd/config.toml under

[grpc]
address = "/run/containerd/containerd.sock"
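
The lookup can also be scripted. The following is a rough sketch, assuming the config file lives at /etc/containerd/config.toml and that the address value sits on one line inside the [grpc] table; the function name containerd_socket_from_config is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the address set in the [grpc] table of a
# containerd config.toml passed as the first argument.
containerd_socket_from_config() {
  awk '
    # A line that opens a table: remember whether it is [grpc].
    /^[ \t]*\[/ { in_grpc = ($0 ~ /^[ \t]*\[grpc\]/) }
    in_grpc && /^[ \t]*address[ \t]*=/ {
      sub(/^[^=]*=[ \t]*/, "")   # drop everything up to the value
      gsub(/"/, "")              # strip the quotes
      print
      exit
    }
  ' "$1"
}

config="/etc/containerd/config.toml"   # assumed default location
if [ -r "$config" ]; then
  sock="$(containerd_socket_from_config "$config")"
  # -S is true only if the path exists and is a socket.
  if [ -n "$sock" ] && [ -S "$sock" ]; then
    echo "containerd socket found: unix://$sock"
  else
    echo "no socket at '${sock}'; check that containerd is running" >&2
  fi
fi
```

The value printed after "unix://" is what would go after --container-runtime-endpoint.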

If the kubelet.service file is not at that location, you can find it by running systemctl status kubelet and looking at the Loaded: line.
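
As a small sketch of that lookup, the snippet below pulls the unit file path out of the Loaded: line. The function name unit_path_from_status is hypothetical, and the pattern assumes the "Loaded: loaded (path; ...)" layout that systemctl status prints for a loaded unit.

```shell
#!/bin/sh
# Hypothetical helper: read `systemctl status` output on stdin and print
# the unit file path taken from the "Loaded:" line.
unit_path_from_status() {
  sed -n 's/^[[:space:]]*Loaded: loaded (\([^;]*\);.*/\1/p'
}

# Only query systemd where systemctl is available.
if command -v systemctl >/dev/null 2>&1; then
  systemctl status kubelet --no-pager 2>/dev/null | unit_path_from_status
fi
```

Running systemctl cat kubelet additionally prints the unit together with any drop-in files, which is worth checking because a drop-in can override ExecStart.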

Finally, run systemctl daemon-reload so that systemd picks up the edited unit file, then systemctl restart kubelet
