kube-apiserver has disappeared

System details:

  • Kubernetes version: 1.20.1
  • Host OS: Ubuntu 20.04.4 LTS

I started my Kubernetes lab (running on VirtualBox) today, and every kubectl command fails like this:

$ kubectl get nodes -o wide
The connection to the server master:6443 was refused - did you specify the right host or port?

$ kubectl get po --all-namespaces
The connection to the server master:6443 was refused - did you specify the right host or port?

Indeed, nothing is listening on master:6443:

$ ss -tl4np
State                           Recv-Q                          Send-Q                                                     Local Address:Port                                                      Peer Address:Port                          Process
LISTEN                          0                               4096                                                           127.0.0.1:34627                                                          0.0.0.0:*
LISTEN                          0                               4096                                                           127.0.0.1:10248                                                          0.0.0.0:*
LISTEN                          0                               4096                                                       192.168.1.190:2379                                                           0.0.0.0:*
LISTEN                          0                               4096                                                           127.0.0.1:2379                                                           0.0.0.0:*
LISTEN                          0                               4096                                                       192.168.1.190:2380                                                           0.0.0.0:*
LISTEN                          0                               4096                                                           127.0.0.1:2381                                                           0.0.0.0:*
LISTEN                          0                               4096                                                           127.0.0.1:33133                                                          0.0.0.0:*
LISTEN                          0                               4096                                                           127.0.0.1:10257                                                          0.0.0.0:*
LISTEN                          0                               4096                                                           127.0.0.1:10259                                                          0.0.0.0:*
LISTEN                          0                               4096                                                       127.0.0.53%lo:53                                                             0.0.0.0:*
LISTEN                          0                               128                                                              0.0.0.0:22                                                             0.0.0.0:*

ccd@master:~$ curl https://master:6443
curl: (7) Failed to connect to master port 6443: Connection refused
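
The same check can be scripted instead of read off the table by eye. This is only a sketch, not something from my original troubleshooting: port_is_listening is a hypothetical helper name, and it assumes the column layout of the ss output shown above, where the fourth column is Local Address:Port.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the `ss -tln` style output on stdin contains
# a listener on the given port, and fails (exit status 1) otherwise.
# Hypothetical helper; assumes column 4 is "Local Address:Port".
port_is_listening() {
    awk -v p="$1" '
        NR > 1 {                      # skip the header row
            n = split($4, a, ":")     # the port is the part after the last ":"
            if (a[n] == p) found = 1
        }
        END { exit !found }
    '
}

# Usage on the control-plane node:
#   ss -tln | port_is_listening 6443 || echo "nothing is listening on 6443"
```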

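Questions:

  1. How do I start kube-apiserver?
  2. Why did kube-apiserver suddenly disappear? It was running fine the last time I used this lab.

Side note: the kubelet service is running, but it is logging errors: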

Jan 17 20:40:43 master kubelet[613]: E0117 20:40:43.451465     613 kubelet.go:2422] "Error getting node" err="node \"master\" not found"
Jan 17 20:40:43 master kubelet[613]: E0117 20:40:43.552175     613 kubelet.go:2422] "Error getting node" err="node \"master\" not found"

[UPDATE]

In my lab, kube-apiserver appears to run as a Docker container:

root@master:~# docker ps
CONTAINER ID   IMAGE                  COMMAND                  CREATED          STATUS          PORTS     NAMES
0adc676e6077   595f327f224a           "kube-scheduler --au…"   10 minutes ago   Up 10 minutes             k8s_kube-scheduler_kube-scheduler-master_kube-system_172567758300a3c99b36d0d4efd9321a_7
8e5200bd131f   25f8c7f3da61           "etcd --advertise-cl…"   10 minutes ago   Up 10 minutes             k8s_etcd_etcd-master_kube-system_dd8c7aeab8fa24c31e933da39dcafd96_7
7ad4134212ce   df7b72818ad2           "kube-controller-man…"   10 minutes ago   Up 10 minutes             k8s_kube-controller-manager_kube-controller-manager-master_kube-system_b38aa758e725e1c490cee42d42ec8bff_7
6baca0741b74   k8s.gcr.io/pause:3.6   "/pause"                 10 minutes ago   Up 10 minutes             k8s_POD_etcd-master_kube-system_dd8c7aeab8fa24c31e933da39dcafd96_7
9a5572813b8a   k8s.gcr.io/pause:3.6   "/pause"                 10 minutes ago   Up 10 minutes             k8s_POD_kube-controller-manager-master_kube-system_b38aa758e725e1c490cee42d42ec8bff_7
daca7dae837c   k8s.gcr.io/pause:3.6   "/pause"                 10 minutes ago   Up 10 minutes             k8s_POD_kube-scheduler-master_kube-system_172567758300a3c99b36d0d4efd9321a_7
10871d3a6e66   k8s.gcr.io/pause:3.6   "/pause"                 10 minutes ago   Up 10 minutes             k8s_POD_kube-apiserver-master_kube-system_01ecc291decf2feb7dae990cc7eb8cb6_7

The COMMAND column shows "/pause", so I tried to unpause the container, without success. :(

root@master:~# docker unpause 10871d3a6e66
Error response from daemon: Container 10871d3a6e66c897af6cfa33ea4c45668045ab15bb29beab938556937589e3ad is not paused

The command docker logs 10871d3a6e66 reports:

Shutting down, got signal: Terminated

[UPDATE] I can actually see kube-apiserver trying to start, but it seems to crash on every attempt. See the first container in the list.

$ sudo docker ps -a
CONTAINER ID   IMAGE                  COMMAND                  CREATED          STATUS                      PORTS     NAMES
bed0b4e9170c   8fa62c12256d           "kube-apiserver --ad…"   39 seconds ago   Exited (1) 14 seconds ago             k8s_kube-apiserver_kube-apiserver-master_kube-system_01ecc291decf2feb7dae990cc7eb8cb6_9
fda323106b5f   25f8c7f3da61           "etcd --advertise-cl…"   7 minutes ago    Up 7 minutes                          k8s_etcd_etcd-master_kube-system_dd8c7aeab8fa24c31e933da39dcafd96_4
a05557b1d66b   k8s.gcr.io/pause:3.6   "/pause"                 7 minutes ago    Up 7 minutes                          k8s_POD_etcd-master_kube-system_dd8c7aeab8fa24c31e933da39dcafd96_4
0befad0f3003   df7b72818ad2           "kube-controller-man…"   8 minutes ago    Up 8 minutes                          k8s_kube-controller-manager_kube-controller-manager-master_kube-system_b38aa758e725e1c490cee42d42ec8bff_4
8f3b2a3756a9   595f327f224a           "kube-scheduler --au…"   8 minutes ago    Up 8 minutes                          k8s_kube-scheduler_kube-scheduler-master_kube-system_172567758300a3c99b36d0d4efd9321a_4
3aeb39d118d3   k8s.gcr.io/pause:3.6   "/pause"                 8 minutes ago    Up 8 minutes                          k8s_POD_kube-controller-manager-master_kube-system_b38aa758e725e1c490cee42d42ec8bff_4
8b0c53496168   k8s.gcr.io/pause:3.6   "/pause"                 8 minutes ago    Up 8 minutes                          k8s_POD_kube-scheduler-master_kube-system_172567758300a3c99b36d0d4efd9321a_4
bd5f13dee325   k8s.gcr.io/pause:3.6   "/pause"                 8 minutes ago    Up 8 minutes                          k8s_POD_kube-apiserver-master_kube-system_01ecc291decf2feb7dae990cc7eb8cb6_4
6c081ae55883   fd1608dbbc19           "start_runit"            9 months ago     Exited (0) 9 months ago               k8s_calico-node_calico-node-q5cpc_kube-system_27708358-127b-4078-9b42-e4953190ed80_3
d5acca3521c4   a4ca41631cc7           "/coredns -conf /etc…"   9 months ago     Exited (0) 9 months ago               k8s_coredns_coredns-64897985d-rmgtq_kube-system_2e772c34-7c77-4862-bab0-93be9f371095_3
4b3b0dca6679   a1a88662416b           "/usr/bin/kube-contr…"   9 months ago     Exited (2) 9 months ago               k8s_calico-kube-controllers_calico-kube-controllers-7c845d499-95nlh_kube-system_bbb4637d-e143-46e5-97df-4ebc36c455b3_3
428d125b24d9   a4ca41631cc7           "/coredns -conf /etc…"   9 months ago     Exited (0) 9 months ago               k8s_coredns_coredns-64897985d-dwrpn_kube-system_5492f4f3-cc4d-4461-b57f-bb03f8fbc6a2_3
78a0ecbd465b   d6660bf471e1           "/usr/local/bin/flex…"   9 months ago     Exited (0) 9 months ago               k8s_flexvol-driver_calico-node-q5cpc_kube-system_27708358-127b-4078-9b42-e4953190ed80_0
03a6837fc5bd   k8s.gcr.io/pause:3.6   "/pause"                 9 months ago     Exited (0) 9 months ago               k8s_POD_calico-kube-controllers-7c845d499-95nlh_kube-system_bbb4637d-e143-46e5-97df-4ebc36c455b3_12
da8ae7c12d4f   k8s.gcr.io/pause:3.6   "/pause"                 9 months ago     Exited (0) 9 months ago               k8s_POD_coredns-64897985d-rmgtq_kube-system_2e772c34-7c77-4862-bab0-93be9f371095_12
c6afacd9d088   k8s.gcr.io/pause:3.6   "/pause"                 9 months ago     Exited (0) 9 months ago               k8s_POD_coredns-64897985d-dwrpn_kube-system_5492f4f3-cc4d-4461-b57f-bb03f8fbc6a2_12
cbdcedf7183a   be7dfc21ba2e           "/opt/cni/bin/install"   9 months ago     Exited (0) 9 months ago               k8s_install-cni_calico-node-q5cpc_kube-system_27708358-127b-4078-9b42-e4953190ed80_0
0d1c5aeee222   be7dfc21ba2e           "/opt/cni/bin/calico…"   9 months ago     Exited (0) 9 months ago               k8s_upgrade-ipam_calico-node-q5cpc_kube-system_27708358-127b-4078-9b42-e4953190ed80_1
d5c211817c38   4c0375452406           "/usr/local/bin/kube…"   9 months ago     Exited (2) 9 months ago               k8s_kube-proxy_kube-proxy-dmhv4_kube-system_ae5b3e51-be54-41d0-8aeb-4049756a0e0a_3
27534d78c56f   k8s.gcr.io/pause:3.6   "/pause"                 9 months ago     Exited (0) 9 months ago               k8s_POD_calico-node-q5cpc_kube-system_27708358-127b-4078-9b42-e4953190ed80_3
95876ad85c66   k8s.gcr.io/pause:3.6   "/pause"                 9 months ago     Exited (0) 9 months ago               k8s_POD_kube-proxy-dmhv4_kube-system_ae5b3e51-be54-41d0-8aeb-4049756a0e0a_3
ba442d9c0cb0   595f327f224a           "kube-scheduler --au…"   9 months ago     Exited (0) 9 months ago               k8s_kube-scheduler_kube-scheduler-master_kube-system_172567758300a3c99b36d0d4efd9321a_3
14cc628ce7ee   df7b72818ad2           "kube-controller-man…"   9 months ago     Exited (2) 9 months ago               k8s_kube-controller-manager_kube-controller-manager-master_kube-system_b38aa758e725e1c490cee42d42ec8bff_3
7aca9140ee03   25f8c7f3da61           "etcd --advertise-cl…"   9 months ago     Exited (0) 9 months ago               k8s_etcd_etcd-master_kube-system_dd8c7aeab8fa24c31e933da39dcafd96_3
f220f955a404   k8s.gcr.io/pause:3.6   "/pause"                 9 months ago     Exited (0) 9 months ago               k8s_POD_kube-scheduler-master_kube-system_172567758300a3c99b36d0d4efd9321a_3
c16d35dba1fb   k8s.gcr.io/pause:3.6   "/pause"                 9 months ago     Exited (0) 9 months ago               k8s_POD_etcd-master_kube-system_dd8c7aeab8fa24c31e933da39dcafd96_3
80e581f9529c   k8s.gcr.io/pause:3.6   "/pause"                 9 months ago     Exited (0) 9 months ago               k8s_POD_kube-controller-manager-master_kube-system_b38aa758e725e1c490cee42d42ec8bff_3
8cc54605041f   k8s.gcr.io/pause:3.6   "/pause"                 21 months ago    Exited (0) 21 months ago              k8s_POD_coredns-64897985d-dwrpn_kube-system_5492f4f3-cc4d-4461-b57f-bb03f8fbc6a2_11
666788dcedf5   k8s.gcr.io/pause:3.6   "/pause"                 21 months ago    Exited (0) 21 months ago              k8s_POD_calico-kube-controllers-7c845d499-95nlh_kube-system_bbb4637d-e143-46e5-97df-4ebc36c455b3_11
84046839fc1b   k8s.gcr.io/pause:3.6   "/pause"                 21 months ago    Exited (0) 21 months ago              k8s_POD_coredns-64897985d-rmgtq_kube-system_2e772c34-7c77-4862-bab0-93be9f371095_11

I also tried running it manually like this, but that does not work:

$ sudo docker run -it 8fa62c12256d
2024/01/22 12:57:56 not enough arguments to run
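
Running the bare image fails, presumably because it expects the full kube-apiserver command line, which on a kubeadm cluster comes from the static pod manifest under /etc/kubernetes/manifests. Reading the logs of the container that just exited is likely to be more informative. A minimal sketch, assuming Docker is the container runtime as in the listings above; newest_apiserver_id is a hypothetical helper name, not an existing tool:

```shell
#!/bin/sh
# Print the ID of the first kube-apiserver container in `docker ps -a`
# output read from stdin. `docker ps -a` lists the newest container first,
# so this is the most recent attempt. The pause sandbox is skipped because
# its name starts with k8s_POD_, not k8s_kube-apiserver_.
newest_apiserver_id() {
    awk '$NF ~ /^k8s_kube-apiserver_/ { print $1; exit }'
}

# Usage on the control-plane node (needs a running Docker daemon):
#   id=$(sudo docker ps -a | newest_apiserver_id)
#   sudo docker logs --tail 50 "$id"
```

The last lines of that log normally state why the process exited with status 1.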

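[WORKAROUND]

I could not find a solution to my problem. Since this is only my VirtualBox lab system (and I do not mind losing data/projects), I re-initialized the cluster.

  1. Reset the master node: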
sudo su -
kubeadm reset -f
rm -rf /etc/cni /etc/kubernetes /var/lib/dockershim /var/lib/etcd /var/lib/kubelet /var/run/kubernetes ~/.kube/*

iptables -F && iptables -X
iptables -t nat -F && iptables -t nat -X
iptables -t raw -F && iptables -t raw -X
iptables -t mangle -F && iptables -t mangle -X

systemctl restart docker

exit
rm -rf ~/.kube/*
  2. Run kubeadm init
sudo kubeadm init --ignore-preflight-errors=NumCPU --control-plane-endpoint master:6443 --pod-network-cidr 10.10.0.0/16
  3. Add the credentials ($HOME/.kube/config)
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
  4. Apply the CNI (Calico in my case)
kubectl apply -f calico.yaml

kubectl get nodes
NAME     STATUS   ROLES                  AGE   VERSION
master   Ready    control-plane,master   29m   v1.23.4

Nodes/workers:

  1. Make sure /etc/hosts has an entry for master
  2. Rejoin the node:
sudo kubeadm reset
sudo kubeadm join master:6443 --token aaaaaa.aaaaaaaaaaaaaaaa \
        --discovery-token-ca-cert-hash sha256:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
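
The /etc/hosts check from step 1 can be done before running kubeadm join, so that a missing entry does not show up as a join failure. A small sketch; hosts_has_entry is a hypothetical helper name, and the hostname master is the one used throughout this lab:

```shell
#!/bin/sh
# Succeeds if the hosts file maps some address to the given hostname.
# $1 = hostname, $2 = hosts file (defaults to /etc/hosts).
# Hypothetical helper; comments in the file are ignored.
hosts_has_entry() {
    awk -v h="$1" '
        {
            sub(/#.*/, "")            # drop comments, fields are re-split
            for (i = 2; i <= NF; i++) # field 1 is the address, the rest are names
                if ($i == h) found = 1
        }
        END { exit !found }
    ' "${2:-/etc/hosts}"
}

# Usage on each worker:
#   hosts_has_entry master || echo "add a line for master to /etc/hosts first"
```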
