System details:
- Kubernetes version: 1.20.1
- Host OS: Ubuntu 20.04.4 LTS
I started up my Kubernetes lab (on VirtualBox) today, and every kubectl
command ends like this:
$ kubectl get nodes -o wide
The connection to the server master:6443 was refused - did you specify the right host or port?
$ kubectl get po --all-namespaces
The connection to the server master:6443 was refused - did you specify the right host or port?
Indeed, nothing is listening on the socket master:6443:
$ ss -tl4np
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 4096 127.0.0.1:34627 0.0.0.0:*
LISTEN 0 4096 127.0.0.1:10248 0.0.0.0:*
LISTEN 0 4096 192.168.1.190:2379 0.0.0.0:*
LISTEN 0 4096 127.0.0.1:2379 0.0.0.0:*
LISTEN 0 4096 192.168.1.190:2380 0.0.0.0:*
LISTEN 0 4096 127.0.0.1:2381 0.0.0.0:*
LISTEN 0 4096 127.0.0.1:33133 0.0.0.0:*
LISTEN 0 4096 127.0.0.1:10257 0.0.0.0:*
LISTEN 0 4096 127.0.0.1:10259 0.0.0.0:*
LISTEN 0 4096 127.0.0.53%lo:53 0.0.0.0:*
LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
ccd@master:~$ curl https://master:6443
curl: (7) Failed to connect to master port 6443: Connection refused
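The same check can be scripted instead of read off the listing by eye. This is a minimal sketch, assuming the column layout of the ss output shown above (State first, Local Address:Port fourth); the function name port_listening is made up for illustration.

```shell
# Return success if stdin (output of `ss -tln`) shows a listener on the given port.
# Assumes the layout shown above: State in column 1, Local Address:Port in column 4.
port_listening() {
  port=$1
  awk -v p="$port" '
    $1 == "LISTEN" {
      n = split($4, a, ":")        # the port is whatever follows the last colon
      if (a[n] == p) found = 1
    }
    END { exit !found }            # exit 0 if found, 1 otherwise
  '
}
```

Usage, for example: ss -tln | port_listening 6443 || echo "nothing is listening on 6443"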
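Questions:
- How do I start kube-apiserver?
- Why did kube-apiserver suddenly disappear? It was running fine the last time I used this lab.
Side note: the kubelet service is running, but it is logging errors.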
Jan 17 20:40:43 master kubelet[613]: E0117 20:40:43.451465 613 kubelet.go:2422] "Error getting node" err="node \"master\" not found"
Jan 17 20:40:43 master kubelet[613]: E0117 20:40:43.552175 613 kubelet.go:2422] "Error getting node" err="node \"master\" not found"
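The "node not found" message repeats many times per second and is most likely a consequence of the API server being down rather than its cause, so it can bury more useful kubelet messages. The sketch below simply filters it out; the function name kubelet_other_errors is hypothetical, and the usage line assumes the kubelet runs as a systemd unit called kubelet.

```shell
# Drop the repetitive "Error getting node" lines so other kubelet messages stand out.
# Reads log lines on stdin and writes the remaining lines to stdout.
kubelet_other_errors() {
  grep -v 'Error getting node'
}
```

Usage, for example: journalctl -u kubelet -b --no-pager | kubelet_other_errors | tail -n 20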
[UPDATE]
In my lab, kube-apiserver appears to run as a Docker container:
root@master:~# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0adc676e6077 595f327f224a "kube-scheduler --au…" 10 minutes ago Up 10 minutes k8s_kube-scheduler_kube-scheduler-master_kube-system_172567758300a3c99b36d0d4efd9321a_7
8e5200bd131f 25f8c7f3da61 "etcd --advertise-cl…" 10 minutes ago Up 10 minutes k8s_etcd_etcd-master_kube-system_dd8c7aeab8fa24c31e933da39dcafd96_7
7ad4134212ce df7b72818ad2 "kube-controller-man…" 10 minutes ago Up 10 minutes k8s_kube-controller-manager_kube-controller-manager-master_kube-system_b38aa758e725e1c490cee42d42ec8bff_7
6baca0741b74 k8s.gcr.io/pause:3.6 "/pause" 10 minutes ago Up 10 minutes k8s_POD_etcd-master_kube-system_dd8c7aeab8fa24c31e933da39dcafd96_7
9a5572813b8a k8s.gcr.io/pause:3.6 "/pause" 10 minutes ago Up 10 minutes k8s_POD_kube-controller-manager-master_kube-system_b38aa758e725e1c490cee42d42ec8bff_7
daca7dae837c k8s.gcr.io/pause:3.6 "/pause" 10 minutes ago Up 10 minutes k8s_POD_kube-scheduler-master_kube-system_172567758300a3c99b36d0d4efd9321a_7
10871d3a6e66 k8s.gcr.io/pause:3.6 "/pause" 10 minutes ago Up 10 minutes k8s_POD_kube-apiserver-master_kube-system_01ecc291decf2feb7dae990cc7eb8cb6_7
The COMMAND column shows "/pause". I tried to unpause the container, but without success. :(
root@master:~# docker unpause 10871d3a6e66
Error response from daemon: Container 10871d3a6e66c897af6cfa33ea4c45668045ab15bb29beab938556937589e3ad is not paused
The docker logs 10871d3a6e66
command reports:
Shutting down, got signal: Terminated
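A note on "/pause": it is the name of the binary that runs in each Pod's sandbox (infrastructure) container, not a sign that Docker has paused anything, which is why docker unpause has nothing to do. What the listing does show is a sandbox for kube-apiserver-master with no application container next to it. The sketch below finds such Pods automatically; it assumes the k8s_<container>_<pod>_<namespace>_<uid>_<attempt> naming visible in the listing above, and the function name is made up.

```shell
# Print Pods that have a running sandbox (k8s_POD_...) but no running app container.
# Expects container names on stdin, e.g. from: docker ps --format '{{.Names}}'
pods_without_app_container() {
  awk -F_ '
    $1 == "k8s" && $2 == "POD" { sandbox[$3] = 1; next }   # sandbox container
    $1 == "k8s"                { app[$3] = 1 }             # application container
    END { for (p in sandbox) if (!(p in app)) print p }
  '
}
```

Usage, for example: docker ps --format '{{.Names}}' | pods_without_app_container
For the docker ps listing above this prints kube-apiserver-master.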
[UPDATE] I can actually see kube-apiserver trying to start, but it seems to crash on every attempt. See the first container in the list.
$ sudo docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
bed0b4e9170c 8fa62c12256d "kube-apiserver --ad…" 39 seconds ago Exited (1) 14 seconds ago k8s_kube-apiserver_kube-apiserver-master_kube-system_01ecc291decf2feb7dae990cc7eb8cb6_9
fda323106b5f 25f8c7f3da61 "etcd --advertise-cl…" 7 minutes ago Up 7 minutes k8s_etcd_etcd-master_kube-system_dd8c7aeab8fa24c31e933da39dcafd96_4
a05557b1d66b k8s.gcr.io/pause:3.6 "/pause" 7 minutes ago Up 7 minutes k8s_POD_etcd-master_kube-system_dd8c7aeab8fa24c31e933da39dcafd96_4
0befad0f3003 df7b72818ad2 "kube-controller-man…" 8 minutes ago Up 8 minutes k8s_kube-controller-manager_kube-controller-manager-master_kube-system_b38aa758e725e1c490cee42d42ec8bff_4
8f3b2a3756a9 595f327f224a "kube-scheduler --au…" 8 minutes ago Up 8 minutes k8s_kube-scheduler_kube-scheduler-master_kube-system_172567758300a3c99b36d0d4efd9321a_4
3aeb39d118d3 k8s.gcr.io/pause:3.6 "/pause" 8 minutes ago Up 8 minutes k8s_POD_kube-controller-manager-master_kube-system_b38aa758e725e1c490cee42d42ec8bff_4
8b0c53496168 k8s.gcr.io/pause:3.6 "/pause" 8 minutes ago Up 8 minutes k8s_POD_kube-scheduler-master_kube-system_172567758300a3c99b36d0d4efd9321a_4
bd5f13dee325 k8s.gcr.io/pause:3.6 "/pause" 8 minutes ago Up 8 minutes k8s_POD_kube-apiserver-master_kube-system_01ecc291decf2feb7dae990cc7eb8cb6_4
6c081ae55883 fd1608dbbc19 "start_runit" 9 months ago Exited (0) 9 months ago k8s_calico-node_calico-node-q5cpc_kube-system_27708358-127b-4078-9b42-e4953190ed80_3
d5acca3521c4 a4ca41631cc7 "/coredns -conf /etc…" 9 months ago Exited (0) 9 months ago k8s_coredns_coredns-64897985d-rmgtq_kube-system_2e772c34-7c77-4862-bab0-93be9f371095_3
4b3b0dca6679 a1a88662416b "/usr/bin/kube-contr…" 9 months ago Exited (2) 9 months ago k8s_calico-kube-controllers_calico-kube-controllers-7c845d499-95nlh_kube-system_bbb4637d-e143-46e5-97df-4ebc36c455b3_3
428d125b24d9 a4ca41631cc7 "/coredns -conf /etc…" 9 months ago Exited (0) 9 months ago k8s_coredns_coredns-64897985d-dwrpn_kube-system_5492f4f3-cc4d-4461-b57f-bb03f8fbc6a2_3
78a0ecbd465b d6660bf471e1 "/usr/local/bin/flex…" 9 months ago Exited (0) 9 months ago k8s_flexvol-driver_calico-node-q5cpc_kube-system_27708358-127b-4078-9b42-e4953190ed80_0
03a6837fc5bd k8s.gcr.io/pause:3.6 "/pause" 9 months ago Exited (0) 9 months ago k8s_POD_calico-kube-controllers-7c845d499-95nlh_kube-system_bbb4637d-e143-46e5-97df-4ebc36c455b3_12
da8ae7c12d4f k8s.gcr.io/pause:3.6 "/pause" 9 months ago Exited (0) 9 months ago k8s_POD_coredns-64897985d-rmgtq_kube-system_2e772c34-7c77-4862-bab0-93be9f371095_12
c6afacd9d088 k8s.gcr.io/pause:3.6 "/pause" 9 months ago Exited (0) 9 months ago k8s_POD_coredns-64897985d-dwrpn_kube-system_5492f4f3-cc4d-4461-b57f-bb03f8fbc6a2_12
cbdcedf7183a be7dfc21ba2e "/opt/cni/bin/install" 9 months ago Exited (0) 9 months ago k8s_install-cni_calico-node-q5cpc_kube-system_27708358-127b-4078-9b42-e4953190ed80_0
0d1c5aeee222 be7dfc21ba2e "/opt/cni/bin/calico…" 9 months ago Exited (0) 9 months ago k8s_upgrade-ipam_calico-node-q5cpc_kube-system_27708358-127b-4078-9b42-e4953190ed80_1
d5c211817c38 4c0375452406 "/usr/local/bin/kube…" 9 months ago Exited (2) 9 months ago k8s_kube-proxy_kube-proxy-dmhv4_kube-system_ae5b3e51-be54-41d0-8aeb-4049756a0e0a_3
27534d78c56f k8s.gcr.io/pause:3.6 "/pause" 9 months ago Exited (0) 9 months ago k8s_POD_calico-node-q5cpc_kube-system_27708358-127b-4078-9b42-e4953190ed80_3
95876ad85c66 k8s.gcr.io/pause:3.6 "/pause" 9 months ago Exited (0) 9 months ago k8s_POD_kube-proxy-dmhv4_kube-system_ae5b3e51-be54-41d0-8aeb-4049756a0e0a_3
ba442d9c0cb0 595f327f224a "kube-scheduler --au…" 9 months ago Exited (0) 9 months ago k8s_kube-scheduler_kube-scheduler-master_kube-system_172567758300a3c99b36d0d4efd9321a_3
14cc628ce7ee df7b72818ad2 "kube-controller-man…" 9 months ago Exited (2) 9 months ago k8s_kube-controller-manager_kube-controller-manager-master_kube-system_b38aa758e725e1c490cee42d42ec8bff_3
7aca9140ee03 25f8c7f3da61 "etcd --advertise-cl…" 9 months ago Exited (0) 9 months ago k8s_etcd_etcd-master_kube-system_dd8c7aeab8fa24c31e933da39dcafd96_3
f220f955a404 k8s.gcr.io/pause:3.6 "/pause" 9 months ago Exited (0) 9 months ago k8s_POD_kube-scheduler-master_kube-system_172567758300a3c99b36d0d4efd9321a_3
c16d35dba1fb k8s.gcr.io/pause:3.6 "/pause" 9 months ago Exited (0) 9 months ago k8s_POD_etcd-master_kube-system_dd8c7aeab8fa24c31e933da39dcafd96_3
80e581f9529c k8s.gcr.io/pause:3.6 "/pause" 9 months ago Exited (0) 9 months ago k8s_POD_kube-controller-manager-master_kube-system_b38aa758e725e1c490cee42d42ec8bff_3
8cc54605041f k8s.gcr.io/pause:3.6 "/pause" 21 months ago Exited (0) 21 months ago k8s_POD_coredns-64897985d-dwrpn_kube-system_5492f4f3-cc4d-4461-b57f-bb03f8fbc6a2_11
666788dcedf5 k8s.gcr.io/pause:3.6 "/pause" 21 months ago Exited (0) 21 months ago k8s_POD_calico-kube-controllers-7c845d499-95nlh_kube-system_bbb4637d-e143-46e5-97df-4ebc36c455b3_11
84046839fc1b k8s.gcr.io/pause:3.6 "/pause" 21 months ago Exited (0) 21 months ago k8s_POD_coredns-64897985d-rmgtq_kube-system_2e772c34-7c77-4862-bab0-93be9f371095_11
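The reason for the crash should be in the logs of the exited kube-apiserver container (the first row above), not in the logs of its pause container. Because the kubelet keeps recreating the container, its ID changes on every attempt; the sketch below picks the newest one by name. It assumes docker ps -a lists newest containers first, as it does above, and the function name latest_apiserver_id is hypothetical.

```shell
# Print the ID of the first (newest) kube-apiserver container found on stdin.
# Expects lines of "<id> <name>", e.g. from:
#   docker ps -a --format '{{.ID}} {{.Names}}'
latest_apiserver_id() {
  awk '$2 ~ /^k8s_kube-apiserver_/ { print $1; exit }'
}
```

Usage, for example: docker logs --tail 50 "$(docker ps -a --format '{{.ID}} {{.Names}}' | latest_apiserver_id)"
In a lab that has been idle for many months, certificate errors are one thing worth looking for in that output, since kubeadm-issued certificates are valid for one year by default; kubeadm certs check-expiration lists their expiry dates. This is only one possible cause, and the logs may point elsewhere.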
I also tried running it manually like this, but that did not work:
$ sudo docker run -it 8fa62c12256d
2024/01/22 12:57:56 not enough arguments to run
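Running the bare image fails presumably because it expects the full kube-apiserver command line, which the kubelet normally supplies from a static Pod manifest. The sketch below prints that command line so the flags can be reviewed; it assumes the kubeadm default location /etc/kubernetes/manifests/kube-apiserver.yaml and the usual one-flag-per-line YAML list layout, and the function name apiserver_flags is made up.

```shell
# Print the kube-apiserver command and its flags from a static Pod manifest.
# Assumption: the default kubeadm manifest path is used unless one is passed in.
apiserver_flags() {
  manifest=${1:-/etc/kubernetes/manifests/kube-apiserver.yaml}
  # Keep only list items that are the binary name or start with "--",
  # then strip the leading YAML list marker.
  grep -E '^[[:space:]]*- (kube-apiserver$|--)' "$manifest" |
    sed -E 's/^[[:space:]]*- //'
}
```

Usage, for example: sudo sh -c '. ./apiserver_flags.sh; apiserver_flags' (the file name is hypothetical). Starting the container by hand is rarely needed: as long as the manifest is in place, the kubelet retries on its own.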
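[WORKAROUND]
I could not find a fix for my problem. Since this is only my VirtualBox lab system (and I do not mind losing data/projects), I re-initialized the cluster.
- Reset the master node: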
sudo su -
kubeadm reset -f
rm -rf /etc/cni /etc/kubernetes /var/lib/dockershim /var/lib/etcd /var/lib/kubelet /var/run/kubernetes ~/.kube/*
iptables -F && iptables -X
iptables -t nat -F && iptables -t nat -X
iptables -t raw -F && iptables -t raw -X
iptables -t mangle -F && iptables -t mangle -X
systemctl restart docker
exit
rm -rf ~/.kube/*
- Run kubeadm init
sudo kubeadm init --ignore-preflight-errors=NumCPU --control-plane-endpoint master:6443 --pod-network-cidr 10.10.0.0/16
- Add the credentials ($HOME/.kube/config)
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
- Apply the CNI (calico in my case)
kubectl apply -f calico.yaml
kubectl get nodes
NAME STATUS ROLES AGE VERSION
master Ready control-plane,master 29m v1.23.4
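After applying the CNI it can take a little while for the node to report Ready. A small sketch for checking this in a script rather than by eye; it assumes the column layout of kubectl get nodes shown above (NAME first, STATUS second), and the function name not_ready_nodes is hypothetical.

```shell
# Print the names of nodes whose STATUS column is not exactly "Ready".
# Expects the output of `kubectl get nodes --no-headers` on stdin.
# Note: a cordoned node ("Ready,SchedulingDisabled") is also reported.
not_ready_nodes() {
  awk '$2 != "Ready" { print $1 }'
}
```

Usage, for example: kubectl get nodes --no-headers | not_ready_nodes
Alternatively, kubectl wait --for=condition=Ready node/master --timeout=120s blocks until the node is ready or the timeout expires.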
Nodes/workers:
- Make sure /etc/hosts has an entry for master
- Rejoin the node:
sudo kubeadm reset
sudo kubeadm join master:6443 --token aaaaaa.aaaaaaaaaaaaaaaa \
--discovery-token-ca-cert-hash sha256:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
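The token and hash above are placeholders. Bootstrap tokens expire after 24 hours by default, so if the output of kubeadm init is no longer at hand, running kubeadm token create --print-join-command on the master prints a fresh, complete join command. The sketch below is just a sanity check for a pasted token before attempting the join; the function name is_bootstrap_token is made up.

```shell
# Return success if the argument looks like a kubeadm bootstrap token:
# six characters, a dot, then sixteen characters, all from [a-z0-9].
is_bootstrap_token() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'
}
```

Usage, for example: is_bootstrap_token "$TOKEN" || echo "token looks malformed" (TOKEN being whatever variable holds the pasted value). The check covers the format only; it cannot tell whether the token is still valid on the cluster.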