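I'm just getting started with k8s, so please bear with me...
I'm trying to bring up a small k8s cluster with Singularity as the container runtime. I'm following this procedure.
The problem is that the coredns pods fail to start, for the following reason: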
Jan 27 07:18:15 cent8ws sycri[1302]: #011Error: rpc error: code = Internal desc = could not set up pod network interface: error getting ClusterInformation: Get "https://[10.96.0.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default": x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
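I googled the error and found that it has been reported many times, but none of the suggested solutions worked for me. I'm not sure whether it is related to my use of Singularity... Please see below for some details about my system and the state of the k8s cluster.
Any help would be greatly appreciated.
Many thanks,
Oren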
[root@cent8ws ~]# cat /etc/redhat-release
CentOS Linux release 8.3.2011
[root@cent8ws ~]# rpm -qa | grep -e kub -e sing
webrtc-audio-processing-0.3-9.el8.x86_64
kubelet-1.20.2-0.x86_64
kubectl-1.20.2-0.x86_64
kubernetes-cni-0.8.7-0.x86_64
singularity-3.7.0-1.el8.x86_64
kubeadm-1.20.2-0.x86_64
kubeadm init --pod-network-cidr=10.0.1.0/24 --cri-socket unix:///var/run/singularity.sock --ignore-preflight-errors=All --upload-certs --node-name=$HOSTNAME
Jan 27 07:18:15 cent8ws sycri[1302]: #011Error: rpc error: code = Internal desc = could not set up pod network interface: error getting ClusterInformation: Get "https://[10.96.0.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default": x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
[root@cent8ws ~]# kubectl get pods --namespace=kube-system
NAME                                          READY   STATUS              RESTARTS   AGE
coredns-74ff55c5b-dldl2                       0/1     ContainerCreating   0          5m50s
coredns-74ff55c5b-tbhkw                       0/1     ContainerCreating   0          5m50s
etcd-cent8ws.localdomain                      1/1     Running             0          5m52s
kube-apiserver-cent8ws.localdomain            1/1     Running             0          5m51s
kube-controller-manager-cent8ws.localdomain   1/1     Running             0          5m51s
kube-proxy-wb62q                              1/1     Running             0          5m50s
kube-scheduler-cent8ws.localdomain            1/1     Running             0          5m52s
[root@cent8ws ~]# kubectl describe pods coredns-74ff55c5b-tbhkw --namespace=kube-system
Name:                 coredns-74ff55c5b-tbhkw
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 cent8ws.localdomain/192.168.122.1
Start Time:           Wed, 27 Jan 2021 07:17:47 +0200
Labels:               k8s-app=kube-dns
                      pod-template-hash=74ff55c5b
Annotations:          <none>
Status:               Pending
IP:
IPs:                  <none>
Controlled By:        ReplicaSet/coredns-74ff55c5b
Containers:
  coredns:
    Container ID:
    Image:          k8s.gcr.io/coredns:1.7.0
    Image ID:
    Ports:          53/UDP, 53/TCP, 9153/TCP
    Host Ports:     0/UDP, 0/TCP, 0/TCP
    Args:
      -conf
      /etc/coredns/Corefile
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  170Mi
    Requests:
      cpu:     100m
      memory:  70Mi
    Liveness:     http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
    Readiness:    http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /etc/coredns from config-volume (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from coredns-token-29225 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      coredns
    Optional:  false
  coredns-token-29225:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  coredns-token-29225
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     CriticalAddonsOnly op=Exists
                 node-role.kubernetes.io/control-plane:NoSchedule
                 node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason                  Age                   From               Message
  ----     ------                  ----                  ----               -------
  Normal   Scheduled               6m46s                 default-scheduler  Successfully assigned kube-system/coredns-74ff55c5b-tbhkw to cent8ws.localdomain
  Warning  FailedCreatePodSandBox  70s (x26 over 6m45s)  kubelet            Failed to create pod sandbox: rpc error: code = Internal desc = could not set up pod network interface: error getting ClusterInformation: Get "https://[10.96.0.1]:443/apis/crd.projectcalico.org/v1/clusterinformations/default": x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
Answer 1
To fix this issue, I ran the following commands on the CentOS machine.
Pull the latest images:
sudo kubeadm config images pull
Delete the Calico IP links:
ip link list | grep cali | awk '{print $2}' | cut -c 1-15 | xargs -I {} ip link delete {}
Run these steps on both the master and worker nodes.
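Before deleting anything, it can help to see which interfaces the pipeline above would match. A minimal sketch, assuming Calico's veth interfaces are the ones whose names start with `cali`; the function name `list_cali_links` is made up for illustration and is not part of any tool:

```shell
# Hypothetical helper: reads `ip link list` output on stdin and prints the
# names of interfaces that start with "cali", one per line.
# The "@ifN:" or ":" suffix that `ip` appends to the name is stripped.
list_cali_links() {
    awk '$2 ~ /^cali/ { sub(/[@:].*/, "", $2); print $2 }'
}
```

To preview on a node, run `ip link list | list_cali_links`. Note that this matches only names that begin with `cali`, which is slightly stricter than the plain `grep cali` in the original command.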
Move or delete the Calico files in this location:
/etc/cni/net.d
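If you would rather move the files than delete them, something like the following may do. It is only a sketch: the helper name `backup_calico_cni` and the backup directory are hypothetical, and it assumes the Calico files in that directory have `calico` in their names (for example `10-calico.conflist` and `calico-kubeconfig` on a typical install).

```shell
# Hypothetical helper: move every file whose name contains "calico"
# from directory $1 into directory $2 (created if it does not exist).
# Files without "calico" in the name are left untouched.
backup_calico_cni() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    for f in "$src"/*calico*; do
        [ -e "$f" ] || continue   # the glob matched nothing
        mv -- "$f" "$dest"/ && echo "moved: $f"
    done
}
```

For example, as root: `backup_calico_cni /etc/cni/net.d /root/calico-cni-backup` (the destination path is arbitrary). Keeping a copy makes it easy to compare the old files with whatever Calico writes after it is redeployed.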
Restart kubelet:
sudo systemctl restart kubelet.service
Try deleting the pods that are not running:
kubectl delete pods coredns-558bd4d5db-t6d7r -n kube-system
Then check the pod status:
kubectl get pods -A
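To pick out the pods that still need attention from the `kubectl get pods -A` listing, a small filter may be handy. This is a sketch that assumes the column layout printed by `kubectl get pods -A --no-headers` (NAMESPACE, NAME, READY, STATUS, ...); `not_running_pods` is a hypothetical helper name:

```shell
# Hypothetical helper: reads `kubectl get pods -A --no-headers` output on
# stdin and prints "namespace/name status" for every pod whose STATUS
# column is neither Running nor Completed.
not_running_pods() {
    awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2 " " $4 }'
}
```

Usage: `kubectl get pods -A --no-headers | not_running_pods`. An empty result suggests that every pod has reached Running or Completed.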