This is my setup:
Kubernetes cluster:
root@k8s-eu-1-master:~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-eu-1-master Ready control-plane 24h v1.28.2
----> IP address (appended manually):
k8s-eu-1-worker-1 Ready <none> 24h v1.28.2 xx.xxx.xxx.xxx
k8s-eu-1-worker-2 Ready <none> 24h v1.28.2 yy.yyy.yyy.yyy
k8s-eu-1-worker-3 Ready <none> 24h v1.28.2 zz.zzz.zzz.zz
k8s-eu-1-worker-4 Ready <none> 24h v1.28.2 ww.www.www.ww
k8s-eu-1-worker-5 Ready <none> 24h v1.28.2 pp.ppp.ppp.ppp
Calico version:
root@k8s-eu-1-master:~# kubectl calico version
Client Version: v3.26.3
Git commit: bdb7878af
Unable to retrieve Cluster Version or Type: resource does not exist: ClusterInformation(default) with error: the server could not find the requested resource (get ClusterInformations.crd.projectcalico.org default)
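Since calicoctl reports that the ClusterInformation resource is missing, one quick sanity check (a diagnostic sketch, commands only) is whether the Calico CRDs are registered at all:
# List the Calico CRDs; an empty result means Calico's API resources were never installed
kubectl get crds | grep projectcalico.org
# If the CRD exists, check whether the default ClusterInformation object is present
kubectl get clusterinformations.crd.projectcalico.org default -o yaml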
These are the steps that led me to the errors "Failed to destroy network for sandbox" and "coredns pod not running":
I set up the k8s-eu-1-master node as the NFS client and the k8s-eu-1-worker nodes as NFS servers:
root@k8s-eu-1-master:~# sudo df -h | grep /srv/backups
xx.xxx.xxx.xxx:/srv/backups 391G 5.5G 366G 2% /mnt/data
yy.yyy.yyy.yyy:/srv/backups 391G 5.5G 366G 2% /mnt/data
zz.zzz.zzz.zz:/srv/backups 391G 5.5G 366G 2% /mnt/data
ww.www.www.ww:/srv/backups 391G 5.5G 366G 2% /mnt/data
pp.ppp.ppp.ppp:/srv/backups 391G 5.5G 366G 2% /mnt/data
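For context, a minimal sketch of how each export/mount is assumed to be configured (the export options and mount command below are illustrative, not taken from the original post):
# On each k8s-eu-1-worker-N node (NFS server), /etc/exports is assumed to contain a line like:
#   /srv/backups  xx.xxx.xxx.xxx(rw,sync,no_subtree_check)
sudo exportfs -ra        # reload the export table
sudo exportfs -v         # verify that /srv/backups is actually exported
# On k8s-eu-1-master (NFS client), each share would then be mounted along the lines of:
sudo mount -t nfs yy.yyy.yyy.yyy:/srv/backups /mnt/data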
I installed and deployed the provisioner, apparently without errors:
root@k8s-eu-1-master:~# helm install second-nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
> --set nfs.server=xx.xxx.xxx.xxx \
> --set nfs.server=yy.yyy.yyy.yyy \
> --set nfs.server= zz.zzz.zzz.zz\
> --set nfs.server=ww.www.www.ww\
> --set nfs.server=pp.ppp.ppp.ppp \
> --set nfs.path=/srv/backups \
> --set storageClass.name=second-nfs-client \
> --set storageClass.provisionerName=k8s-signs.io/second-nfs-subdir-external-provisioner
NAME: second-nfs-subdir-external-provisioner
LAST DEPLOYED: Thu Nov 2 16:57:58 2023
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
root@k8s-eu-1-master:~#
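Note that repeating --set nfs.server=... does not configure multiple NFS servers: each occurrence overwrites the previous one, so the release ends up with only the last value (pp.ppp.ppp.ppp, as the pod description further down confirms). The values Helm actually stored can be checked with:
# Show the user-supplied values recorded for the release;
# nfs.server appears only once, with the last value passed on the command line
helm get values second-nfs-subdir-external-provisioner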
But the second-nfs-subdir-external-provisioner pod never starts running:
root@k8s-eu-1-master:~# kubectl get pods
NAME READY STATUS RESTARTS AGE
second-nfs-subdir-external-provisioner-7c678bc889-p59sz 0/1 ContainerCreating 0 10m
Output of kubectl get events:
root@k8s-eu-1-master:~# kubectl get events
LAST SEEN TYPE REASON OBJECT MESSAGE
50m Normal Scheduled pod/second-nfs-subdir-external-provisioner-7c678bc889-p59sz Successfully assigned default/second-nfs-subdir-external-provisioner-7c678bc889-p59sz to k8s-eu-1-worker-2
14m Warning FailedMount pod/second-nfs-subdir-external-provisioner-7c678bc889-p59sz MountVolume.SetUp failed for volume "nfs-subdir-external-provisioner-root" : mount failed: exit status 32...
50m Normal SuccessfulCreate replicaset/second-nfs-subdir-external-provisioner-7c678bc889 Created pod: second-nfs-subdir-external-provisioner-7c678bc889-p59sz
50m Normal ScalingReplicaSet deployment/second-nfs-subdir-external-provisioner Scaled up replica set second-nfs-subdir-external-provisioner-7c678bc889 to 1
Output of kubectl describe pod: MountVolume.SetUp failed for volume "nfs-subdir-external-provisioner-root" : mount failed: exit status 32
root@k8s-eu-1-master:~# kubectl describe pod second-nfs-subdir-external-provisioner-7c678bc889-p59sz
Name: second-nfs-subdir-external-provisioner-7c678bc889-p59sz
Namespace: default
Priority: 0
Service Account: second-nfs-subdir-external-provisioner
Node: k8s-eu-1-worker-2/yy.yyy.yyy.yyy
Start Time: Thu, 02 Nov 2023 16:57:59 +0100
Labels: app=nfs-subdir-external-provisioner
pod-template-hash=7c678bc889
release=second-nfs-subdir-external-provisioner
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/second-nfs-subdir-external-provisioner-7c678bc889
Containers:
nfs-subdir-external-provisioner:
Container ID:
Image: registry.k8s.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2
Image ID:
Port: <none>
Host Port: <none>
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment:
PROVISIONER_NAME: k8s-signs.io/second-nfs-subdir-external-provisioner
NFS_SERVER: pp.ppp.ppp.ppp
NFS_PATH: /srv/backups
Mounts:
/persistentvolumes from nfs-subdir-external-provisioner-root (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-k7v8p (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
nfs-subdir-external-provisioner-root:
Type: NFS (an NFS mount that lasts the lifetime of a pod)
Server: pp.ppp.ppp.ppp
Path: /srv/backups
ReadOnly: false
kube-api-access-k7v8p:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 11m default-scheduler Successfully assigned default/second-nfs-subdir-external-provisioner-7c678bc889-p59sz to k8s-eu-1-worker-2
Warning FailedMount 43s (x13 over 11m) kubelet MountVolume.SetUp failed for volume "nfs-subdir-external-provisioner-root" : mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t nfs pp.ppp.ppp.ppp/srv/backups /var/lib/kubelet/pods/19d87677-f68f-4c12-9788-b4c9ce8f30f1/volumes/kubernetes.io~nfs/nfs-subdir-external-provisioner-root
Output: mount.nfs: mounting pp.ppp.ppp.ppp:/srv/backups failed, reason given by server: No such file or directory
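Since the NFS server itself answers "No such file or directory", one way to reproduce the failure outside Kubernetes is to attempt the same mount by hand from the node the pod was scheduled on (the mount point /mnt/nfs-test below is just an example):
# On k8s-eu-1-worker-2, where the pod landed:
showmount -e pp.ppp.ppp.ppp          # does worker-5 actually export /srv/backups?
sudo mkdir -p /mnt/nfs-test
sudo mount -t nfs pp.ppp.ppp.ppp:/srv/backups /mnt/nfs-test   # the same mount the kubelet attempts
ls /mnt/nfs-test && sudo umount /mnt/nfs-test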
But there is no apparent problem on either of the two nodes involved.
k8s-eu-1-worker-2, node IP address yy.yyy.yyy.yyy:
root@k8s-eu-1-master:~# kubectl describe node k8s-eu-1-worker-2
Name: k8s-eu-1-worker-2
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=k8s-eu-1-worker-2
kubernetes.io/os=linux
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: unix:///var/run/containerd/containerd.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Wed, 01 Nov 2023 16:17:32 +0100
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: k8s-eu-1-worker-2
AcquireTime: <unset>
RenewTime: Thu, 02 Nov 2023 17:19:54 +0100
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure False Thu, 02 Nov 2023 17:18:28 +0100 Thu, 02 Nov 2023 11:51:42 +0100 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Thu, 02 Nov 2023 17:18:28 +0100 Thu, 02 Nov 2023 11:51:42 +0100 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Thu, 02 Nov 2023 17:18:28 +0100 Thu, 02 Nov 2023 11:51:42 +0100 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Thu, 02 Nov 2023 17:18:28 +0100 Thu, 02 Nov 2023 11:51:42 +0100 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: yy.yyy.yyy.yyy
Hostname: k8s-eu-1-worker-2
Capacity:
cpu: 10
ephemeral-storage: 2061040144Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 61714444Ki
pods: 110
Allocatable:
cpu: 10
ephemeral-storage: 1899454593566
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 61612044Ki
pods: 110
System Info:
Machine ID: 53962e976f234a5bf6083be6653cd385
System UUID: 29cbe26c-1bd1-4544-a085-0475c3f5c3b0
Boot ID: 7713e7e5-61fa-435b-a563-517c92934cae
Kernel Version: 5.15.0-88-generic
OS Image: Ubuntu 22.04.3 LTS
Operating System: linux
Architecture: amd64
Container Runtime Version: containerd://1.6.24
Kubelet Version: v1.28.2
Kube-Proxy Version: v1.28.2
PodCIDR: 192.168.2.0/24
PodCIDRs: 192.168.2.0/24
Non-terminated Pods: (2 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default second-nfs-subdir-external-provisioner-7c678bc889-p59sz 0 (0%) 0 (0%) 0 (0%) 0 (0%) 22m
kube-system kube-proxy-n5vvt 0 (0%) 0 (0%) 0 (0%) 0 (0%) 25h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 0 (0%) 0 (0%)
memory 0 (0%) 0 (0%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
k8s-eu-1-worker-5, node IP address pp.ppp.ppp.ppp:
root@k8s-eu-1-master:~# kubectl describe node k8s-eu-1-worker-5
Name: k8s-eu-1-worker-5
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=k8s-eu-1-worker-5
kubernetes.io/os=linux
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: unix:///var/run/containerd/containerd.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Wed, 01 Nov 2023 16:19:36 +0100
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: k8s-eu-1-worker-5
AcquireTime: <unset>
RenewTime: Thu, 02 Nov 2023 17:23:55 +0100
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure False Thu, 02 Nov 2023 17:20:22 +0100 Thu, 02 Nov 2023 12:44:39 +0100 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Thu, 02 Nov 2023 17:20:22 +0100 Thu, 02 Nov 2023 12:44:39 +0100 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Thu, 02 Nov 2023 17:20:22 +0100 Thu, 02 Nov 2023 12:44:39 +0100 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Thu, 02 Nov 2023 17:20:22 +0100 Thu, 02 Nov 2023 12:44:39 +0100 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: pp.ppp.ppp.ppp
Hostname: k8s-eu-1-worker-5
Capacity:
cpu: 4
ephemeral-storage: 409659944Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 8128016Ki
pods: 110
Allocatable:
cpu: 4
ephemeral-storage: 377542603766
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 8025616Ki
pods: 110
System Info:
Machine ID: 3969e9606d136fc6909ff64d653e8c99
System UUID: dbaa5423-b883-4e92-a983-d0e6f506b6da
Boot ID: b4d67e97-d9d7-4c50-872a-b76724f0b868
Kernel Version: 5.15.0-88-generic
OS Image: Ubuntu 22.04.3 LTS
Operating System: linux
Architecture: amd64
Container Runtime Version: containerd://1.6.24
Kubelet Version: v1.28.2
Kube-Proxy Version: v1.28.2
PodCIDR: 192.168.5.0/24
PodCIDRs: 192.168.5.0/24
Non-terminated Pods: (1 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
kube-system kube-proxy-kghkc 0 (0%) 0 (0%) 0 (0%) 0 (0%) 25h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 0 (0%) 0 (0%)
memory 0 (0%) 0 (0%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
I also checked some API resources:
root@k8s-eu-1-master:~# kubectl get componentstatuses
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy ok
Tail of /var/log/syslog: failed to destroy network for sandbox
Nov 2 18:04:08 k8s-eu-1-master containerd[542]: time="2023-11-02T18:04:08.485566179+01:00" level=error msg="StopPodSandbox for \"cb37071ee41b92adebe13244b916bf8de4f489066acecc196ae74fa2b8e41ca8\" failed" error="failed to destroy network for sandbox \"cb37071ee41b92adebe13244b916bf8de4f489066acecc196ae74fa2b8e41ca8\": plugin type=\"calico\" failed (delete): error getting ClusterInformation: Get \"https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default\": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kubernetes\")"
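The x509 error suggests that the CA embedded in the CNI kubeconfig no longer matches the cluster CA. Assuming a standard kubeadm layout (file paths below are the kubeadm defaults, not taken from the original post), the two can be compared directly:
# CA currently trusted by the Calico CNI plugin
sudo grep certificate-authority-data /etc/cni/net.d/calico-kubeconfig \
  | awk '{print $2}' | base64 -d > /tmp/cni-ca.crt
# CA actually used by the API server on a kubeadm cluster
sudo diff /tmp/cni-ca.crt /etc/kubernetes/pki/ca.crt && echo "CA matches" || echo "CA differs"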
/etc/cni/net.d:
root@k8s-eu-1-master:~# ls -lah /etc/cni/net.d
total 16K
drwx------ 2 root root 4.0K Oct 30 19:00 .
drwxr-xr-x 3 root root 4.0K Oct 30 18:52 ..
-rw-r--r-- 1 root root 680 Oct 30 19:21 10-calico.conflist
-rw------- 1 root root 2.7K Oct 31 09:21 calico-kubeconfig
/opt/cni/bin/:
root@k8s-eu-1-master:~# ls -lah /opt/cni/bin/
total 247M
drwxrwxr-x 2 root root 4.0K Oct 30 19:21 .
drwxr-xr-x 3 root root 4.0K Oct 30 18:54 ..
-rwxr-xr-x 1 root root 3.9M Oct 30 19:00 bandwidth
-rwxr-xr-x 1 root root 4.1M Jan 16 2023 bridge
-rwsr-xr-x 1 root root 59M Oct 30 19:00 calico
-rwsr-xr-x 1 root root 59M Oct 30 19:00 calico-ipam
-rwxr-xr-x 1 root root 9.7M Jan 16 2023 dhcp
-rwxr-xr-x 1 root root 3.9M Jan 16 2023 dummy
-rwxr-xr-x 1 root root 4.2M Jan 16 2023 firewall
-rwxr-xr-x 1 root root 2.4M Oct 30 19:00 flannel
-rwxr-xr-x 1 root root 3.7M Jan 16 2023 host-device
-rwxr-xr-x 1 root root 3.4M Oct 30 19:00 host-local
-rwsr-xr-x 1 root root 59M Oct 30 19:00 install
-rwxr-xr-x 1 root root 3.9M Jan 16 2023 ipvlan
-rwxr-xr-x 1 root root 3.5M Oct 30 19:00 loopback
-rwxr-xr-x 1 root root 3.9M Jan 16 2023 macvlan
-rwxr-xr-x 1 root root 3.9M Oct 30 19:00 portmap
-rwxr-xr-x 1 root root 4.0M Jan 16 2023 ptp
-rwxr-xr-x 1 root root 3.4M Jan 16 2023 sbr
-rwxr-xr-x 1 root root 2.8M Jan 16 2023 static
-rwxr-xr-x 1 root root 3.6M Oct 30 19:00 tuning
-rwxr-xr-x 1 root root 3.9M Jan 16 2023 vlan
-rwxr-xr-x 1 root root 3.5M Jan 16 2023 vrf
root@k8s-eu-1-master:~# kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-5dd5756b68-2wx9g 0/1 ContainerCreating 0 25h <none> k8s-eu-1-master <none> <none>
coredns-5dd5756b68-82x68 0/1 ContainerCreating 0 25h <none> k8s-eu-1-master <none> <none>
etcd-k8s-eu-1-master 1/1 Running 15 (8h ago) 25h 38.242.249.60 k8s-eu-1-master <none> <none>
kube-apiserver-k8s-eu-1-master 1/1 Running 5 (8h ago) 25h 38.242.249.60 k8s-eu-1-master <none> <none>
kube-controller-manager-k8s-eu-1-master 1/1 Running 5 (8h ago) 25h 38.242.249.60 k8s-eu-1-master <none> <none>
kube-proxy-56h55 1/1 Running 1 (5h49m ago) 25h 38.242.250.38 k8s-eu-1-worker-3 <none> <none>
kube-proxy-kghkc 1/1 Running 1 (5h30m ago) 25h 38.242.250.146 k8s-eu-1-worker-5 <none> <none>
kube-proxy-l9bxl 1/1 Running 5 (8h ago) 25h 38.242.249.60 k8s-eu-1-master <none> <none>
kube-proxy-n5vvt 1/1 Running 1 (6h23m ago) 25h 38.242.249.124 k8s-eu-1-worker-2 <none> <none>
kube-proxy-vxlsl 1/1 Running 1 (5h40m ago) 25h 38.242.250.77 k8s-eu-1-worker-4 <none> <none>
kube-proxy-zt8st 1/1 Running 2 (21h ago) 25h 38.242.249.121 k8s-eu-1-worker-1 <none> <none>
kube-scheduler-k8s-eu-1-master 1/1 Running 5 (8h ago) 25h 38.242.249.60 k8s-eu-1-master
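Worth noting: no calico-node or calico-kube-controllers pods appear in kube-system at all. If Calico was installed through the Tigera operator, its pods live in separate namespaces, which can be checked with something like:
kubectl get pods -n tigera-operator -o wide     # the operator itself
kubectl get pods -n calico-system -o wide       # calico-node, calico-kube-controllers, etc.
kubectl get tigerastatus                        # overall status reported by the operator (if the CRD exists)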
Output of kubectl get nodes -o jsonpath="{.items[*].spec.taints}":
root@k8s-eu-1-master:~# kubectl get nodes -o jsonpath="{.items[*].spec.taints}"
[{"effect":"NoSchedule","key":"node-role.kubernetes.io/control-plane"}
What do I need to check to find the root cause of the problem, and how can I fix it?
Answer 1
I reinstalled tigera-operator.yaml and custom-resources.yaml, and the network came back up.
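For completeness, the reinstall presumably amounted to deleting and re-applying the operator manifests for the installed Calico release (v3.26.3 here; the exact version in the URLs is an assumption), then letting the stuck coredns pods be recreated:
# Remove the broken Calico installation
kubectl delete -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.3/manifests/custom-resources.yaml
kubectl delete -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.3/manifests/tigera-operator.yaml
# Reinstall the Tigera operator and the Calico custom resources
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.3/manifests/tigera-operator.yaml
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.3/manifests/custom-resources.yaml
# Once calico-node is Running, recreate the coredns pods so they pick up the restored network
kubectl -n kube-system delete pod -l k8s-app=kube-dns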