This is my setup:
Kubernetes cluster:
root@k8s-eu-1-master:~# kubectl get nodes
NAME STATUS ROLES AGE VERSION
k8s-eu-1-master Ready control-plane 24h v1.28.2
----> IP address (appended manually):
k8s-eu-1-worker-1 Ready <none> 24h v1.28.2 xx.xxx.xxx.xxx
k8s-eu-1-worker-2 Ready <none> 24h v1.28.2 yy.yyy.yyy.yyy
k8s-eu-1-worker-3 Ready <none> 24h v1.28.2 zz.zzz.zzz.zz
k8s-eu-1-worker-4 Ready <none> 24h v1.28.2 ww.www.www.ww
k8s-eu-1-worker-5 Ready <none> 24h v1.28.2 pp.ppp.ppp.ppp
Calico version:
root@k8s-eu-1-master:~# kubectl calico version
Client Version: v3.26.3
Git commit: bdb7878af
Unable to retrieve Cluster Version or Type: resource does not exist: ClusterInformation(default) with error: the server could not find the requested resource (get ClusterInformations.crd.projectcalico.org default)
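Since calicoctl reports that the ClusterInformation resource is missing, one quick sanity check (a diagnostic sketch, commands only) is whether the Calico CRDs are registered at all:
# List the Calico CRDs; an empty result means Calico's API resources were never installed
kubectl get crds | grep projectcalico.org
# If the CRD exists, check whether the default ClusterInformation object is present
kubectl get clusterinformations.crd.projectcalico.org default -o yaml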
These are the steps that led me to the errors "Failed to destroy network for sandbox" and "coredns pod not running":
I set up the k8s-eu-1-master node as the NFS client and the k8s-eu-1-worker nodes as NFS servers:
root@k8s-eu-1-master:~# sudo df -h | grep /srv/backups
xx.xxx.xxx.xxx:/srv/backups 391G 5.5G 366G 2% /mnt/data
yy.yyy.yyy.yyy:/srv/backups 391G 5.5G 366G 2% /mnt/data
zz.zzz.zzz.zz:/srv/backups 391G 5.5G 366G 2% /mnt/data
ww.www.www.ww:/srv/backups 391G 5.5G 366G 2% /mnt/data
pp.ppp.ppp.ppp:/srv/backups 391G 5.5G 366G 2% /mnt/data
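For context, a minimal sketch of how each export/mount is assumed to be configured (the export options and mount command below are illustrative, not taken from the original post):
# On each k8s-eu-1-worker-N node (NFS server), /etc/exports is assumed to contain a line like:
#   /srv/backups  xx.xxx.xxx.xxx(rw,sync,no_subtree_check)
sudo exportfs -ra        # reload the export table
sudo exportfs -v         # verify that /srv/backups is actually exported
# On k8s-eu-1-master (NFS client), each share would then be mounted along the lines of:
sudo mount -t nfs yy.yyy.yyy.yyy:/srv/backups /mnt/data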
I installed and deployed the provisioner, apparently without errors:
root@k8s-eu-1-master:~# helm install second-nfs-subdir-external-provisioner nfs-subdir-external-provisioner/nfs-subdir-external-provisioner \
> --set nfs.server=xx.xxx.xxx.xxx \
> --set nfs.server=yy.yyy.yyy.yyy \
> --set nfs.server= zz.zzz.zzz.zz\
> --set nfs.server=ww.www.www.ww\
> --set nfs.server=pp.ppp.ppp.ppp \
> --set nfs.path=/srv/backups \
> --set storageClass.name=second-nfs-client \
> --set storageClass.provisionerName=k8s-signs.io/second-nfs-subdir-external-provisioner
NAME: second-nfs-subdir-external-provisioner
LAST DEPLOYED: Thu Nov 2 16:57:58 2023
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
root@k8s-eu-1-master:~#
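Note that repeating --set nfs.server=... does not configure multiple NFS servers: each occurrence overwrites the previous one, so the release ends up with only the last value (pp.ppp.ppp.ppp, as the pod description further down confirms). The values Helm actually stored can be checked with:
# Show the user-supplied values recorded for the release;
# nfs.server appears only once, with the last value passed on the command line
helm get values second-nfs-subdir-external-provisioner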
But the second-nfs-subdir-external-provisioner pod never starts running:
root@k8s-eu-1-master:~# kubectl get pods
NAME READY STATUS RESTARTS AGE
second-nfs-subdir-external-provisioner-7c678bc889-p59sz 0/1 ContainerCreating 0 10m
Output of kubectl get events:
root@k8s-eu-1-master:~# kubectl get events
LAST SEEN TYPE REASON OBJECT MESSAGE
50m Normal Scheduled pod/second-nfs-subdir-external-provisioner-7c678bc889-p59sz Successfully assigned default/second-nfs-subdir-external-provisioner-7c678bc889-p59sz to k8s-eu-1-worker-2
14m Warning FailedMount pod/second-nfs-subdir-external-provisioner-7c678bc889-p59sz MountVolume.SetUp failed for volume "nfs-subdir-external-provisioner-root" : mount failed: exit status 32...
50m Normal SuccessfulCreate replicaset/second-nfs-subdir-external-provisioner-7c678bc889 Created pod: second-nfs-subdir-external-provisioner-7c678bc889-p59sz
50m Normal ScalingReplicaSet deployment/second-nfs-subdir-external-provisioner Scaled up replica set second-nfs-subdir-external-provisioner-7c678bc889 to 1
Output of kubectl describe pod: MountVolume.SetUp failed for volume "nfs-subdir-external-provisioner-root" : mount failed: exit status 32
root@k8s-eu-1-master:~# kubectl describe pod second-nfs-subdir-external-provisioner-7c678bc889-p59sz
Name: second-nfs-subdir-external-provisioner-7c678bc889-p59sz
Namespace: default
Priority: 0
Service Account: second-nfs-subdir-external-provisioner
Node: k8s-eu-1-worker-2/yy.yyy.yyy.yyy
Start Time: Thu, 02 Nov 2023 16:57:59 +0100
Labels: app=nfs-subdir-external-provisioner
pod-template-hash=7c678bc889
release=second-nfs-subdir-external-provisioner
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: ReplicaSet/second-nfs-subdir-external-provisioner-7c678bc889
Containers:
nfs-subdir-external-provisioner:
Container ID:
Image: registry.k8s.io/sig-storage/nfs-subdir-external-provisioner:v4.0.2
Image ID:
Port: <none>
Host Port: <none>
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment:
PROVISIONER_NAME: k8s-signs.io/second-nfs-subdir-external-provisioner
NFS_SERVER: pp.ppp.ppp.ppp
NFS_PATH: /srv/backups
Mounts:
/persistentvolumes from nfs-subdir-external-provisioner-root (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-k7v8p (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
nfs-subdir-external-provisioner-root:
Type: NFS (an NFS mount that lasts the lifetime of a pod)
Server: pp.ppp.ppp.ppp
Path: /srv/backups
ReadOnly: false
kube-api-access-k7v8p:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 11m default-scheduler Successfully assigned default/second-nfs-subdir-external-provisioner-7c678bc889-p59sz to k8s-eu-1-worker-2
Warning FailedMount 43s (x13 over 11m) kubelet MountVolume.SetUp failed for volume "nfs-subdir-external-provisioner-root" : mount failed: exit status 32
Mounting command: mount
Mounting arguments: -t nfs pp.ppp.ppp.ppp/srv/backups /var/lib/kubelet/pods/19d87677-f68f-4c12-9788-b4c9ce8f30f1/volumes/kubernetes.io~nfs/nfs-subdir-external-provisioner-root
Output: mount.nfs: mounting pp.ppp.ppp.ppp:/srv/backups failed, reason given by server: No such file or directory
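Since the NFS server itself answers "No such file or directory", one way to reproduce the failure outside Kubernetes is to attempt the same mount by hand from the node the pod was scheduled on (the mount point /mnt/nfs-test below is just an example):
# On k8s-eu-1-worker-2, where the pod landed:
showmount -e pp.ppp.ppp.ppp          # does worker-5 actually export /srv/backups?
sudo mkdir -p /mnt/nfs-test
sudo mount -t nfs pp.ppp.ppp.ppp:/srv/backups /mnt/nfs-test   # the same mount the kubelet attempts
ls /mnt/nfs-test && sudo umount /mnt/nfs-test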
But there is no apparent problem on either of the two nodes involved.
k8s-eu-1-worker-2, node IP address yy.yyy.yyy.yyy:
root@k8s-eu-1-master:~# kubectl describe node k8s-eu-1-worker-2
Name: k8s-eu-1-worker-2
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=k8s-eu-1-worker-2
kubernetes.io/os=linux
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: unix:///var/run/containerd/containerd.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Wed, 01 Nov 2023 16:17:32 +0100
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: k8s-eu-1-worker-2
AcquireTime: <unset>
RenewTime: Thu, 02 Nov 2023 17:19:54 +0100
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure False Thu, 02 Nov 2023 17:18:28 +0100 Thu, 02 Nov 2023 11:51:42 +0100 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Thu, 02 Nov 2023 17:18:28 +0100 Thu, 02 Nov 2023 11:51:42 +0100 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Thu, 02 Nov 2023 17:18:28 +0100 Thu, 02 Nov 2023 11:51:42 +0100 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Thu, 02 Nov 2023 17:18:28 +0100 Thu, 02 Nov 2023 11:51:42 +0100 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: yy.yyy.yyy.yyy
Hostname: k8s-eu-1-worker-2
Capacity:
cpu: 10
ephemeral-storage: 2061040144Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 61714444Ki
pods: 110
Allocatable:
cpu: 10
ephemeral-storage: 1899454593566
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 61612044Ki
pods: 110
System Info:
Machine ID: 53962e976f234a5bf6083be6653cd385
System UUID: 29cbe26c-1bd1-4544-a085-0475c3f5c3b0
Boot ID: 7713e7e5-61fa-435b-a563-517c92934cae
Kernel Version: 5.15.0-88-generic
OS Image: Ubuntu 22.04.3 LTS
Operating System: linux
Architecture: amd64
Container Runtime Version: containerd://1.6.24
Kubelet Version: v1.28.2
Kube-Proxy Version: v1.28.2
PodCIDR: 192.168.2.0/24
PodCIDRs: 192.168.2.0/24
Non-terminated Pods: (2 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default second-nfs-subdir-external-provisioner-7c678bc889-p59sz 0 (0%) 0 (0%) 0 (0%) 0 (0%) 22m
kube-system kube-proxy-n5vvt 0 (0%) 0 (0%) 0 (0%) 0 (0%) 25h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 0 (0%) 0 (0%)
memory 0 (0%) 0 (0%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
k8s-eu-1-worker-5, node IP address pp.ppp.ppp.ppp:
root@k8s-eu-1-master:~# kubectl describe node k8s-eu-1-worker-5
Name: k8s-eu-1-worker-5
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=k8s-eu-1-worker-5
kubernetes.io/os=linux
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: unix:///var/run/containerd/containerd.sock
node.alpha.kubernetes.io/ttl: 0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Wed, 01 Nov 2023 16:19:36 +0100
Taints: <none>
Unschedulable: false
Lease:
HolderIdentity: k8s-eu-1-worker-5
AcquireTime: <unset>
RenewTime: Thu, 02 Nov 2023 17:23:55 +0100
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
MemoryPressure False Thu, 02 Nov 2023 17:20:22 +0100 Thu, 02 Nov 2023 12:44:39 +0100 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Thu, 02 Nov 2023 17:20:22 +0100 Thu, 02 Nov 2023 12:44:39 +0100 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Thu, 02 Nov 2023 17:20:22 +0100 Thu, 02 Nov 2023 12:44:39 +0100 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Thu, 02 Nov 2023 17:20:22 +0100 Thu, 02 Nov 2023 12:44:39 +0100 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: pp.ppp.ppp.ppp
Hostname: k8s-eu-1-worker-5
Capacity:
cpu: 4
ephemeral-storage: 409659944Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 8128016Ki
pods: 110
Allocatable:
cpu: 4
ephemeral-storage: 377542603766
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 8025616Ki
pods: 110
System Info:
Machine ID: 3969e9606d136fc6909ff64d653e8c99
System UUID: dbaa5423-b883-4e92-a983-d0e6f506b6da
Boot ID: b4d67e97-d9d7-4c50-872a-b76724f0b868
Kernel Version: 5.15.0-88-generic
OS Image: Ubuntu 22.04.3 LTS
Operating System: linux
Architecture: amd64
Container Runtime Version: containerd://1.6.24
Kubelet Version: v1.28.2
Kube-Proxy Version: v1.28.2
PodCIDR: 192.168.5.0/24
PodCIDRs: 192.168.5.0/24
Non-terminated Pods: (1 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
kube-system kube-proxy-kghkc 0 (0%) 0 (0%) 0 (0%) 0 (0%) 25h
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 0 (0%) 0 (0%)
memory 0 (0%) 0 (0%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events: <none>
I also checked some API resources:
root@k8s-eu-1-master:~# kubectl get componentstatuses
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME STATUS MESSAGE ERROR
scheduler Healthy ok
controller-manager Healthy ok
etcd-0 Healthy ok
Tail of /var/log/syslog: failed to destroy network for sandbox
Nov 2 18:04:08 k8s-eu-1-master containerd[542]: time="2023-11-02T18:04:08.485566179+01:00" level=error msg="StopPodSandbox for \"cb37071ee41b92adebe13244b916bf8de4f489066acecc196ae74fa2b8e41ca8\" failed" error="failed to destroy network for sandbox \"cb37071ee41b92adebe13244b916bf8de4f489066acecc196ae74fa2b8e41ca8\": plugin type=\"calico\" failed (delete): error getting ClusterInformation: Get \"https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default\": tls: failed to verify certificate: x509: certificate signed by unknown authority (possibly because of \"crypto/rsa: verification error\" while trying to verify candidate authority certificate \"kubernetes\")"
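The x509 error suggests that the CA embedded in the CNI kubeconfig no longer matches the cluster CA. Assuming a standard kubeadm layout (file paths below are the kubeadm defaults, not taken from the original post), the two can be compared directly:
# CA currently trusted by the Calico CNI plugin
sudo grep certificate-authority-data /etc/cni/net.d/calico-kubeconfig \
  | awk '{print $2}' | base64 -d > /tmp/cni-ca.crt
# CA actually used by the API server on a kubeadm cluster
sudo diff /tmp/cni-ca.crt /etc/kubernetes/pki/ca.crt && echo "CA matches" || echo "CA differs"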
/etc/cni/net.d:
root@k8s-eu-1-master:~# ls -lah /etc/cni/net.d
total 16K
drwx------ 2 root root 4.0K Oct 30 19:00 .
drwxr-xr-x 3 root root 4.0K Oct 30 18:52 ..
-rw-r--r-- 1 root root 680 Oct 30 19:21 10-calico.conflist
-rw------- 1 root root 2.7K Oct 31 09:21 calico-kubeconfig
/opt/cni/bin/:
root@k8s-eu-1-master:~# ls -lah /opt/cni/bin/
total 247M
drwxrwxr-x 2 root root 4.0K Oct 30 19:21 .
drwxr-xr-x 3 root root 4.0K Oct 30 18:54 ..
-rwxr-xr-x 1 root root 3.9M Oct 30 19:00 bandwidth
-rwxr-xr-x 1 root root 4.1M Jan 16 2023 bridge
-rwsr-xr-x 1 root root 59M Oct 30 19:00 calico
-rwsr-xr-x 1 root root 59M Oct 30 19:00 calico-ipam
-rwxr-xr-x 1 root root 9.7M Jan 16 2023 dhcp
-rwxr-xr-x 1 root root 3.9M Jan 16 2023 dummy
-rwxr-xr-x 1 root root 4.2M Jan 16 2023 firewall
-rwxr-xr-x 1 root root 2.4M Oct 30 19:00 flannel
-rwxr-xr-x 1 root root 3.7M Jan 16 2023 host-device
-rwxr-xr-x 1 root root 3.4M Oct 30 19:00 host-local
-rwsr-xr-x 1 root root 59M Oct 30 19:00 install
-rwxr-xr-x 1 root root 3.9M Jan 16 2023 ipvlan
-rwxr-xr-x 1 root root 3.5M Oct 30 19:00 loopback
-rwxr-xr-x 1 root root 3.9M Jan 16 2023 macvlan
-rwxr-xr-x 1 root root 3.9M Oct 30 19:00 portmap
-rwxr-xr-x 1 root root 4.0M Jan 16 2023 ptp
-rwxr-xr-x 1 root root 3.4M Jan 16 2023 sbr
-rwxr-xr-x 1 root root 2.8M Jan 16 2023 static
-rwxr-xr-x 1 root root 3.6M Oct 30 19:00 tuning
-rwxr-xr-x 1 root root 3.9M Jan 16 2023 vlan
-rwxr-xr-x 1 root root 3.5M Jan 16 2023 vrf
root@k8s-eu-1-master:~# kubectl get pods -n kube-system -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-5dd5756b68-2wx9g 0/1 ContainerCreating 0 25h <none> k8s-eu-1-master <none> <none>
coredns-5dd5756b68-82x68 0/1 ContainerCreating 0 25h <none> k8s-eu-1-master <none> <none>
etcd-k8s-eu-1-master 1/1 Running 15 (8h ago) 25h 38.242.249.60 k8s-eu-1-master <none> <none>
kube-apiserver-k8s-eu-1-master 1/1 Running 5 (8h ago) 25h 38.242.249.60 k8s-eu-1-master <none> <none>
kube-controller-manager-k8s-eu-1-master 1/1 Running 5 (8h ago) 25h 38.242.249.60 k8s-eu-1-master <none> <none>
kube-proxy-56h55 1/1 Running 1 (5h49m ago) 25h 38.242.250.38 k8s-eu-1-worker-3 <none> <none>
kube-proxy-kghkc 1/1 Running 1 (5h30m ago) 25h 38.242.250.146 k8s-eu-1-worker-5 <none> <none>
kube-proxy-l9bxl 1/1 Running 5 (8h ago) 25h 38.242.249.60 k8s-eu-1-master <none> <none>
kube-proxy-n5vvt 1/1 Running 1 (6h23m ago) 25h 38.242.249.124 k8s-eu-1-worker-2 <none> <none>
kube-proxy-vxlsl 1/1 Running 1 (5h40m ago) 25h 38.242.250.77 k8s-eu-1-worker-4 <none> <none>
kube-proxy-zt8st 1/1 Running 2 (21h ago) 25h 38.242.249.121 k8s-eu-1-worker-1 <none> <none>
kube-scheduler-k8s-eu-1-master 1/1 Running 5 (8h ago) 25h 38.242.249.60 k8s-eu-1-master
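Worth noting: no calico-node or calico-kube-controllers pods appear in kube-system at all. If Calico was installed through the Tigera operator, its pods live in separate namespaces, which can be checked with something like:
kubectl get pods -n tigera-operator -o wide     # the operator itself
kubectl get pods -n calico-system -o wide       # calico-node, calico-kube-controllers, etc.
kubectl get tigerastatus                        # overall status reported by the operator (if the CRD exists)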
Output of kubectl get nodes -o jsonpath="{.items[*].spec.taints}":
root@k8s-eu-1-master:~# kubectl get nodes -o jsonpath="{.items[*].spec.taints}"
[{"effect":"NoSchedule","key":"node-role.kubernetes.io/control-plane"}
What do I need to check to find the root cause of the problem, and how can I fix it?
Answer 1
I reinstalled tigera-operator.yaml and custom-resources.yaml, and the network came back up.
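For completeness, the reinstall presumably amounted to deleting and re-applying the operator manifests for the installed Calico release (v3.26.3 here; the exact version in the URLs is an assumption), then letting the stuck coredns pods be recreated:
# Remove the broken Calico installation
kubectl delete -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.3/manifests/custom-resources.yaml
kubectl delete -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.3/manifests/tigera-operator.yaml
# Reinstall the Tigera operator and the Calico custom resources
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.3/manifests/tigera-operator.yaml
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.3/manifests/custom-resources.yaml
# Once calico-node is Running, recreate the coredns pods so they pick up the restored network
kubectl -n kube-system delete pod -l k8s-app=kube-dns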