Calico HA cluster - not-ready:NoSchedule

2024-6-2

Installing Calico on a K8s HA cluster

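Note: the same installation works on a single-node setup, where I could remove the taint. In the HA cluster, however, the nodes keep a single taint: node.kubernetes.io/not-ready:NoSchedule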

  kubectl create -f https://projectcalico.docs.tigera.io/manifests/tigera-operator.yaml
  curl https://projectcalico.docs.tigera.io/manifests/custom-resources.yaml -O
  kubectl create -f custom-resources.yaml
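
Calico's installation instructions suggest reviewing custom-resources.yaml before applying it, in particular the IP pool CIDR, which should match the cluster's pod network CIDR. A minimal sketch for pulling the configured CIDRs out of the downloaded manifest could look like this. The helper name pool_cidrs is hypothetical, and the parsing assumes each pool is written as a plain "cidr: <value>" pair.

```shell
# pool_cidrs is a hypothetical helper, not part of kubectl or Calico.
# It prints the value of every "cidr:" key found in a manifest file
# (defaults to custom-resources.yaml in the current directory).
pool_cidrs() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "cidr:") print $(i + 1) }' \
    "${1:-custom-resources.yaml}"
}

# Usage, after the curl step:
#   pool_cidrs custom-resources.yaml
```

The printed values can be compared with the pod CIDR given to the cluster at creation time and with the node addresses. Whether a mismatch or overlap plays any role in this particular case is not established by the post.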

On the worker node I have:

kubectl describe nodes <worker node> | grep -i taint
Taints:             node.kubernetes.io/not-ready:NoSchedule
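
The node.kubernetes.io/not-ready taint is applied by Kubernetes itself while a node's Ready condition is not True, so it can help to see which nodes actually report NotReady before removing the taint by hand. The sketch below filters the node list for that; not_ready_nodes is a hypothetical helper, and it assumes the default column layout of kubectl get nodes (NAME first, STATUS second).

```shell
# not_ready_nodes is a hypothetical helper. It reads the output of
# `kubectl get nodes --no-headers` on stdin and prints the name of every
# node whose STATUS column does not start with "Ready".
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1 }'
}

# Usage:
#   kubectl get nodes --no-headers | not_ready_nodes
```

If a node is listed here, the taint is likely to be put back after a manual removal for as long as the node stays NotReady.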

Then I ran:

kubectl taint nodes <worker node> node.kubernetes.io/not-ready:NoSchedule-

This moved the pod from Pending to ContainerCreating, but the problem is that it never finishes creating:

kube-system       coredns-6d4b75cb6d-mlshg           0/1     ContainerCreating   0          35m

On the worker node, describing the pod:

kubectl describe pod coredns-6d4b75cb6d-89ggn -n kube-system
Name:                 coredns-6d4b75cb6d-89ggn
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 master01worker01/192.168.64.7
Start Time:           Thu, 18 Aug 2022 14:47:22 +0200
Labels:               k8s-app=kube-dns
                      pod-template-hash=6d4b75cb6d
Annotations:          <none>
Status:               Pending
IP:                   
IPs:                  <none>
Controlled By:        ReplicaSet/coredns-6d4b75cb6d
Containers:
  coredns:
    Container ID:  
    Image:         k8s.gcr.io/coredns/coredns:v1.8.6
    Image ID:      
    Ports:         53/UDP, 53/TCP, 9153/TCP
    Host Ports:    0/UDP, 0/TCP, 0/TCP
    Args:
      -conf
      /etc/coredns/Corefile
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Limits:
      memory:  170Mi
    Requests:
      cpu:        100m
      memory:     70Mi
    Liveness:     http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5
    Readiness:    http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3
    Environment:  <none>
    Mounts:
      /etc/coredns from config-volume (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-tbb27 (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      coredns
    Optional:  false
  kube-api-access-tbb27:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              kubernetes.io/os=linux
Tolerations:                 CriticalAddonsOnly op=Exists
                             node-role.kubernetes.io/control-plane:NoSchedule
                             node-role.kubernetes.io/master:NoSchedule
                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age                   From               Message
  ----     ------            ----                  ----               -------
  Warning  FailedScheduling  2m57s                 default-scheduler  0/1 nodes are available: 1 node(s) had untolerated taint {node.kubernetes.io/not-ready: }. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
  Normal   Scheduled         2m15s                 default-scheduler  Successfully assigned kube-system/coredns-6d4b75cb6d-89ggn to master01worker01
  Warning  NetworkNotReady   98s (x18 over 2m10s)  kubelet            network is not ready: container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
  Warning  FailedMount       98s (x7 over 2m10s)   kubelet            MountVolume.SetUp failed for volume "config-volume" : object "kube-system"/"coredns" not registered
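
The NetworkNotReady event says the container runtime has no initialized CNI plugin, which usually means that no network configuration has been written on that node yet. The sketch below checks a directory for CNI config files. The helper cni_config_present is hypothetical, and /etc/cni/net.d is assumed as the conventional default location, which may differ depending on how the runtime is configured.

```shell
# cni_config_present is a hypothetical helper. It reports whether the given
# directory (default: /etc/cni/net.d, an assumption) contains at least one
# CNI configuration file (*.conf, *.conflist or *.json).
cni_config_present() {
  dir="${1:-/etc/cni/net.d}"
  for f in "$dir"/*.conf "$dir"/*.conflist "$dir"/*.json; do
    # An unmatched glob stays literal, so test that the file really exists.
    if [ -e "$f" ]; then
      echo "found: $f"
      return 0
    fi
  done
  echo "no CNI config in $dir"
  return 1
}

# Usage, run on the affected node:
#   cni_config_present
```

An empty directory would suggest that the Calico node component has not yet run successfully on that node, which is consistent with the event above but not proven by it.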

If I delete the pod, I get:

Warning FailedScheduling 76s default-scheduler 0/2 nodes are available: 2 node(s) had untolerated taint {node.kubernetes.io/not-ready: }. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling

Then, if I try removing the taint again with:

kubectl taint nodes <control plane> node.kubernetes.io/not-ready:NoSchedule-

I am back to these events:

  Type     Reason            Age              From               Message
  ----     ------            ----             ----               -------
  Warning  FailedScheduling  4m14s            default-scheduler  0/2 nodes are available: 2 node(s) had untolerated taint {node.kubernetes.io/not-ready: }. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling.
  Normal   Scheduled         3s               default-scheduler  Successfully assigned kube-system/coredns-6d4b75cb6d-gglk2 to master01worker01
  Warning  NetworkNotReady   1s (x2 over 3s)  kubelet            network is not ready: container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
  Warning  FailedMount       1s (x3 over 2s)  kubelet            MountVolume.SetUp failed for volume "config-volume" : object "kube-system"/"coredns" not registered

On the worker node, describing the pod:
