Scheduling kube-dns on tainted nodes in GCP

Question 1

On 1.15.12-gke you should currently have at least the following deployed:

kube-dns deployment
kube-dns-autoscaler deployment

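kube-dns scales to meet the DNS demands of the cluster. This scaling is controlled by the kube-dns-autoscaler, which is deployed by default in all GKE clusters. The kube-dns-autoscaler adjusts the number of replicas in the kube-dns deployment based on the number of nodes and cores in the cluster.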

The preferred ways to scale kube-dns in the cluster are:

By configuring the kube-dns-autoscaler ConfigMap:


    linear: '{"coresPerReplica":256,"min":1,"nodesPerReplica":16, "preventSinglePointFailure": true}'

where:

"preventSinglePointFailure": true makes the controller ensure at least 2 replicas if there is more than one node.
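
To see which values are in effect before changing them, a small helper like the one below may be handy. It is only a sketch: the function name autoscaler_params is made up for illustration, and it assumes the parameters live under the linear key of the kube-dns-autoscaler ConfigMap in kube-system, as in the snippet above.

```shell
# Hypothetical helper: print the JSON stored under the "linear" key
# of the kube-dns-autoscaler ConfigMap.
autoscaler_params() {
  kubectl get configmap kube-dns-autoscaler --namespace=kube-system \
    -o jsonpath='{.data.linear}'
}
```

Calling autoscaler_params should print the JSON string shown above. To change the values, kubectl edit configmap kube-dns-autoscaler --namespace=kube-system opens the same object in an editor.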

With these parameters, the current number of replicas is computed as follows:

    replicas = max( ceil( cores × 1/coresPerReplica ) , ceil( nodes × 1/nodesPerReplica ) )
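
As a worked example, the calculation can be sketched in plain shell arithmetic. The function name dns_replicas is hypothetical, the values are hard-coded from the ConfigMap line above, and applying the "min" floor and the preventSinglePointFailure rule after the max() is my reading of how those two settings behave, not something taken from the autoscaler's source.

```shell
# dns_replicas CORES NODES
# Prints the replica count for the ConfigMap values shown above.
dns_replicas() {
  cores=$1
  nodes=$2
  cores_per_replica=256
  nodes_per_replica=16
  min=1
  # Integer ceiling division: ceil(a / b) == (a + b - 1) / b
  by_cores=$(( (cores + cores_per_replica - 1) / cores_per_replica ))
  by_nodes=$(( (nodes + nodes_per_replica - 1) / nodes_per_replica ))
  # replicas = max(by_cores, by_nodes)
  if [ "$by_cores" -gt "$by_nodes" ]; then
    replicas=$by_cores
  else
    replicas=$by_nodes
  fi
  # Assumed: "min" acts as a lower bound on the result
  if [ "$replicas" -lt "$min" ]; then
    replicas=$min
  fi
  # preventSinglePointFailure: at least 2 replicas with more than one node
  if [ "$nodes" -gt 1 ] && [ "$replicas" -lt 2 ]; then
    replicas=2
  fi
  echo "$replicas"
}

dns_replicas 16 4    # prints 2: the formula gives 1, raised to 2 because nodes > 1
dns_replicas 400 40  # prints 3: ceil(400/256) = 2, ceil(40/16) = 3
```

So a small cluster with a handful of nodes ends up with 2 kube-dns replicas, and the node count, not the core count, usually drives the number upward.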

Manually:

    kubectl scale --replicas=0 deployment/kube-dns-autoscaler --namespace=kube-system
    kubectl scale --replicas=1 deployment/kube-dns --namespace=kube-system

The problem you are currently seeing comes from the default kube-dns deployment configuration:

    tolerations:
    - key: CriticalAddonsOnly
      operator: Exists
    - key: components.gke.io/gke-managed-components
      operator: Exists

Because only these two taint keys are tolerated, the kube-dns pods cannot be scheduled on nodes that carry custom taints.
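
To confirm the mismatch on your own cluster, you could compare the taints on the nodes with the tolerations in the deployment. The helpers below are a sketch; the function names are made up, and kubectl is assumed to point at the affected cluster.

```shell
# Hypothetical helpers for comparing node taints with kube-dns tolerations.

# One row per node with the keys of the taints it carries.
node_taints() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'
}

# The tolerations declared in the kube-dns pod template.
kube_dns_tolerations() {
  kubectl get deployment kube-dns --namespace=kube-system \
    -o jsonpath='{.spec.template.spec.tolerations}'
}
```

Any taint key listed by node_taints that does not appear in the output of kube_dns_tolerations keeps kube-dns off that node.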

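I suggest checking why your pods cannot be scheduled in the default pool (possibly because the default pool lacks resources), and I would consider resizing that default pool.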
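
One possible way to run that check is sketched below. The function names are hypothetical, the k8s-app=kube-dns label is assumed to be the one on your kube-dns pods, and the cluster name, pool name and node count passed to resize_pool are placeholders you supply yourself (gcloud may also need a --zone or --region flag, depending on your configuration).

```shell
# Hypothetical helpers for finding out why kube-dns pods stay Pending.

# Where the kube-dns pods run, or whether they are Pending.
kube_dns_pods() {
  kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o wide
}

# Pod details; the Events section at the end carries the scheduler's reason.
kube_dns_events() {
  kubectl describe pods --namespace=kube-system -l k8s-app=kube-dns
}

# Resize a node pool: resize_pool CLUSTER POOL NUM_NODES
resize_pool() {
  gcloud container clusters resize "$1" --node-pool "$2" --num-nodes "$3"
}
```

In the output of kube_dns_events, a FailedScheduling event that mentions insufficient CPU or memory points to a pool that is too small, while one that mentions an untolerated taint points back to the tolerations above.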

Another solution is to deploy a custom kube-dns or CoreDNS configuration.

Answer