I'm running a cluster on AWS with kops. Since I need other instances in the same VPC as the cluster, I reused an existing subnet:
kops create cluster --cloud=aws --zones=us-east-2a --node-size=t3.small --master-size=t3.small --name=${KOPS_CLUSTER_NAME} --subnets=subnet-c717c9ae --yes
However, I kept running into this error:
$ kubectl logs -n kube-system -f aws-cloud-controller-manager-gv9bkgg
...
E1004 19:20:17.261728 1 route_controller.go:124] Couldn't reconcile node routes: error listing routes: unable to find route table for AWS cluster: cluster.mydomain.com
I then added the tag KubernetesCluster=cluster.mydomain.com to the subnet's route table. That fixed the error above, but it produced another problem:
I1004 20:44:37.606138 1 route_controller.go:199] Creating route for node i-06ac28fc2ced86895 100.96.2.0/24 with hint 05fd5d46-9513-4e89-8e50-1f44227549d4, throttled 15.465µs
I1004 20:44:37.606214 1 route_controller.go:199] Creating route for node i-03466cd9918eb7781 100.96.3.0/24 with hint d231da9b-b14d-48be-8ab4-ad3a18594071, throttled 4.837µs
I1004 20:44:38.343828 1 route_controller.go:219] Created route for node i-03466cd9918eb7781 100.96.3.0/24 with hint d231da9b-b14d-48be-8ab4-ad3a18594071 after 737.608324ms
I1004 20:44:38.360854 1 route_controller.go:219] Created route for node i-06ac28fc2ced86895 100.96.2.0/24 with hint 05fd5d46-9513-4e89-8e50-1f44227549d4 after 754.723911ms
I1004 20:44:38.361182 1 route_controller.go:313] Patching node status i-03466cd9918eb7781 with true previous condition was:nil
I1004 20:44:38.361301 1 route_controller.go:313] Patching node status i-06ac28fc2ced86895 with true previous condition was:nil
I1004 20:44:47.719493 1 route_controller.go:304] set node i-06ac28fc2ced86895 with NodeNetworkUnavailable=false was canceled because it is already set
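For reference, the route table tag mentioned above can be added with the AWS CLI roughly as follows. This is only a sketch: the route table ID is a placeholder (it is not given in this post), and the command is printed rather than executed so it can be checked first.

```shell
# Placeholder: replace with the ID of the route table attached to the reused subnet.
ROUTE_TABLE_ID="rtb-0123456789abcdef0"
# Cluster name as it appears in the controller error message.
CLUSTER_NAME="cluster.mydomain.com"
# Tag that the route controller looks for on the route table.
TAG="Key=KubernetesCluster,Value=${CLUSTER_NAME}"

# Dry run: prints the command. Remove the leading "echo" to actually apply the tag.
echo aws ec2 create-tags --resources "${ROUTE_TABLE_ID}" --tags "${TAG}"
```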
Since then, this message has been spamming the aws-cloud-controller log about 10 times per minute:
I1004 22:53:26.151322 1 route_controller.go:304] set node i-01c013ae44a04b63b with NodeNetworkUnavailable=false was canceled because it is already set
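What can I do to fix this? I have already terminated the instances and updated the cluster so that kops would recreate them, but that didn't help. Maybe I could set NodeNetworkUnavailable to true so that the controller can set it back to false, but I don't know how to do that, or whether it even makes sense (some sources also say to avoid changing a node's status).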
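In case it helps with diagnosis, the condition the controller refers to is stored on the node as the status condition of type NetworkUnavailable, and it can be read with kubectl. A minimal sketch, assuming the node names match the instance IDs shown in the logs (an assumption, not something confirmed above); the helper function name is made up for illustration:

```shell
# Hypothetical helper: prints the NetworkUnavailable condition of one node.
# Requires kubectl with access to the cluster.
network_unavailable_condition() {
  kubectl get node "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="NetworkUnavailable")]}'
}

# Usage (node name assumed to be the instance ID from the logs):
#   network_unavailable_condition i-06ac28fc2ced86895
```

This only reads the condition; it does not change the node's status.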