I am creating a dual-stack Kubernetes cluster with kubeadm and installing Calico. I am using the following kubeadm configuration file:
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 0.0.0.0
  bindPort: 6443
nodeRegistration:
  criSocket: "unix:///var/run/containerd/containerd.sock"
---
kind: ClusterConfiguration
apiVersion: kubeadm.k8s.io/v1beta3
kubernetesVersion: v1.27.1
controlPlaneEndpoint: "{{ cp_endpoint }}:6443"
networking:
  serviceSubnet: "10.96.0.0/16,2a12:f840:42:1::/112"
  podSubnet: "10.244.0.0/14,2a12:f840:1:1::/56"
  dnsDomain: "cluster.local"
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: systemd
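As a quick sanity check on the two subnet values above, the following Python sketch confirms that each one contains exactly one IPv4 and one IPv6 CIDR. It is not part of kubeadm; the helper names `cidr_families` and `is_dual_stack` are hypothetical, and the parsing uses only the standard `ipaddress` module.

```python
import ipaddress


def cidr_families(subnets: str) -> list[int]:
    """Return the IP version (4 or 6) of each comma-separated CIDR."""
    # strict=False tolerates host bits set beyond the prefix length,
    # as in "2a12:f840:1:1::/56" from the configuration above.
    return [
        ipaddress.ip_network(cidr.strip(), strict=False).version
        for cidr in subnets.split(",")
    ]


def is_dual_stack(subnets: str) -> bool:
    """True when the value holds exactly one IPv4 and one IPv6 CIDR."""
    return sorted(cidr_families(subnets)) == [4, 6]


# Values copied from the ClusterConfiguration above.
service_subnet = "10.96.0.0/16,2a12:f840:42:1::/112"
pod_subnet = "10.244.0.0/14,2a12:f840:1:1::/56"

print(is_dual_stack(service_subnet), is_dual_stack(pod_subnet))
```

Both values pass this check, so the subnet strings themselves appear to be well-formed dual-stack pairs.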
The cluster starts and the node is marked Ready. However, the CoreDNS and Calico Kube Controllers pods never become ready. See the output of kubectl get pods -A -o wide below:
NAMESPACE         NAME                                       READY   STATUS    RESTARTS        AGE     IP                                   NODE            NOMINATED NODE   READINESS GATES
calico-system     calico-kube-controllers-789dc4c76b-tw2gp   0/1     Running   5 (2m18s ago)   7m20s   2a12:f840:1:9d:d490:4faf:378d:fd03   ip-10-0-1-114   <none>           <none>
calico-system     calico-node-dvkcg                          1/1     Running   0               7m20s   10.0.1.114                           ip-10-0-1-114   <none>           <none>
calico-system     calico-typha-7578549c55-wlk6f              1/1     Running   0               7m20s   10.0.1.114                           ip-10-0-1-114   <none>           <none>
calico-system     csi-node-driver-vwz2h                      2/2     Running   0               7m20s   2a12:f840:1:9d:d490:4faf:378d:fd00   ip-10-0-1-114   <none>           <none>
kube-system       coredns-5d78c9869d-fwc5g                   0/1     Running   0               7m27s   2a12:f840:1:9d:d490:4faf:378d:fd01   ip-10-0-1-114   <none>           <none>
kube-system       coredns-5d78c9869d-r98d6                   0/1     Running   0               7m27s   2a12:f840:1:9d:d490:4faf:378d:fd02   ip-10-0-1-114   <none>           <none>
kube-system       etcd-ip-10-0-1-114                         1/1     Running   0               7m42s   10.0.1.114                           ip-10-0-1-114   <none>           <none>
kube-system       kube-apiserver-ip-10-0-1-114               1/1     Running   0               7m42s   10.0.1.114                           ip-10-0-1-114   <none>           <none>
kube-system       kube-controller-manager-ip-10-0-1-114      1/1     Running   0               7m43s   10.0.1.114                           ip-10-0-1-114   <none>           <none>
kube-system       kube-proxy-hlq74                           1/1     Running   0               7m27s   10.0.1.114                           ip-10-0-1-114   <none>           <none>
kube-system       kube-scheduler-ip-10-0-1-114               1/1     Running   0               7m42s   10.0.1.114                           ip-10-0-1-114   <none>           <none>
tigera-operator   tigera-operator-549d4f9bdb-c2c8m           1/1     Running   0               7m27s   10.0.1.114                           ip-10-0-1-114   <none>           <none>
When I check the logs of the CoreDNS pods, I see the following errors:
[WARNING] plugin/kubernetes: Kubernetes API connection failure: Get "https://10.96.0.1:443/version": dial tcp 10.96.0.1:443: connect: network is unreachable
[INFO] plugin/ready: Still waiting on: "kubernetes"
Similar errors appear in the logs of the Calico Kube Controllers pod:
2023-06-15 10:17:18.315 [ERROR][1] client.go 290: Error getting cluster information config ClusterInformation="default" error=Get "https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": dial tcp 10.96.0.1:443: connect: network is unreachable
2023-06-15 10:17:18.315 [INFO][1] main.go 138: Failed to initialize datastore error=Get "https://10.96.0.1:443/apis/crd.projectcalico.org/v1/clusterinformations/default": dial tcp 10.96.0.1:443: connect: network is unreachable
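One possible reading of "network is unreachable" is an address-family mismatch: the failing pods show an IPv6 address, while the dial target 10.96.0.1 is IPv4. The Python sketch below only illustrates that comparison. It assumes the pod has no address other than the one listed in the IP column, which may not hold, since kubectl get pods -o wide shows only the primary pod IP. The helper name `shares_family` is hypothetical.

```python
import ipaddress


def shares_family(source_ips: list[str], target_ip: str) -> bool:
    """True if at least one source address has the target's IP version."""
    target_version = ipaddress.ip_address(target_ip).version
    return any(
        ipaddress.ip_address(ip).version == target_version
        for ip in source_ips
    )


# Pod address taken from the kubectl output; target taken from the logs.
# Assumption: this is the only address assigned to the pod.
coredns_ips = ["2a12:f840:1:9d:d490:4faf:378d:fd01"]
api_service_ip = "10.96.0.1"

print(shares_family(coredns_ips, api_service_ip))
```

Under that assumption the check returns False, which is consistent with a pod that has no IPv4 source address or route for reaching the service IP.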
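These pods receive IPv6 addresses, whereas the core kube-system services (the API server, DNS, and so on) all have only IPv4 addresses. My assumption is that both the pods and the services for the core components should be fully dual-stack. However, I don't know of any configuration options that would enable this beyond the ones I have already added.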
The description of the kubernetes service, from kubectl describe service kubernetes, appears to support this:
Name:              kubernetes
Namespace:         default
Labels:            component=apiserver
                   provider=kubernetes
Annotations:       <none>
Selector:          <none>
Type:              ClusterIP
IP Family Policy:  SingleStack
IP Families:       IPv4
IP:                10.96.0.1
IPs:               10.96.0.1
Port:              https  443/TCP
TargetPort:        6443/TCP
Endpoints:         10.0.2.167:6443
Session Affinity:  None
Events:            <none>
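The same information is exposed in the Service object under spec.ipFamilyPolicy, spec.ipFamilies, and spec.clusterIPs, for example via kubectl get service kubernetes -o json. The Python sketch below summarizes those fields. The sample JSON is hand-written from the describe output above, not real command output, and the helper name `service_stack_summary` is hypothetical.

```python
import json


def service_stack_summary(service_json: str) -> dict:
    """Summarize the IP family settings of a Service given as JSON text."""
    spec = json.loads(service_json)["spec"]
    families = spec.get("ipFamilies", [])
    return {
        "policy": spec.get("ipFamilyPolicy"),
        "families": families,
        "cluster_ips": spec.get("clusterIPs", []),
        # Dual-stack here means both families are present on the Service.
        "dual_stack": {"IPv4", "IPv6"} <= set(families),
    }


# Hand-written sample mirroring the describe output above (an assumption,
# not captured from a cluster).
sample = json.dumps({
    "spec": {
        "ipFamilyPolicy": "SingleStack",
        "ipFamilies": ["IPv4"],
        "clusterIPs": ["10.96.0.1"],
    }
})

summary = service_stack_summary(sample)
print(summary)
```

For the values shown in the description, the summary reports a single-stack IPv4 service.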
I tried adding the following to the ClusterConfiguration, but it had no effect:
controllerManager:
  extraArgs:
    cluster-cidr: "10.244.0.0/14,2a12:f840:1:1::/56"
    service-cluster-ip-range: "10.96.0.0/16,2a12:f840:42:1::/112"
I also tried registering the node IP addresses directly in the configuration with the following setting:
kubeletExtraArgs:
  node-ip: 10.0.2.167,2a05:d01c:345:dc03:aeb6:5cde:e434:1c34
This does list both IP addresses in the node description, but it had no effect on connectivity:
Addresses:
  InternalIP:  10.0.2.167
  InternalIP:  2a05:d01c:345:dc03:aeb6:5cde:e434:1c34