I installed a Kubernetes 1.23.3 cluster on four Raspberry Pis running Raspberry Pi OS 11 (bullseye) arm64, mostly by following this guide.
The gist of it is that the control plane was created with this command:
kubeadm init --token={some_token} --kubernetes-version=v1.23.3 --pod-network-cidr=10.1.0.0/16 --service-cidr=10.11.0.0/16 --control-plane-endpoint=10.0.4.16 --node-name=rpi-1-1
I then created my own kube-verify namespace, deployed an echo server into it, and created a service for it.
However, I cannot reach the service's cluster IP from any of the nodes. Why? Requests simply time out, while requests to the pod's cluster IP work fine.
I suspect my kube-proxy is not working correctly. Here is what I have investigated so far.
$ kubectl get services -n kube-verify -o=wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
echo-server ClusterIP 10.11.213.180 <none> 8080/TCP 24h app=echo-server
$ kubectl get pods -n kube-system -o=wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
coredns-64897985d-47gpr 1/1 Running 1 (69m ago) 41h 10.1.0.5 rpi-1-1 <none> <none>
coredns-64897985d-nf55w 1/1 Running 1 (69m ago) 41h 10.1.0.4 rpi-1-1 <none> <none>
etcd-rpi-1-1 1/1 Running 2 (69m ago) 41h 10.0.4.16 rpi-1-1 <none> <none>
kube-apiserver-rpi-1-1 1/1 Running 2 (69m ago) 41h 10.0.4.16 rpi-1-1 <none> <none>
kube-controller-manager-rpi-1-1 1/1 Running 2 (69m ago) 41h 10.0.4.16 rpi-1-1 <none> <none>
kube-flannel-ds-5467m 1/1 Running 1 (69m ago) 28h 10.0.4.17 rpi-1-2 <none> <none>
kube-flannel-ds-7wpvz 1/1 Running 1 (69m ago) 28h 10.0.4.18 rpi-1-3 <none> <none>
kube-flannel-ds-9chxk 1/1 Running 1 (69m ago) 28h 10.0.4.19 rpi-1-4 <none> <none>
kube-flannel-ds-x5rvx 1/1 Running 1 (69m ago) 29h 10.0.4.16 rpi-1-1 <none> <none>
kube-proxy-8bbjn 1/1 Running 1 (69m ago) 28h 10.0.4.17 rpi-1-2 <none> <none>
kube-proxy-dw45d 1/1 Running 1 (69m ago) 28h 10.0.4.18 rpi-1-3 <none> <none>
kube-proxy-gkkxq 1/1 Running 2 (69m ago) 41h 10.0.4.16 rpi-1-1 <none> <none>
kube-proxy-ntl5w 1/1 Running 1 (69m ago) 28h 10.0.4.19 rpi-1-4 <none> <none>
kube-scheduler-rpi-1-1 1/1 Running 2 (69m ago) 41h 10.0.4.16 rpi-1-1 <none> <none>
$ kubectl logs kube-proxy-gkkxq -n kube-system
I0220 13:52:02.281289 1 node.go:163] Successfully retrieved node IP: 10.0.4.16
I0220 13:52:02.281535 1 server_others.go:138] "Detected node IP" address="10.0.4.16"
I0220 13:52:02.281610 1 server_others.go:561] "Unknown proxy mode, assuming iptables proxy" proxyMode=""
I0220 13:52:02.604880 1 server_others.go:206] "Using iptables Proxier"
I0220 13:52:02.604966 1 server_others.go:213] "kube-proxy running in dual-stack mode" ipFamily=IPv4
I0220 13:52:02.605026 1 server_others.go:214] "Creating dualStackProxier for iptables"
I0220 13:52:02.605151 1 server_others.go:491] "Detect-local-mode set to ClusterCIDR, but no IPv6 cluster CIDR defined, , defaulting to no-op detect-local for IPv6"
I0220 13:52:02.606905 1 server.go:656] "Version info" version="v1.23.3"
W0220 13:52:02.614777 1 sysinfo.go:203] Nodes topology is not available, providing CPU topology
I0220 13:52:02.619535 1 conntrack.go:52] "Setting nf_conntrack_max" nf_conntrack_max=131072
I0220 13:52:02.620869 1 conntrack.go:100] "Set sysctl" entry="net/netfilter/nf_conntrack_tcp_timeout_close_wait" value=3600
I0220 13:52:02.660947 1 config.go:317] "Starting service config controller"
I0220 13:52:02.661015 1 shared_informer.go:240] Waiting for caches to sync for service config
I0220 13:52:02.662669 1 config.go:226] "Starting endpoint slice config controller"
I0220 13:52:02.662726 1 shared_informer.go:240] Waiting for caches to sync for endpoint slice config
I0220 13:52:02.762734 1 shared_informer.go:247] Caches are synced for service config
I0220 13:52:02.762834 1 shared_informer.go:247] Caches are synced for endpoint slice config
What I notice here is the message Nodes topology is not available, so I dug deeper into the kube-proxy configuration, but nothing stands out to me there.
If there really is a problem with the node topology in my cluster, please point me to some resources on troubleshooting it, because I could not find anything meaningful based on this error message.
$ kubectl describe configmap kube-proxy -n kube-system
Name: kube-proxy
Namespace: kube-system
Labels: app=kube-proxy
Annotations: kubeadm.kubernetes.io/component-config.hash: sha256:edce433d45f2ed3a58ee400690184ad033594e8275fdbf52e9c8c852caa7124d
Data
====
config.conf:
----
apiVersion: kubeproxy.config.k8s.io/v1alpha1
bindAddress: 0.0.0.0
bindAddressHardFail: false
clientConnection:
acceptContentTypes: ""
burst: 0
contentType: ""
kubeconfig: /var/lib/kube-proxy/kubeconfig.conf
qps: 0
clusterCIDR: 10.1.0.0/16
configSyncPeriod: 0s
conntrack:
maxPerCore: null
min: null
tcpCloseWaitTimeout: null
tcpEstablishedTimeout: null
detectLocalMode: ""
enableProfiling: false
healthzBindAddress: ""
hostnameOverride: ""
iptables:
masqueradeAll: false
masqueradeBit: null
minSyncPeriod: 0s
syncPeriod: 0s
ipvs:
excludeCIDRs: null
minSyncPeriod: 0s
scheduler: ""
strictARP: false
syncPeriod: 0s
tcpFinTimeout: 0s
tcpTimeout: 0s
udpTimeout: 0s
kind: KubeProxyConfiguration
metricsBindAddress: ""
mode: ""
nodePortAddresses: null
oomScoreAdj: null
portRange: ""
showHiddenMetricsForVersion: ""
udpIdleTimeout: 0s
winkernel:
enableDSR: false
networkName: ""
sourceVip: ""
kubeconfig.conf:
----
apiVersion: v1
kind: Config
clusters:
- cluster:
certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
server: https://10.0.4.16:6443
name: default
contexts:
- context:
cluster: default
namespace: default
user: default
name: default
current-context: default
users:
- name: default
user:
tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
BinaryData
====
Events: <none>
$ kubectl -n kube-system exec kube-proxy-gkkxq cat /var/lib/kube-proxy/kubeconfig.conf
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
apiVersion: v1
kind: Config
clusters:
- cluster:
certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
server: https://10.0.4.16:6443
name: default
contexts:
- context:
cluster: default
namespace: default
user: default
name: default
current-context: default
users:
- name: default
user:
tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
The mode defaults to iptables, as confirmed by the logs above.
I have also enabled IP forwarding on all nodes.
$ sudo sysctl net.ipv4.ip_forward
net.ipv4.ip_forward = 1
Answer 1
Flannel can be installed by applying a manifest from its repository.
Flannel can be added to any existing Kubernetes cluster, though it's simplest to add it before any pods using the pod network have been started. For Kubernetes v1.17+:
kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
As you can see in this yaml file, the network subnet is set to 10.244.0.0/16 by default:
net-conf.json: |
  {
    "Network": "10.244.0.0/16",
    "Backend": {
      "Type": "vxlan"
    }
  }
kubeadm init is the command that initializes the cluster. It requires a subnet to be specified for the cluster network, and that subnet needs to match the one used by the CNI. You can check the other options in the kubeadm reference:
--pod-network-cidr string   Specify range of IP addresses for the pod network. If set, the control plane will automatically allocate CIDRs for every node.
You initialized your cluster with --pod-network-cidr=10.1.0.0/16, so the cluster's subnet is set differently from the one in the flannel manifest's yaml file ("Network": "10.244.0.0/16"), and that is why it doesn't work.
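The mismatch can be confirmed directly against the running cluster. This is a sketch, assuming the standard object names that kubeadm and the flannel manifest create (kube-proxy and kube-flannel-cfg ConfigMaps; the flannel manifest of that era installs into kube-system):

```shell
# What kubeadm configured kube-proxy with (expect clusterCIDR: 10.1.0.0/16):
kubectl -n kube-system get cm kube-proxy -o jsonpath='{.data.config\.conf}' | grep clusterCIDR

# What flannel is actually using (expect "Network": "10.244.0.0/16"):
kubectl -n kube-system get cm kube-flannel-cfg -o jsonpath='{.data.net-conf\.json}'
```

If the two subnets printed here differ, pods get addresses flannel does not route, and service traffic DNATed by kube-proxy to those pod IPs is dropped.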
There are two options to fix it:
First - change the subnet in the flannel configuration yaml to the same one that was applied during cluster initialization, in this case --pod-network-cidr=10.1.0.0/16 (see the script below).
Or
Second - if the cluster is for testing purposes and was only just initialized, destroy it and start over with the same subnet as in the flannel configuration yaml, "Network": "10.244.0.0/16".
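With the first option, the patched ConfigMap entry would end up looking like this - the same structure as the default shown earlier, with only the subnet changed to match --pod-network-cidr=10.1.0.0/16:

```yaml
net-conf.json: |
  {
    "Network": "10.1.0.0/16",
    "Backend": {
      "Type": "vxlan"
    }
  }
```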
To modify kube-flannel.yml automatically, the following script based on the yq and jq commands can be used:
#!/bin/bash
# Rewrite flannel's pod network subnet to 10.1.0.0/16 in kube-flannel.yml.
# Requires jq and yq (the Python wrapper around jq that understands YAML).
input=$1
output=$2
echo "Converting $input to $output"
# Extract net-conf.json from the kube-flannel-cfg ConfigMap, parse the JSON
# string, replace the Network field, and re-encode it as a YAML string.
netconf=$( yq '. | select(.kind == "ConfigMap") | select(.metadata.name == "kube-flannel-cfg") | .data."net-conf.json"' "$input" | jq 'fromjson | .Network="10.1.0.0/16"' | yq -R '.' )
# Re-emit the ConfigMap with the patched net-conf.json ...
kube_flannel_cfg=$( yq --yaml-output '. | select(.kind == "ConfigMap") | select(.metadata.name == "kube-flannel-cfg") | .data."net-conf.json"='"$netconf" "$input" )
# ... and every other document from the multi-document manifest unchanged.
everything_else=$( yq --yaml-output '. | select(.kind != "ConfigMap") | select(.metadata.name != "kube-flannel-cfg")' "$input" )
echo "$kube_flannel_cfg" > "$output"
echo '---' >> "$output"
echo "$everything_else" >> "$output"
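The heart of the script is the jq step that swaps the subnet. In isolation it behaves like this (a standalone sketch, assuming jq is installed; the JSON here is piped in directly, so the fromjson step the script needs after extracting the string with yq is not required):

```shell
# Rewrite the Network field of flannel's net-conf.json with jq.
echo '{"Network": "10.244.0.0/16", "Backend": {"Type": "vxlan"}}' \
  | jq '.Network = "10.1.0.0/16"'
```

Run the full script with the original manifest as the first argument and an output filename as the second, then apply the generated file with kubectl apply -f.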