I created a multi-master K3s cluster with the embedded datastore like this:
Hostname: k3s01
curl -sfL https://get.k3s.io | K3S_TOKEN=xxx INSTALL_K3S_EXEC="server --disable servicelb --disable traefik --bind-address=10.0.0.4 --tls-san 10.0.0.4 --node-external-ip=168.119.x.x --node-ip=10.0.0.4 --flannel-iface=enp7s0 --advertise-address=PUBIP-OF-LB --cluster-init" sh -
Hostname: k8s02
curl -sfL https://get.k3s.io | K3S_TOKEN=xxx INSTALL_K3S_EXEC="server --disable servicelb --disable traefik --bind-address=10.0.0.2 --tls-san 10.0.0.2 --node-ip 10.0.0.2 --node-external-ip=168.119.x.x --flannel-iface=enp7s0 --server=https://10.0.0.4:6443" sh -
Hostname: k8s03
curl -sfL https://get.k3s.io | K3S_TOKEN=xxx INSTALL_K3S_EXEC="server --disable servicelb --disable traefik --bind-address=10.0.0.3 --tls-san 10.0.0.3 --node-ip 10.0.0.3 --node-external-ip=168.119.x.x --flannel-iface=enp7s0 --server=https://10.0.0.4:6443" sh -
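I can connect with kubectl from my local machine via the LB IP. LB: tcp 6443 -> 6443
I can also use kubectl from any of the nodes above. I deployed the CSI driver for Hetzner, which also works fine. I tested it with their test deployment.
However, after all of this (which had worked fine so far), I tried to install ingress-nginx. The deployment started without any problems, but I noticed a problem communicating with the apiserver from inside the cluster, as the following log from the ingress-nginx-controller shows: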
E1204 11:42:25.216392 8 leaderelection.go:321] error retrieving resource lock ingress-nginx/ingress-controller-leader-nginx: Get "https://10.43.0.1:443/api/v1/namespaces/ingress-nginx/configmaps/ingress-controller-leader-nginx": dial tcp 10.43.0.1:443: connect: connection refused
Hmm, strange! OK, let's do some checks:
kubectl get svc kubernetes -o yaml
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2020-12-04T11:22:25Z"
  labels:
    component: apiserver
    provider: kubernetes
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .: {}
          f:component: {}
          f:provider: {}
      f:spec:
        f:clusterIP: {}
        f:ports:
          .: {}
          k:{"port":443,"protocol":"TCP"}:
            .: {}
            f:name: {}
            f:port: {}
            f:protocol: {}
            f:targetPort: {}
        f:sessionAffinity: {}
        f:type: {}
    manager: k3s
    operation: Update
    time: "2020-12-04T11:22:25Z"
  name: kubernetes
  namespace: default
  resourceVersion: "10434"
  selfLink: /api/v1/namespaces/default/services/kubernetes
  uid: f0993556-3b7f-40aa-a293-45170cb03002
spec:
  clusterIP: 10.43.0.1
  ports:
  - name: https
    port: 443
    protocol: TCP
    targetPort: 6443
  sessionAffinity: None
  type: ClusterIP
status:
  loadBalancer: {}
That looks fine.
kubectl get endpoints -o yaml
apiVersion: v1
items:
- apiVersion: v1
  kind: Endpoints
  metadata:
    creationTimestamp: "2020-12-04T11:22:25Z"
    labels:
      endpointslice.kubernetes.io/skip-mirror: "true"
    managedFields:
    - apiVersion: v1
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:labels:
            .: {}
            f:endpointslice.kubernetes.io/skip-mirror: {}
        f:subsets: {}
      manager: k3s
      operation: Update
      time: "2020-12-04T11:23:39Z"
    name: kubernetes
    namespace: default
    resourceVersion: "808"
    selfLink: /api/v1/namespaces/default/endpoints/kubernetes
    uid: cb450392-b4c9-4c2f-bfde-1a3b20ac4b5d
  subsets:
  - addresses:
    - ip: 167.233.x.x
    - ip: 168.119.x.x
    - ip: 168.119.x.x
    ports:
    - name: https
      port: 6443
      protocol: TCP
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
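To see at a glance which addresses in the `kubernetes` Endpoints object are private (RFC 1918) and which are public, a small classifier can help. This is only a sketch: `is_private_ipv4` and `classify_endpoints` are hypothetical helper names, not part of kubectl or K3s, and the check covers IPv4 only.

```shell
#!/bin/sh
# Hypothetical helpers (not part of kubectl or K3s).

# Return 0 if the IPv4 address lies in an RFC 1918 private range.
is_private_ipv4() {
    case "$1" in
        10.*|192.168.*) return 0 ;;
        172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print each address followed by "private" or "public".
classify_endpoints() {
    for ip in "$@"; do
        if is_private_ipv4 "$ip"; then
            echo "$ip private"
        else
            echo "$ip public"
        fi
    done
}

# Possible usage against a live cluster (requires kubectl access):
#   classify_endpoints $(kubectl get endpoints kubernetes \
#       -o jsonpath='{.subsets[*].addresses[*].ip}')
```

On a cluster where the API servers are expected to be reached over the private network, every address would ideally be reported as private.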
OK, why are the public IPs here? Let's check from inside a pod by calling one of these IPs directly:
kubectl exec -it ingress-controler-pod-xxxx -- bash
bash-5.0$ curl https://167.233.x.x:6443 --insecure
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {
  },
  "status": "Failure",
  "message": "Unauthorized",
  "reason": "Unauthorized",
  "code": 401
}
bash-5.0$ curl https://10.43.0.1:443
curl: (7) Failed to connect to 10.43.0.1 port 443: Connection refused
OK... that is strange!
Sometimes I also get errors such as:
Error from server: error dialing backend: dial tcp: lookup k8s02: Try again
This appears when I try to exec into a pod or display its logs. It only happens when the target pod is running on a different host.
Is something wrong with DNS?
cat /etc/resolv.conf
nameserver 127.0.0.53
options edns0 trust-ad
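I cannot resolve my hosts by hostname. But I only specified IPs in the K3s setup. Do I need working DNS between my hosts? Is there something wrong with my K3s install parameters?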
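For context, 127.0.0.53 is the stub listener of systemd-resolved, so this resolv.conf only says that lookups go through the local stub, not which upstream servers are used. The sketch below lists the configured nameservers and flags a loopback stub; `list_nameservers` and `uses_local_stub` are hypothetical helper names, and the `/run/systemd/resolve/resolv.conf` path assumes systemd-resolved is what manages DNS on the node.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a resolv.conf-style file.

# Print the nameserver addresses from the given file (default: /etc/resolv.conf).
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}

# Succeed if any configured nameserver is a loopback address (a local stub).
uses_local_stub() {
    list_nameservers "$1" | grep -q '^127\.'
}

if uses_local_stub; then
    echo "resolv.conf points at a local stub resolver"
    # Assumption: with systemd-resolved, the upstream servers are listed here.
    if [ -r /run/systemd/resolve/resolv.conf ]; then
        list_nameservers /run/systemd/resolve/resolv.conf
    fi
fi
```

If the upstream servers know nothing about the node names, hostname lookups between nodes fail even though the resolver itself works.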
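Answer 1
I had a similar problem, and it was caused by a misconfigured DNS resolution setup. Please check whether the nodes can resolve each other's hostnames.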
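A minimal sketch of that check, to be run on each node: it uses the hostnames and private IPs from the install commands in the question and `getent hosts`, which resolves through the system's normal lookup path (`/etc/hosts` and DNS). The helper names `check_resolution` and `hosts_entries` are hypothetical, and adding the printed lines to `/etc/hosts` is just one possible fix, assuming no internal DNS is available.

```shell
#!/bin/sh
# Hostname=IP pairs taken from the install commands in the question.
NODES="k3s01=10.0.0.4 k8s02=10.0.0.2 k8s03=10.0.0.3"

# Print /etc/hosts-style lines ("IP hostname") for every name=ip pair.
hosts_entries() {
    for pair in "$@"; do
        printf '%s %s\n' "${pair#*=}" "${pair%%=*}"
    done
}

# Report whether each hostname resolves on this node; return 1 if any fails.
check_resolution() {
    rc=0
    for pair in "$@"; do
        name=${pair%%=*}
        if getent hosts "$name" >/dev/null 2>&1; then
            echo "ok      $name"
        else
            echo "FAILED  $name"
            rc=1
        fi
    done
    return $rc
}

# If any name fails to resolve, print candidate /etc/hosts lines for review.
# $NODES is intentionally unquoted so that it splits into one pair per word.
check_resolution $NODES || hosts_entries $NODES
```

The script only prints the candidate entries; appending them to `/etc/hosts` is left as a manual step so they can be reviewed first.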