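I have installed a multi-master Kubernetes cluster, following a guide on setting up a k8s multi-master cluster.
The setup details are as follows.
Load balancer: HAProxy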
frontend kubernetes-frontend
    bind 192.168.1.11:6443
    mode tcp
    option tcplog
    default_backend kubernetes-backend

backend kubernetes-backend
    mode tcp
    option tcp-check
    balance roundrobin
    server master21.server 192.168.1.21:6443 check fall 3 rise 2
    server master22.server 192.168.1.22:6443 check fall 3 rise 2
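For anyone reproducing this: as far as I understand, with `option tcp-check` and no further `tcp-check` rules, the `check` on each `server` line only tests whether a TCP connection to port 6443 can be opened. Below is a minimal sketch of the same kind of probe in shell. It assumes bash (for its `/dev/tcp` redirection) and coreutils `timeout` are available. The function name `probe` is my own and is not part of HAProxy.

```shell
#!/usr/bin/env bash
# probe HOST PORT: print "up" if a TCP connection opens within 2 seconds, else "down".
# Hypothetical helper; it only approximates a plain TCP health check.
probe() {
  if timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
    echo up
  else
    echo down
  fi
}

# Backend addresses taken from the HAProxy config above.
for host in 192.168.1.21 192.168.1.22; do
  echo "$host:6443 $(probe "$host" 6443)"
done
```

A backend that reports "up" here only has an open port. This says nothing about whether the API server behind it can actually answer requests.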
Kubernetes version: v1.25.0
Number of masters: 2
Number of workers: 2
Docker version: 23.0.1
cri-dockerd V3.0
Environment: VMware virtual servers, CentOS 8
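After the installation and cluster setup, everything worked fine, and I also deployed a sample pod. I then wanted to check the cluster's high availability by shutting down one of the masters, and this is where the problem started: as soon as I shut down one of the masters, kubectl commands stop working. I tried rebooting and switching between the master nodes, but kubectl commands still do not work. When a command times out, it gives the following error (though not always):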
error: Get "https://192.168.1.11:6443/api?timeout=32s": net/http: TLS handshake timeout - error from a previous attempt: EOF
I tried curl with both https and http; the results are below:
[***@master21 ~]$ curl -v https://192.168.1.11:6443/api?timeout=32s
* Trying 192.168.1.11...
* TCP_NODELAY set
* Connected to 192.168.1.11 (192.168.1.11) port 6443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 192.168.1.11:6443
curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to 192.168.1.11:6443
[***@master21 ~]$ curl -v http://192.168.1.11:6443/api?timeout=32s
* Trying 192.168.1.11...
* TCP_NODELAY set
* Connected to 192.168.1.11 (192.168.1.11) port 6443 (#0)
> GET /api?timeout=32s HTTP/1.1
> Host: 192.168.1.11:6443
> User-Agent: curl/7.61.1
> Accept: */*
>
* Empty reply from server
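Can anyone help me solve this? I believe some TLS configuration is needed on HAProxy, but I don't know how to configure it to match the existing SSL setup in the k8s cluster.
Output of curl -kv https://192.168.1.21:6443/healthz after shutting down one master (master22.server, the entire VM):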
[***@master21 ~]$ curl -kv https://192.168.1.21:6443/healthz
* Trying 192.168.1.21...
* TCP_NODELAY set
* Connected to 192.168.1.21 (192.168.1.21) port 6443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/pki/tls/certs/ca-bundle.crt
CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Request CERT (13):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, [no content] (0):
* TLSv1.3 (OUT), TLS handshake, Certificate (11):
* TLSv1.3 (OUT), TLS handshake, [no content] (0):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_128_GCM_SHA256
* ALPN, server accepted to use h2
* Server certificate:
* subject: CN=kube-apiserver
* start date: Mar 23 08:10:26 2023 GMT
* expire date: Mar 22 08:10:26 2024 GMT
* issuer: CN=kubernetes
* SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* Using Stream ID: 1 (easy handle 0x5605644bf690)
* TLSv1.3 (OUT), TLS app data, [no content] (0):
> GET /healthz HTTP/2
> Host: 192.168.1.21:6443
> User-Agent: curl/7.61.1
> Accept: */*
>
* TLSv1.3 (IN), TLS handshake, [no content] (0):
* TLSv1.3 (IN), TLS handshake, Newsession Ticket (4):
* TLSv1.3 (IN), TLS app data, [no content] (0):
* Connection state changed (MAX_CONCURRENT_STREAMS == 250)!
* TLSv1.3 (OUT), TLS app data, [no content] (0):
* TLSv1.3 (IN), TLS app data, [no content] (0):
* TLSv1.3 (IN), TLS app data, [no content] (0):
< HTTP/2 403
< audit-id: 930660ff-c7ee-4226-9b98-8fdaed13a251
< cache-control: no-cache, private
< content-type: application/json
< x-content-type-options: nosniff
< x-kubernetes-pf-flowschema-uid:
< x-kubernetes-pf-prioritylevel-uid:
< content-length: 224
< date: Fri, 24 Mar 2023 06:46:01 GMT
<
* TLSv1.3 (IN), TLS app data, [no content] (0):
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {},
"status": "Failure",
"message": "forbidden: User \"system:anonymous\" cannot get path \"/healthz\"",
"reason": "Forbidden",
"details": {},
"code": 403
}
* Connection #0 to host 192.168.1.21 left intact
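On further inspection, I noticed that the problem occurs only when I shut down the master node completely (the entire VM). When I only stop the kubelet service, kubectl gives the following expected output: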
[***@master22 ~]$ kubectl get nodes
NAME              STATUS     ROLES           AGE   VERSION
master21.server   Ready      control-plane   22h   v1.25.0
master22.server   NotReady   control-plane   22h   v1.25.0
worker31.server   Ready      <none>          22h   v1.25.0
worker32.server   Ready      <none>          22h   v1.25.0
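One possibly relevant detail about the difference between the two cases: stopping only kubelet normally leaves the control-plane containers that are already running (etcd, kube-apiserver) in place, whereas powering off the VM removes that master's etcd member as well. If the two masters run a stacked etcd, one member per master (an assumption; I have not confirmed which etcd topology this cluster uses), etcd needs a majority of members to stay available. A minimal sketch of that arithmetic, with function names of my own choosing:

```shell
#!/usr/bin/env bash
# Majority quorum for an etcd cluster of n members: floor(n/2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that can be lost while a majority is still reachable.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With 2 members the quorum is 2 and no failure is tolerated, while 3 members tolerate one failure. Under the assumption above, that would match kubectl failing only when a whole master VM goes down.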