Kubernetes API 服务器问题

Kubernetes API 服务器问题

尝试让仪表板 UI 在kubeadm用于远程访问的集群中工作kubectl proxy。获取

Error: 'dial tcp 192.168.2.3:8443: connect: connection refused'
Trying to reach: 'https://192.168.2.3:8443/'

http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/通过远程浏览器访问时。

查看 API 日志,我发现收到以下错误:

I1215 20:18:46.601151       1 log.go:172] http: TLS handshake error from 10.21.72.28:50268: remote error: tls: unknown certificate authority
I1215 20:19:15.444580       1 log.go:172] http: TLS handshake error from 10.21.72.28:50271: remote error: tls: unknown certificate authority
I1215 20:19:31.850501       1 log.go:172] http: TLS handshake error from 10.21.72.28:50275: remote error: tls: unknown certificate authority
I1215 20:55:55.574729       1 log.go:172] http: TLS handshake error from 10.21.72.28:50860: remote error: tls: unknown certificate authority
E1215 21:19:47.246642       1 watch.go:233] unable to encode watch object *v1.WatchEvent: write tcp 134.84.53.162:6443->134.84.53.163:38894: write: connection timed out (&streaming.encoder{writer:(*metrics.fancyResponseWriterDelegator)(0xc42d6fecb0), encoder:(*versioning.codec)(0xc429276990), buf:(*bytes.Buffer)(0xc42cae68c0)})

我推测这与无法使仪表板工作有关,如果是这样,我想知道 API 服务器的问题是什么。集群中的其他一切似乎都在正常工作。

注意,我已经在本地运行 admin.conf,并且能够通过 kubectl 访问集群,没有任何问题。

另外,值得注意的是,当我第一次启动集群时,这个功能一直有效。然而,我遇到了网络问题,必须应用这个功能才能让 CoreDNS 正常工作 Coredns 服务不起作用,但端点正常,除 dns 外,其他 SVC 均正常,所以我想知道这是否会破坏代理服务?

* 编辑 *

以下是仪表板窗格的输出:

[gms@thalia0 ~]$ kubectl describe pod kubernetes-dashboard-77fd78f978-tjzxt --namespace=kube-system
Name:               kubernetes-dashboard-77fd78f978-tjzxt
Namespace:          kube-system
Priority:           0
PriorityClassName:  <none>
Node:               thalia2.hostdoman/hostip<redacted>
Start Time:         Sat, 15 Dec 2018 15:17:57 -0600
Labels:             k8s-app=kubernetes-dashboard
                    pod-template-hash=77fd78f978
Annotations:        cni.projectcalico.org/podIP: 192.168.2.3/32
Status:             Running
IP:                 192.168.2.3
Controlled By:      ReplicaSet/kubernetes-dashboard-77fd78f978
Containers:
  kubernetes-dashboard:
    Container ID:  docker://ed5ff580fb7d7b649d2bd1734e5fd80f97c80dec5c8e3b2808d33b8f92e7b472
    Image:         k8s.gcr.io/kubernetes-dashboard-amd64:v1.10.0
    Image ID:      docker-pullable://k8s.gcr.io/kubernetes-dashboard-amd64@sha256:1d2e1229a918f4bc38b5a3f9f5f11302b3e71f8397b492afac7f273a0008776a
    Port:          8443/TCP
    Host Port:     0/TCP
    Args:
      --auto-generate-certificates
    State:          Running
      Started:      Sat, 15 Dec 2018 15:18:04 -0600
    Ready:          True
    Restart Count:  0
    Liveness:       http-get https://:8443/ delay=30s timeout=30s period=10s #success=1 #failure=3
    Environment:    <none>
    Mounts:
      /certs from kubernetes-dashboard-certs (rw)
      /tmp from tmp-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kubernetes-dashboard-token-mrd9k (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  kubernetes-dashboard-certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kubernetes-dashboard-certs
    Optional:    false
  tmp-volume:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
  kubernetes-dashboard-token-mrd9k:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  kubernetes-dashboard-token-mrd9k
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node-role.kubernetes.io/master:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:          <none>

我检查了服务:

[gms@thalia0 ~]$ kubectl -n kube-system get service kubernetes-dashboard
NAME                   TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
kubernetes-dashboard   ClusterIP   10.103.93.93   <none>        443/TCP   4d23h

还值得注意的是,如果我curl http://localhost:8001/api来自主节点,我确实会得到有效的响应。

因此,总而言之,我不确定这些错误中的哪一个是无法访问仪表板的根源。

答案1

我将集群中的所有节点升级到版本 1.13.1,现在仪表板可以正常工作了,而且到目前为止,我不需要应用上面提到的 CoreDNS 修复。

相关内容