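I am trying to set up a small cluster with 4 worker nodes. I just installed k3s on my Raspberry Pi 4s (8GB), and the nodes come up with a NotReady status. I am new to Kubernetes/k3s, but I believed that with a fresh install everything should "just work". Each Pi has a freshly wiped and installed Ubuntu 22.04 Server for 64-bit ARM. Because the terminal output is so long, I have a pastebin here. It looks like the pods on the master node cannot mount volumes and cannot create sandboxes. I am also having apiserver problems, which I think are related to these mount and sandbox errors, because the apiserver does eventually respond after a few attempts. So I am wondering what is going on here. Can anyone help explain this? Why is my master node having trouble mounting volumes? How do I begin to troubleshoot this?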
zeus@atlas00:~$ kubectl get nodes
NAME      STATUS     ROLES                  AGE     VERSION
atlas04   NotReady   <none>                 7h32m   v1.23.6+k3s1
atlas08   NotReady   <none>                 7h36m   v1.23.6+k3s1
atlas06   NotReady   <none>                 7h36m   v1.23.6+k3s1
atlas02   Ready      <none>                 7h32m   v1.23.6+k3s1
atlas00   NotReady   control-plane,master   8h      v1.23.6+k3s1
zeus@atlas00:~$ kubectl get pods -n kube-system -o wide
NAME                                      READY   STATUS              RESTARTS   AGE   IP       NODE      NOMINATED NODE   READINESS GATES
helm-install-traefik-qzxlm                0/1     ContainerCreating   0          8h    <none>   atlas00   <none>           <none>
local-path-provisioner-6c79684f77-bb9bn   0/1     Pending             0          8h    <none>   <none>    <none>           <none>
helm-install-traefik-crd-tg52k            0/1     ContainerCreating   0          8h    <none>   atlas00   <none>           <none>
metrics-server-7cd5fcb6b7-qz88k           0/1     Pending             0          8h    <none>   <none>    <none>           <none>
coredns-d76bd69b-9dzpc                    0/1     ContainerCreating   0          8h    <none>   atlas00   <none>           <none>
zeus@atlas00:~$ kubectl describe pod helm-install-traefik-qzxlm -n kube-system
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
zeus@atlas00:~$ kubectl describe pod helm-install-traefik-qzxlm -n kube-system
The connection to the server 127.0.0.1:6443 was refused - did you specify the right host or port?
zeus@atlas00:~$ kubectl describe pod helm-install-traefik-qzxlm -n kube-system
Error from server (InternalError): an error on the server ("apiserver not ready") has prevented the request from succeeding (get pods helm-install-traefik-qzxlm)
zeus@atlas00:~$ kubectl describe pod helm-install-traefik-qzxlm -n kube-system
Name: helm-install-traefik-qzxlm
Namespace: kube-system
Priority: 0
Node: atlas00/192.168.1.50
Start Time: Tue, 24 May 2022 08:07:56 +0000
Labels: controller-uid=1f431fba-cb3a-45cc-880a-5be734db988e
helmcharts.helm.cattle.io/chart=traefik
job-name=helm-install-traefik
Annotations: helmcharts.helm.cattle.io/configHash: SHA256=8BE6F0CEB108C2A3A1EC5A8F7591596C00670380ACEA294775E4769C94AEE7A2
Status: Pending
IP:
IPs: <none>
Controlled By: Job/helm-install-traefik
Containers:
helm:
Container ID:
Image: rancher/klipper-helm:v0.7.1-build20220407
Image ID:
Port: <none>
Host Port: <none>
Args:
install
--set-string
global.systemDefaultRegistry=
State: Waiting
Reason: ContainerCreating
Ready: False
Restart Count: 0
Environment:
NAME: traefik
VERSION:
REPO:
HELM_DRIVER: secret
CHART_NAMESPACE: kube-system
CHART: https://%{KUBERNETES_API}%/static/charts/traefik-10.19.300.tgz
HELM_VERSION:
TARGET_NAMESPACE: kube-system
NO_PROXY: .svc,.cluster.local,10.42.0.0/16,10.43.0.0/16
FAILURE_POLICY: reinstall
Mounts:
/chart from content (rw)
/config from values (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-f9qlx (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
values:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: chart-values-traefik
Optional: false
content:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: chart-content-traefik
Optional: false
kube-api-access-f9qlx:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: kubernetes.io/os=linux
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 8h default-scheduler Successfully assigned kube-system/helm-install-traefik-qzxlm to atlas00
Warning FailedMount 8h kubelet MountVolume.SetUp failed for volume "content" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 8h kubelet MountVolume.SetUp failed for volume "kube-api-access-f9qlx" : failed to fetch token: serviceaccounts "helm-traefik" is forbidden: User "system:node:atlas00" cannot create resource "serviceaccounts/token" in API group "" in the namespace "kube-system": no relationship found between node 'atlas00' and this object
Warning FailedMount 8h (x2 over 8h) kubelet MountVolume.SetUp failed for volume "values" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 7h53m kubelet MountVolume.SetUp failed for volume "values" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 7h53m kubelet MountVolume.SetUp failed for volume "kube-api-access-f9qlx" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 7h53m (x2 over 7h53m) kubelet MountVolume.SetUp failed for volume "content" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 7h52m kubelet MountVolume.SetUp failed for volume "values" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 7h52m (x2 over 7h52m) kubelet MountVolume.SetUp failed for volume "content" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 7h52m kubelet MountVolume.SetUp failed for volume "kube-api-access-f9qlx" : failed to fetch token: serviceaccounts "helm-traefik" is forbidden: User "system:node:atlas00" cannot create resource "serviceaccounts/token" in API group "" in the namespace "kube-system": no relationship found between node 'atlas00' and this object
Warning FailedMount 7h41m kubelet MountVolume.SetUp failed for volume "content" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 7h41m kubelet MountVolume.SetUp failed for volume "values" : failed to sync configmap cache: timed out waiting for the condition
Warning FailedMount 7h41m kubelet MountVolume.SetUp failed for volume "kube-api-access-f9qlx" : failed to fetch token: serviceaccounts "helm-traefik" is forbidden: User "system:node:atlas00" cannot create resource "serviceaccounts/token" in API group "" in the namespace "kube-system": no relationship found between node 'atlas00' and this object
Warning FailedCreatePodSandBox 7h40m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to get sandbox image "rancher/mirrored-pause:3.6": failed to pull image "rancher/mirrored-pause:3.6": failed to pull and unpack image "docker.io/rancher/mirrored-pause:3.6": failed to prepare extraction snapshot "extract-476722526-09RL sha256:c640e628658788773e4478ae837822c9bc7db5b512442f54286a98ad50f88fd4": failed to rename: rename /var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/new-2732139020 /var/lib/rancher/k3s/agent/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/4: file exists
Warning FailedCreatePodSandBox 6h54m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "b9f0346aa924105c7c3498ecb6315c32e13d4237eaa062cea2926401ba1c0ab6": plugin type="flannel" failed (add): open /run/flannel/subnet.env: no such file or directory
Warning FailedCreatePodSandBox 6h42m kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "41b66aa473ffaee3ae32567c0ff2fe233f35569ea15b3301cfab127e92efce69": plugin type="flannel" failed (add): open /run/flannel/subnet.env: no such file or directory
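Answer 1
Are you running the cluster behind a proxy or in an isolated environment? If so, the "FailedCreatePodSandBox" events and the "failed to pull image ..." log messages may be caused by a registry mirror that is not set up correctly.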
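Before changing any registry settings, it may help to check how many of the pod events are actually image pull failures and how many are something else. The sketch below is only an illustration: the function name summarize_events and the summary labels are made up here, and the three patterns are copied from the event messages shown in the question.

```shell
#!/bin/sh
# Count pod events by failure type. Reads `kubectl describe pod` output on stdin.
# summarize_events is a hypothetical helper, not part of kubectl or k3s.
# The patterns are taken from the event messages in the question above.
summarize_events() {
  awk '
    /failed to pull image/           { pull++ }     # sandbox (pause) image could not be pulled
    /\/run\/flannel\/subnet\.env/    { flannel++ }  # flannel had not written its subnet file yet
    /failed to sync configmap cache/ { cache++ }    # kubelet timed out syncing ConfigMaps
    END {
      printf "image pull failures: %d\n", pull
      printf "flannel subnet.env missing: %d\n", flannel
      printf "configmap cache timeouts: %d\n", cache
    }
  '
}

# Usage, on the server node:
#   kubectl describe pod helm-install-traefik-qzxlm -n kube-system | summarize_events
```

If image pull failures make up only a small share of the events, the registry mirror is probably not the only thing to look at.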
If you are running with Docker, add this to your /etc/docker/daemon.json: ... "registry-mirrors": ["https://"] ...
If you are using containerd directly, add this to your registries.yaml: ... mirrors: mycustomreg.com: endpoint: - "https://mycustomreg.com:5000" ...
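The containerd fragment above loses its indentation when written on one line, so here is a minimal sketch that writes it out as properly nested YAML to a scratch file for review. The function name write_registries_draft and the scratch path are assumptions made for this example, and mycustomreg.com:5000 is the placeholder from this answer, not a real registry. k3s reads /etc/rancher/k3s/registries.yaml at startup, so the file only takes effect once it is copied there and the service is restarted.

```shell
#!/bin/sh
# Write a draft registries.yaml to a scratch file so it can be reviewed first.
# write_registries_draft is a hypothetical helper; mycustomreg.com:5000 is the
# placeholder registry from the answer and must be replaced with a real mirror.
write_registries_draft() {
  out="$1"
  cat > "$out" <<'EOF'
mirrors:
  mycustomreg.com:
    endpoint:
      - "https://mycustomreg.com:5000"
EOF
  echo "wrote $out"
}

# Scratch location is an assumption; any writable path works.
write_registries_draft "${TMPDIR:-/tmp}/registries.yaml"

# After review, on each node:
#   sudo install -m 0600 "${TMPDIR:-/tmp}/registries.yaml" /etc/rancher/k3s/registries.yaml
#   sudo systemctl restart k3s         # on the server node
#   sudo systemctl restart k3s-agent   # on the worker nodes
```

Writing to a scratch file first keeps a typo in the YAML from reaching the running node until the file has been checked.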