Kubelet error - the container runtime endpoint address was not specified or empty

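I am trying to install a Kubernetes v1.26 cluster (3 nodes, Rocky 9) with kubeadm, but I ran into a problem with the kubelet. I followed this step-by-step tutorial alongside the official Kubernetes cluster installation documentation.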

After installing the kubelet, its status is:

systemctl status kubelet

● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
    Drop-In: /usr/lib/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: activating (auto-restart) (Result: exit-code) since Mon 2023-03-27 19:45:05 EEST; 8s ago
       Docs: https://kubernetes.io/docs/
    Process: 17757 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=1/>
   Main PID: 17757 (code=exited, status=1/FAILURE)
        CPU: 379ms
Mar 27 19:45:05 node01 systemd[1]: kubelet.service: Failed with result 'exit-code'.

In journalctl I see the following messages:

> Mar 27 19:34:40 node01 kubelet[13832]: E0327 19:34:40.638950   13832 run.go:74] "command failed" err="failed to validate kubelet flags: the container runtime endpoint address was not specified or empty, use --container-runtime-endpoint>
Mar 27 19:34:40 node01 systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE

I also added the following to the kubelet service:

> Environment="KUBELET_EXTRA_ARGS=--container-runtime-endpoint=unix:///run/containerd/containerd.sock"
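One possible way to set this variable without editing unit files, offered as a sketch rather than a confirmed fix: the kubeadm drop-in on RPM-based systems normally also reads an environment file (EnvironmentFile=-/etc/sysconfig/kubelet), and values from an environment file take precedence over Environment= lines. Assuming that line is present in 10-kubeadm.conf, a helper along these lines writes the flag there. The function name set_kubelet_extra_args is hypothetical, and GNU sed is assumed.

```shell
#!/usr/bin/env bash

# Hypothetical helper: set KUBELET_EXTRA_ARGS in an environment file,
# replacing any assignment that is already there.
set_kubelet_extra_args() {
    local env_file="$1" args="$2"
    touch "$env_file"
    # Drop any previous KUBELET_EXTRA_ARGS line, then append the new one.
    sed -i '/^KUBELET_EXTRA_ARGS=/d' "$env_file"
    printf 'KUBELET_EXTRA_ARGS=%s\n' "$args" >> "$env_file"
}

# Typical use on a node (run as root; the path is an assumption, check 10-kubeadm.conf):
#   set_kubelet_extra_args /etc/sysconfig/kubelet \
#       "--container-runtime-endpoint=unix:///run/containerd/containerd.sock"
#   systemctl daemon-reload && systemctl restart kubelet
```

Passing the file path as an argument makes it possible to try the helper on a scratch copy before touching the real file.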

kubelet --container-runtime-endpoint=unix:///run/containerd/containerd.sock

I0327 19:52:34.967346   20609 server.go:412] "Kubelet version" kubeletVersion="v1.26.3"
> I0327 19:52:34.967701   20609 server.go:414] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
> I0327 19:52:34.968911   20609 server.go:575] "Standalone mode, no API client"
> I0327 19:52:34.987049   20609 server.go:463] "No api server defined - no events will be sent to API server"
> I0327 19:52:34.987281   20609 server.go:659] "--cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /"
> I0327 19:52:34.989143   20609 container_manager_linux.go:267] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
> I0327 19:52:34.989474   20609 container_manager_linux.go:272] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: KubeletOOMScoreAdj:-999 ContainerRuntime: CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[]} QOSReserved:map[] CPUManagerPolicy:none CPUManagerPolicyOptions:map[] ExperimentalTopologyManagerScope:container CPUManagerReconcilePeriod:10s ExperimentalMemoryManagerPolicy:None ExperimentalMemoryManagerReservedMemory:[] ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms ExperimentalTopologyManagerPolicy:none ExperimentalTopologyManagerPolicyOptions:map[]}
> I0327 19:52:34.989545   20609 topology_manager.go:134] "Creating topology manager with policy per scope" topologyPolicyName="none" topologyScopeName="container"
> I0327 19:52:34.989583   20609 container_manager_linux.go:308] "Creating device plugin manager"
> I0327 19:52:34.990346   20609 state_mem.go:36] "Initialized new in-memory state store"
> I0327 19:52:35.028186   20609 kubelet.go:404] "Kubelet is running in standalone mode, will skip API server sync"
> I0327 19:52:35.034468   20609 kuberuntime_manager.go:244] "Container runtime initialized" containerRuntime="containerd" version="1.6.19" apiVersion="v1"
> I0327 19:52:35.036405   20609 volume_host.go:75] "KubeClient is nil. Skip initialization of CSIDriverLister"
> W0327 19:52:35.037048   20609 csi_plugin.go:189] kubernetes.io/csi: kubeclient not set, assuming standalone kubelet
> W0327 19:52:35.037060   20609 csi_plugin.go:266] Skipping CSINode initialization, kubelet running in standalone mode
> I0327 19:52:35.039128   20609 server.go:1186] "Started kubelet"
> I0327 19:52:35.040285   20609 kubelet.go:1502] "No API server defined - no node status update will be sent"
> I0327 19:52:35.040699   20609 server.go:161] "Starting to listen" address="0.0.0.0" port=10250
> I0327 19:52:35.049705   20609 server.go:451] "Adding debug handlers to kubelet server"
> I0327 19:52:35.042056   20609 fs_resource_analyzer.go:67] "Starting FS ResourceAnalyzer"
> I0327 19:52:35.043346   20609 server.go:193] "Starting to listen read-only" address="0.0.0.0" port=10255
> I0327 19:52:35.058042   20609 volume_manager.go:293] "Starting Kubelet Volume Manager"
> I0327 19:52:35.058088   20609 desired_state_of_world_populator.go:151] "Desired state populator starts to run"
> E0327 19:52:35.066359   20609 cri_stats_provider.go:455] "Failed to get the info of the filesystem with mountpoint" err="unable to find data in memory cache" mountpoint="/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs"
> E0327 19:52:35.066385   20609 kubelet.go:1386] "Image garbage collection failed once. Stats initialization may not have completed yet" err="invalid capacity 0 on image filesystem"
> I0327 19:52:35.157984   20609 cpu_manager.go:214] "Starting CPU manager" policy="none"
> I0327 19:52:35.158169   20609 cpu_manager.go:215] "Reconciling" reconcilePeriod="10s"
> I0327 19:52:35.158206   20609 state_mem.go:36] "Initialized new in-memory state store"
> I0327 19:52:35.160788   20609 desired_state_of_world_populator.go:159] "Finished populating initial desired state of world"
> I0327 19:52:35.169791   20609 state_mem.go:88] "Updated default CPUSet" cpuSet=""
> I0327 19:52:35.170010   20609 state_mem.go:96] "Updated CPUSet assignments" assignments=map[]
> I0327 19:52:35.170034   20609 policy_none.go:49] "None policy: Start"
> I0327 19:52:35.176881   20609 memory_manager.go:169] "Starting memorymanager" policy="None"
> I0327 19:52:35.177054   20609 state_mem.go:35] "Initializing new in-memory state store"
> I0327 19:52:35.180146   20609 state_mem.go:75] "Updated machine memory state"
> I0327 19:52:35.201730   20609 manager.go:455] "Failed to read data from checkpoint" checkpoint="kubelet_internal_checkpoint" err="checkpoint is not found"
> I0327 19:52:35.203049   20609 plugin_manager.go:118] "Starting Kubelet Plugin Manager"
> I0327 19:52:35.261014   20609 reconciler.go:41] "Reconciler: start to sync state"
> I0327 19:52:35.788856   20609 kubelet_network_linux.go:63] "Initialized iptables rules." protocol=IPv4
> I0327 19:52:36.104318   20609 kubelet_network_linux.go:63] "Initialized iptables rules." protocol=IPv6
> I0327 19:52:36.104388   20609 status_manager.go:172] "Kubernetes client is nil, not starting status manager"
> I0327 19:52:36.104417   20609 kubelet.go:2113] "Starting kubelet main sync loop"
> E0327 19:52:36.105012   20609 kubelet.go:2137] "Skipping pod synchronization" err="PLEG is not healthy: pleg has yet to be successful"

Any ideas? Thanks!

Answer 1

I managed to get the cluster running. The first thing I did was add --container-runtime-endpoint=unix:///run/containerd/containerd.sock directly to ExecStart in the /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf file:

ExecStart=/usr/bin/kubelet --container-runtime-endpoint=unix:///run/containerd/containerd.sock $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS

This made the error go away, but another one appeared:

E0409 18:11:13.071952 113674 run.go:74] "command failed" err="failed to load kubelet config file, error: failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file \"/var/lib/kubelet/config.yaml\", error: open /var/lib/kubelet/config.yaml: no such file or directory, path: /var/lib/kubelet/config.yaml"

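I ignored this and went ahead with kubeadm init. kubeadm init completed successfully, but in the kubelet logs I saw a lot of "connection refused" messages. I guessed that was because kube-apiserver kept restarting. So I decided to reconfigure containerd with its defaults (containerd config default > /etc/containerd/config.toml) and change only SystemdCgroup = true. I restarted containerd and the kubelet, and everything looked fine.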
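The containerd reconfiguration described above can be sketched as a small shell helper. This is a minimal sketch, assuming GNU sed (as on Rocky 9) and that the generated config contains a SystemdCgroup = false line. The function name enable_systemd_cgroup is hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: flip SystemdCgroup from false to true in a containerd
# config file, preserving the line's indentation.
enable_systemd_cgroup() {
    local config_file="$1"
    sed -i 's/^\([[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*\)false/\1true/' "$config_file"
}

# Typical use on a node (run as root):
#   containerd config default > /etc/containerd/config.toml
#   enable_systemd_cgroup /etc/containerd/config.toml
#   systemctl restart containerd kubelet
```

Taking the path as an argument lets the change be checked on a copy of the file first.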

After I applied the CNI, I saw another error (in the weave-init container of the weave pod):

iptables v1.8.3 (legacy): can't initialize iptables table `filter': Table does not exist (do you need to insmod?) Perhaps iptables or your kernel needs to be upgraded.

I loaded the ip_tables kernel module by running the following command:

modprobe ip_tables
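Note that modprobe only loads the module until the next reboot. A minimal sketch for making it persistent, assuming systemd's modules-load.d mechanism (available on Rocky 9), follows. The function name persist_kernel_module is hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: write a modules-load.d entry so systemd loads the
# module at boot. The directory defaults to /etc/modules-load.d.
persist_kernel_module() {
    local module="$1" conf_dir="${2:-/etc/modules-load.d}"
    mkdir -p "$conf_dir"
    printf '%s\n' "$module" > "$conf_dir/$module.conf"
}

# Typical use on a node (run as root):
#   persist_kernel_module ip_tables
```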

Now my cluster is healthy; I hope I have not missed any steps.

Answer 2

The problem is in the containerd configuration.

The default configuration disables CRI entirely, but enabling it alone is not enough.

https://kubernetes.io/docs/setup/production-environment/container-runtimes/#containerd-systemd

This annoying setting needs to be fixed as well:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  ...
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true
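To check whether a given config.toml still has CRI disabled, a small sketch like the following may help. It assumes the packaged config disables the plugin with a line of the form disabled_plugins = ["cri"]; other layouts would need a different pattern. The function name cri_is_disabled is hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeed (exit 0) if the config lists "cri" in
# disabled_plugins, fail (exit 1) otherwise.
cri_is_disabled() {
    local config_file="$1"
    grep -Eq '^[[:space:]]*disabled_plugins[[:space:]]*=.*"cri"' "$config_file"
}

# Typical use on a node:
#   if cri_is_disabled /etc/containerd/config.toml; then
#       echo "CRI plugin is disabled; regenerate or edit the config"
#   fi
```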
