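I am trying to install a Kubernetes cluster v1.26 (3 nodes, Rocky 9) with kubeadm, but I ran into a problem with the kubelet. I followed this step-by-step tutorial in parallel with the official Kubernetes cluster installation guide.
After installing the kubelet, its status is:
systemctl status kubelet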
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: activating (auto-restart) (Result: exit-code) since Mon 2023-03-27 19:45:05 EEST; 8s ago
Docs: https://kubernetes.io/docs/
Process: 17757 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=1/>
Main PID: 17757 (code=exited, status=1/FAILURE)
CPU: 379ms
Mar 27 19:45:05 node01 systemd[1]: kubelet.service: Failed with result 'exit-code'.
In journalctl I see the following messages:
> Mar 27 19:34:40 node01 kubelet[13832]: E0327 19:34:40.638950 13832 run.go:74] "command failed" err="failed to validate kubelet flags: the container runtime endpoint address was not specified or empty, use --container-runtime-endpoint>
Mar 27 19:34:40 node01 systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
I also added the following to the kubelet service:
> Environment="KUBELET_EXTRA_ARGS=--container-runtime-endpoint=unix:///run/containerd/containerd.sock"
kubelet --container-runtime-endpoint=unix:///run/containerd/containerd.sock
I0327 19:52:34.967346 20609 server.go:412] "Kubelet version" kubeletVersion="v1.26.3"
> I0327 19:52:34.967701 20609 server.go:414] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
> I0327 19:52:34.968911 20609 server.go:575] "Standalone mode, no API client"
> I0327 19:52:34.987049 20609 server.go:463] "No api server defined - no events will be sent to API server"
> I0327 19:52:34.987281 20609 server.go:659] "--cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /"
> I0327 19:52:34.989143 20609 container_manager_linux.go:267] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
> I0327 19:52:34.989474 20609 container_manager_linux.go:272] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: KubeletOOMScoreAdj:-999 ContainerRuntime: CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[]} QOSReserved:map[] CPUManagerPolicy:none CPUManagerPolicyOptions:map[] ExperimentalTopologyManagerScope:container CPUManagerReconcilePeriod:10s ExperimentalMemoryManagerPolicy:None ExperimentalMemoryManagerReservedMemory:[] ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms ExperimentalTopologyManagerPolicy:none ExperimentalTopologyManagerPolicyOptions:map[]}
> I0327 19:52:34.989545 20609 topology_manager.go:134] "Creating topology manager with policy per scope" topologyPolicyName="none" topologyScopeName="container"
> I0327 19:52:34.989583 20609 container_manager_linux.go:308] "Creating device plugin manager"
> I0327 19:52:34.990346 20609 state_mem.go:36] "Initialized new in-memory state store"
> I0327 19:52:35.028186 20609 kubelet.go:404] "Kubelet is running in standalone mode, will skip API server sync"
> I0327 19:52:35.034468 20609 kuberuntime_manager.go:244] "Container runtime initialized" containerRuntime="containerd" version="1.6.19" apiVersion="v1"
> I0327 19:52:35.036405 20609 volume_host.go:75] "KubeClient is nil. Skip initialization of CSIDriverLister"
> W0327 19:52:35.037048 20609 csi_plugin.go:189] kubernetes.io/csi: kubeclient not set, assuming standalone kubelet
> W0327 19:52:35.037060 20609 csi_plugin.go:266] Skipping CSINode initialization, kubelet running in standalone mode
> I0327 19:52:35.039128 20609 server.go:1186] "Started kubelet"
> I0327 19:52:35.040285 20609 kubelet.go:1502] "No API server defined - no node status update will be sent"
> I0327 19:52:35.040699 20609 server.go:161] "Starting to listen" address="0.0.0.0" port=10250
> I0327 19:52:35.049705 20609 server.go:451] "Adding debug handlers to kubelet server"
> I0327 19:52:35.042056 20609 fs_resource_analyzer.go:67] "Starting FS ResourceAnalyzer"
> I0327 19:52:35.043346 20609 server.go:193] "Starting to listen read-only" address="0.0.0.0" port=10255
> I0327 19:52:35.058042 20609 volume_manager.go:293] "Starting Kubelet Volume Manager"
> I0327 19:52:35.058088 20609 desired_state_of_world_populator.go:151] "Desired state populator starts to run"
> E0327 19:52:35.066359 20609 cri_stats_provider.go:455] "Failed to get the info of the filesystem with mountpoint" err="unable to find data in memory cache" mountpoint="/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs"
> E0327 19:52:35.066385 20609 kubelet.go:1386] "Image garbage collection failed once. Stats initialization may not have completed yet" err="invalid capacity 0 on image filesystem"
> I0327 19:52:35.157984 20609 cpu_manager.go:214] "Starting CPU manager" policy="none"
> I0327 19:52:35.158169 20609 cpu_manager.go:215] "Reconciling" reconcilePeriod="10s"
> I0327 19:52:35.158206 20609 state_mem.go:36] "Initialized new in-memory state store"
> I0327 19:52:35.160788 20609 desired_state_of_world_populator.go:159] "Finished populating initial desired state of world"
> I0327 19:52:35.169791 20609 state_mem.go:88] "Updated default CPUSet" cpuSet=""
> I0327 19:52:35.170010 20609 state_mem.go:96] "Updated CPUSet assignments" assignments=map[]
> I0327 19:52:35.170034 20609 policy_none.go:49] "None policy: Start"
> I0327 19:52:35.176881 20609 memory_manager.go:169] "Starting memorymanager" policy="None"
> I0327 19:52:35.177054 20609 state_mem.go:35] "Initializing new in-memory state store"
> I0327 19:52:35.180146 20609 state_mem.go:75] "Updated machine memory state"
> I0327 19:52:35.201730 20609 manager.go:455] "Failed to read data from checkpoint" checkpoint="kubelet_internal_checkpoint" err="checkpoint is not found"
> I0327 19:52:35.203049 20609 plugin_manager.go:118] "Starting Kubelet Plugin Manager"
> I0327 19:52:35.261014 20609 reconciler.go:41] "Reconciler: start to sync state"
> I0327 19:52:35.788856 20609 kubelet_network_linux.go:63] "Initialized iptables rules." protocol=IPv4
> I0327 19:52:36.104318 20609 kubelet_network_linux.go:63] "Initialized iptables rules." protocol=IPv6
> I0327 19:52:36.104388 20609 status_manager.go:172] "Kubernetes client is nil, not starting status manager"
> I0327 19:52:36.104417 20609 kubelet.go:2113] "Starting kubelet main sync loop"
> E0327 19:52:36.105012 20609 kubelet.go:2137] "Skipping pod synchronization" err="PLEG is not healthy: pleg has yet to be successful"
Any ideas? Thanks!
Answer 1
I managed to get the cluster started. The first thing I did was put --container-runtime-endpoint=unix:///run/containerd/containerd.sock
directly into ExecStart in the /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf file.
ExecStart=/usr/bin/kubelet --container-runtime-endpoint=unix:///run/containerd/containerd.sock $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
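A possible alternative to editing the packaged drop-in (which a package update may overwrite) is to set KUBELET_EXTRA_ARGS in an environment file. This is only a sketch: it assumes that 10-kubeadm.conf loads /etc/sysconfig/kubelet through an EnvironmentFile= line, which is worth confirming in your own copy of the file. If that file already contains an empty KUBELET_EXTRA_ARGS= line, it may also explain why an Environment= line had no effect, since systemd lets EnvironmentFile= values override Environment=. The function name set_kubelet_extra_args is hypothetical.

```shell
#!/bin/sh
# Sketch: write KUBELET_EXTRA_ARGS into an environment file.
# set_kubelet_extra_args is an illustrative helper, not part of kubeadm.
set_kubelet_extra_args() {
    env_file=$1    # assumed: /etc/sysconfig/kubelet on RPM-based systems
    extra_args=$2  # e.g. --container-runtime-endpoint=unix:///run/containerd/containerd.sock
    tmp_file=$(mktemp) || return 1
    # Keep every existing line except a previous KUBELET_EXTRA_ARGS assignment.
    if [ -f "$env_file" ]; then
        grep -v '^KUBELET_EXTRA_ARGS=' "$env_file" > "$tmp_file" || true
    fi
    # Append the new assignment.
    printf 'KUBELET_EXTRA_ARGS=%s\n' "$extra_args" >> "$tmp_file"
    # Overwrite in place so the file keeps its owner and mode.
    cat "$tmp_file" > "$env_file"
    status=$?
    rm -f "$tmp_file"
    return $status
}
```

As root, this could be called with /etc/sysconfig/kubelet as the first argument and the flag as the second, followed by systemctl daemon-reload and systemctl restart kubelet.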
Editing ExecStart made the original error go away, but another one appeared:
E0409 18:11:13.071952 113674 run.go:74] "command failed" err="failed to load kubelet config file, error: failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file \"/var/lib/kubelet/config.yaml\", error: open /var/lib/kubelet/config.yaml: no such file or directory, path: /var/lib/kubelet/config.yaml"
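I ignored this error and went ahead with kubeadm init. kubeadm init completed successfully, but in the kubelet logs I saw a lot of "connection refused" errors. I guessed that was because kube-apiserver kept restarting. So I decided to regenerate the containerd configuration with the defaults:
containerd config default > /etc/containerd/config.toml
and changed only SystemdCgroup = true. I restarted containerd and kubelet, and everything looked fine.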
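The SystemdCgroup change described above can be scripted rather than made by hand. The following is a minimal sketch, assuming the generated config.toml contains a line of the form SystemdCgroup = false (as the containerd 1.6 default output does); the function name enable_systemd_cgroup is hypothetical.

```shell
#!/bin/sh
# Sketch: flip SystemdCgroup from false to true in a containerd config file.
# enable_systemd_cgroup is an illustrative helper, not a containerd command.
enable_systemd_cgroup() {
    config_file=$1  # e.g. /etc/containerd/config.toml
    tmp_file=$(mktemp) || return 1
    # Preserve the original indentation and spacing; change only the value.
    sed 's/^\([[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*\)false/\1true/' \
        "$config_file" > "$tmp_file" && cat "$tmp_file" > "$config_file"
    status=$?
    rm -f "$tmp_file"
    return $status
}
```

After running it against /etc/containerd/config.toml as root, containerd and the kubelet still need to be restarted (systemctl restart containerd kubelet) for the change to take effect.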
After I applied the CNI, I saw another error (in the weave-init container of the weave pod):
iptables v1.8.3 (legacy): can't initialize iptables table `filter': Table does not exist (do you need to insmod?) Perhaps iptables or your kernel needs to be upgraded.
I reloaded the ip_tables kernel module by running the following command:
modprobe ip_tables
Now my cluster is working; I hope I have not missed any steps.
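Note that modprobe only loads the module until the next reboot. One way to make it persistent on a systemd-based distribution is a file under /etc/modules-load.d, which systemd-modules-load.service reads at boot. A minimal sketch, where the function name persist_kernel_module and the optional directory argument are illustrative:

```shell
#!/bin/sh
# Sketch: register a kernel module to be loaded at boot via modules-load.d.
# persist_kernel_module is an illustrative helper name.
persist_kernel_module() {
    module=$1                           # e.g. ip_tables
    conf_dir=${2:-/etc/modules-load.d}  # overridable, mainly for testing
    # Each .conf file in this directory lists module names, one per line.
    printf '%s\n' "$module" > "$conf_dir/$module.conf"
}
```

Calling persist_kernel_module ip_tables as root would create /etc/modules-load.d/ip_tables.conf containing the single line ip_tables.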
Answer 2
The problem is in the containerd configuration.
The default configuration has cri disabled entirely, but merely enabling it is not enough.
https://kubernetes.io/docs/setup/production-environment/container-runtimes/#containerd-systemd
This annoying setting also needs to be fixed:
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
...
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
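To check whether a given config.toml still has the CRI plugin disabled, a quick test along these lines may help. It is a sketch that assumes the setting appears on a single line such as disabled_plugins = ["cri"], which is how some packaged containerd configurations ship it; the function name cri_plugin_disabled is hypothetical.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) if "cri" is listed in disabled_plugins.
# Assumes the list is written on one line; cri_plugin_disabled is illustrative.
cri_plugin_disabled() {
    config_file=$1  # e.g. /etc/containerd/config.toml
    grep -Eq '^[[:space:]]*disabled_plugins[[:space:]]*=.*"cri"' "$config_file"
}
```

For example, cri_plugin_disabled /etc/containerd/config.toml && echo "CRI is disabled" reports when the plugin is still disabled; in that case, regenerate or edit the file so that "cri" is no longer listed, then set SystemdCgroup = true as shown above.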