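I am facing a kubelet error on a k8s cluster running CentOS 7. The error appeared after I recently rebooted the cluster nodes. I have rebooted these machines before without running into anything like this.
I tried running swapoff -a to disable swap, but that did not solve the problem.
Here is the output of systemctl status kubelet -l: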
[root@test-master ~]# systemctl status kubelet -l
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: activating (auto-restart) (Result: exit-code) since Mon 2022-05-30 13:59:51 +08; 822ms ago
Docs: https://kubernetes.io/docs/
Process: 9325 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=1/FAILURE)
Main PID: 9325 (code=exited, status=1/FAILURE)
May 30 13:59:51 test-master kubelet[9325]: Insecure values: TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256, TLS_ECDHE_ECDSA_WITH_RC4_128_SHA, TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256, TLS_ECDHE_RSA_WITH_RC4_128_SHA, TLS_RSA_WITH_3DES_EDE_CBC_SHA, TLS_RSA_WITH_AES_128_CBC_SHA256, TLS_RSA_WITH_RC4_128_SHA. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.)
May 30 13:59:51 test-master kubelet[9325]: --tls-min-version string Minimum TLS version supported. Possible values: VersionTLS10, VersionTLS11, VersionTLS12, VersionTLS13 (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.)
May 30 13:59:51 test-master kubelet[9325]: --tls-private-key-file string File containing x509 private key matching --tls-cert-file. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.)
May 30 13:59:51 test-master kubelet[9325]: --topology-manager-policy string Topology Manager policy to use. Possible values: 'none', 'best-effort', 'restricted', 'single-numa-node'. (default "none") (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.)
May 30 13:59:51 test-master kubelet[9325]: --topology-manager-scope string Scope to which topology hints applied. Topology Manager collects hints from Hint Providers and applies them to defined scope to ensure the pod admission. Possible values: 'container', 'pod'. (default "container") (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.)
May 30 13:59:51 test-master kubelet[9325]: -v, --v Level number for the log level verbosity
May 30 13:59:51 test-master kubelet[9325]: --version version[=true] Print version information and quit
May 30 13:59:51 test-master kubelet[9325]: --vmodule pattern=N,... comma-separated list of pattern=N settings for file-filtered logging (only works for text log format)
May 30 13:59:51 test-master kubelet[9325]: --volume-plugin-dir string The full path of the directory in which to search for additional third party volume plugins (default "/usr/libexec/kubernetes/kubelet-plugins/volume/exec/") (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.)
May 30 13:59:51 test-master kubelet[9325]: --volume-stats-agg-period duration Specifies interval for kubelet to calculate and cache the volume disk usage for all pods and volumes. To disable volume calculations, set to a negative number. (default 1m0s) (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.)
Here is the content of /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf, the drop-in mentioned in the status output above:
[root@test-master ~]# cat /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
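I also tried adding Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true --fail-swap-on=false" to the file mentioned here, then ran systemctl daemon-reload and systemctl restart kubelet, but it did not help.
Unfortunately, I am not sure of the server version, but it should be the same as my client version [v1.23.3].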
[root@test-master ~]# kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.3", GitCommit:"816c97ab8cff8a1c72eccca1026f7820e93e0d25", GitTreeState:"clean", BuildDate:"2022-01-25T21:25:17Z", GoVersion:"go1.17.6", Compiler:"gc", Platform:"linux/amd64"}
The connection to the server 10.17.98.171:6443 was refused - did you specify the right host or port?
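Is there any way to rescue my cluster without resetting the whole thing? I would like to be able to access the deployments that were running in the cluster before.
Update:
I searched the logs for an error message with journalctl -fu kubelet, and this is the closest thing I could find.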
May 31 08:58:12 test-master systemd[1]: kubelet.service holdoff time over, scheduling restart.
May 31 08:58:12 test-master systemd[1]: Stopped kubelet: The Kubernetes Node Agent.
May 31 08:58:12 test-master systemd[1]: Started kubelet: The Kubernetes Node Agent.
May 31 08:58:12 test-master kubelet[5280]: Error: failed to parse kubelet flag: unknown flag: --network-plugin
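The single useful line above is easy to miss, because a kubelet that fails to parse its flags also prints its whole flag reference into the journal. A minimal sketch of a filter for that case follows. The function name kubelet_errors is hypothetical, and the pattern assumes the real messages either start with "Error:" or use the klog E/F severity prefix.

```shell
# Keep only kubelet journal lines that look like real errors.
# Reads journal text on stdin; the flag usage dump is dropped.
# Note: exits non-zero when nothing matches (standard grep behaviour).
kubelet_errors() {
  grep -E 'kubelet\[[0-9]+\]: (Error:|[EF][0-9]{4} )'
}
```

It could be used as: journalctl -u kubelet --no-pager -n 500 | kubelet_errors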
Also, this is my kubelet version: Kubernetes v1.24.1.
It seems to be related to this GitHub issue: https://github.com/kubernetes/website/issues/33640.
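The --network-plugin flag was removed from the kubelet in v1.24, so a v1.24 kubelet exits as soon as it is passed. On kubeadm nodes the flag usually comes from /var/lib/kubelet/kubeadm-flags.env, one of the two environment files the drop-in above reads. That location is an assumption here: the file's contents are not shown in this post, and the flag could equally be set in /etc/sysconfig/kubelet. The sketch below removes the flag from such a file after saving a backup. The function name strip_network_plugin is hypothetical.

```shell
# Remove a "--network-plugin=<value>" (or "--network-plugin <value>") flag
# from a kubelet environment file. The original is saved as <file>.bak first.
strip_network_plugin() {
  file=$1
  cp -p "$file" "$file.bak" || return 1
  # Read from the backup and rewrite the original without the removed flag.
  sed -E 's/--network-plugin(=| +)[^" ]+ ?//' "$file.bak" > "$file"
}
```

A possible invocation would be strip_network_plugin /var/lib/kubelet/kubeadm-flags.env followed by systemctl restart kubelet. This only gets past the flag-parsing error. If the node still uses Docker as its container runtime, the kubelet will need a supported CRI runtime as well.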
Answer 1
If you are using containerd as the CRI runtime, make sure cri is not included in the disabled_plugins list in /etc/containerd/config.toml.
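A quick way to check this is sketched below. The function name cri_disabled is hypothetical, and the pattern assumes disabled_plugins is written on a single line, as in disabled_plugins = ["cri"]. A list spread over several lines would not be detected.

```shell
# Succeeds (exit 0) if "cri" appears in a single-line disabled_plugins
# entry of the given containerd config, fails otherwise.
cri_disabled() {
  file=${1:-/etc/containerd/config.toml}
  grep -Eq '^[[:space:]]*disabled_plugins[[:space:]]*=.*"cri"' "$file"
}
```

For example: cri_disabled && echo "cri is disabled in containerd". If it is, remove "cri" from the list and restart containerd with systemctl restart containerd.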
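Answer 2
The problem was that I had been using Docker as the container runtime before Kubernetes v1.24. As of v1.24, dockershim has been removed from the kubelet (it had been deprecated since v1.20), so Docker Engine no longer works as a container runtime out of the box. After I switched the container runtime to containerd, everything worked.
Here is how I solved the problem: https://kubernetes.io/docs/tasks/administer-cluster/migrating-from-dockershim/change-runtime-containerd/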
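Answer 3
I have a similar problem here: another case of the connection to port 6443 being refused.
I also tried running swapoff -a and stopping and restarting kubelet. I checked the network, the Kubernetes config files, and so on.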
#systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: activating (auto-restart) (Result: exit-code) since Wed 2023-05-24 13:07:50 UTC; 7s ago
Docs: https://kubernetes.io/docs/home/
Process: 111682 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=1/FAILURE)
Main PID: 111682 (code=exited, status=1/FAILURE)
#kubectl version --short
Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
Client Version: v1.27.1
Kustomize Version: v5.0.1
The connection to the server k8s-master.dcs2.local:6443 was refused - did you specify the right host or port?