Installing Kubernetes (with Docker) on Ubuntu 18.04 LTS - init fails

I am trying to install Kubernetes on a VM running Ubuntu 18.04 LTS and have hit a problem when initializing the system: the kubeadm init command fails (full log below).

VM: 2 CPUs, 512 MB RAM, 100 GB disk, running under VMware ESXi 6.

OS: Ubuntu 18.04 LTS server install, fully updated via apt update and apt upgrade before starting the Docker and Kubernetes installation.

Docker was installed per the instructions here, and the installation completed without errors: https://kubernetes.io/docs/setup/production-environment/container-runtimes/#docker

Kubernetes was installed per the instructions here, except for the Docker part (following those instructions produces a preflight error about systemd/cgroupfs): https://vitux.com/install-and-deploy-kubernetes-on-ubuntu/

The whole installation appeared to go smoothly with no errors reported, but the attempt to start Kubernetes fails, as shown in the log below.

I am completely new to both Docker and Kubernetes; although I understand the main concepts and have tried the online tutorials at kubernetes.io, I cannot progress further until I can get a working system installed. When kubeadm tries to start the cluster, everything hangs for four minutes and then exits with a timeout, as shown below.

root@k8s-master-dev:~# sudo kubeadm init --pod-network-cidr=10.244.0.0/16
[init] Using Kubernetes version: v1.15.3
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Activating the kubelet service
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [k8s-master-dev kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 10.24.0.100]
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [k8s-master-dev localhost] and IPs [10.24.0.100 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [k8s-master-dev localhost] and IPs [10.24.0.100 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[kubelet-check] Initial timeout of 40s passed.

Unfortunately, an error has occurred:
        timed out waiting for the condition

This error is likely caused by:
        - The kubelet is not running
        - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)

If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
        - 'systemctl status kubelet'
        - 'journalctl -xeu kubelet'

Additionally, a control plane component may have crashed or exited when started by the container runtime.
To troubleshoot, list all containers using your preferred container runtimes CLI, e.g. docker.
Here is one example how you may list all Kubernetes containers running in docker:
        - 'docker ps -a | grep kube | grep -v pause'
        Once you have found the failing container, you can inspect its logs with:
        - 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster

I have looked through the journal data and the Docker logs, but aside from a great many timeouts I cannot find anything that explains the actual error. Can anyone suggest where I should look, and what the most likely cause is?
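For reference, the troubleshooting commands kubeadm itself suggests in the output above can be gathered into a single sweep. This is only a sketch: the function name is mine, and `--no-pager` / `tail` are just there to keep the output readable when run non-interactively.

```shell
# Sketch: collect the kubelet/Docker diagnostics kubeadm points at.
# Wrapped in a function so the sequence can be reviewed before running it
# as root on the node.
collect_diagnostics() {
    systemctl status kubelet --no-pager                 # is the kubelet even running?
    journalctl -xeu kubelet --no-pager | tail -n 100    # most recent kubelet errors
    docker ps -a | grep kube | grep -v pause            # control-plane containers and their states
    # for any container shown as Exited above, inspect its logs with:
    #   docker logs <CONTAINERID>
}
```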

Already tried: deleting all iptables rules and setting the default policies to ACCEPT. Running the install with the Docker setup from the vitux.com instructions (gives preflight warnings but no errors, yet the Kubernetes init times out in exactly the same way).
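The iptables step described above can be sketched as follows (assuming the default legacy iptables backend; the function name is mine):

```shell
# Sketch of the "delete all rules, default ACCEPT" step tried above.
# Wrapped in a function so it can be reviewed before running as root.
open_iptables() {
    iptables -P INPUT ACCEPT      # default policies: accept everything
    iptables -P FORWARD ACCEPT
    iptables -P OUTPUT ACCEPT
    iptables -F                   # flush all rules in the filter table
    iptables -t nat -F            # flush the nat table too
    iptables -X                   # delete any user-defined chains
}
```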

Update: per @Crou's comment, here is what happens if I retry kubeadm init as root:

root@k8s-master-dev:~# uptime
 16:34:49 up  7:23,  3 users,  load average: 10.55, 16.77, 19.31
root@k8s-master-dev:~# kubeadm init
[init] Using Kubernetes version: v1.15.3
[preflight] Running pre-flight checks
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR Port-6443]: Port 6443 is in use
        [ERROR Port-10251]: Port 10251 is in use
        [ERROR Port-10252]: Port 10252 is in use
        [ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
        [ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
        [ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
        [ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
        [ERROR Port-10250]: Port 10250 is in use
        [ERROR Port-2379]: Port 2379 is in use
        [ERROR Port-2380]: Port 2380 is in use
        [ERROR DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`

Regarding the very high load shown by uptime: the load ramps up the moment the first init attempt starts, and it stays very high unless a kubeadm reset is done to clear everything out.
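A sketch of that reset step, for completeness: `kubeadm reset` removes the very files and frees the ports that the second preflight run complained about, after which init can be attempted again.

```shell
# Sketch: clear out the half-started control plane so `kubeadm init`
# can be retried from a clean state. Run as root; -f skips confirmation.
reset_node() {
    kubeadm reset -f          # stops kube containers, clears /etc/kubernetes/manifests and /var/lib/etcd
    systemctl restart kubelet # give the kubelet a clean start before the next init attempt
}
```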

Answer 1

Further experimentation: besides the VMware ESXi hypervisor, we also run the XenServer hypervisor on the same hardware platform. I attempted the same installation in a VM on one of the Xen blades, but it turned out Ubuntu could not be installed at all: the installer failed at the "installing kernel" stage. I tried two separate installs and both failed at the same point. The VM specs were identical to those under ESXi: 2 cores, 512 MB RAM, 100 GB disk.

Solution: we eventually got around the problem by abandoning VMs altogether and installing Ubuntu 18.04 directly on the hardware, with no hypervisor or VM involved, then adding Docker and Kubernetes. This time the kubeadm init command completed correctly with the expected messages. The blade we installed on has two six-core Xeon processors, 48 GB of RAM, and a 1 TB disk.
