juju Kubernetes Core install does not start after rebooting Ubuntu Server 16.04.2 LTS

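I am trying to set up a Kubernetes system on a server (48 cores and 65 GB of RAM). I thought conjure-up/juju was the way to go. It installs the services and starts them, but when I reboot the server, only some of the services start again and the others stay in a waiting state.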

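Why do the services fail to start? Each time I reboot, the Message column shows a few different messages. The dump below was taken after the server had been up for at least 12 hours, so it does not appear to fix itself.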

What am I doing wrong?

$ juju status
Model                        Controller                Cloud/Region         Version
conjure-kubernetes-core-da5  conjure-up-localhost-989  localhost/localhost  2.1.3

App                Version  Status   Scale  Charm              Store       Rev  OS      Notes
easyrsa            3.0.1    active       1  easyrsa            jujucharms    9  ubuntu  
etcd               2.3.8    active       1  etcd               jujucharms   34  ubuntu  
flannel            0.7.0    waiting      2  flannel            jujucharms   15  ubuntu  
kubernetes-master  1.6.2    waiting      1  kubernetes-master  jujucharms   19  ubuntu  exposed
kubernetes-worker  1.6.2    active       1  kubernetes-worker  jujucharms   23  ubuntu  exposed

Unit                  Workload  Agent  Machine  Public address  Ports           Message
easyrsa/0*            active    idle   0        10.0.8.11                       Certificate Authority connected.
etcd/0*               active    idle   1        10.0.8.69       2379/tcp        Errored with 0 known peers
kubernetes-master/0*  waiting   idle   2        10.0.8.131      6443/tcp        Waiting to retry addon deployment
  flannel/0           waiting   idle            10.0.8.131                      Waiting for Flannel
kubernetes-worker/0*  active    idle   3        10.0.8.115      80/tcp,443/tcp  Kubernetes worker running.
  flannel/1*          waiting   idle            10.0.8.115                      Waiting for Flannel

Machine  State    DNS         Inst id        Series  AZ
0        started  10.0.8.11   juju-36585e-0  xenial  
1        started  10.0.8.69   juju-36585e-1  xenial  
2        started  10.0.8.131  juju-36585e-2  xenial  
3        started  10.0.8.115  juju-36585e-3  xenial  

Relation      Provides           Consumes           Type
certificates  easyrsa            etcd               regular
certificates  easyrsa            kubernetes-master  regular
certificates  easyrsa            kubernetes-worker  regular
cluster       etcd               etcd               peer
etcd          etcd               flannel            regular
etcd          etcd               kubernetes-master  regular
cni           flannel            kubernetes-master  regular
cni           flannel            kubernetes-worker  regular
cni           kubernetes-master  flannel            subordinate
kube-control  kubernetes-master  kubernetes-worker  regular
cni           kubernetes-worker  flannel            subordinate

Answer 1

It appears that etcd crashes on startup and that snapd fails to load some dependencies. This GitHub issue suggests the following:

juju run --application etcd 'service snap.etcd.etcd restart'
juju run --application kubernetes-master 'service snap.kube-apiserver.daemon restart'
juju run --application kubernetes-master 'service snap.kube-controller-manager.daemon restart'
juju run --application kubernetes-master 'service snap.kube-scheduler.daemon restart'
juju run --application kubernetes-worker 'service snap.kubelet.daemon restart'
juju run --application kubernetes-worker 'service snap.kube-proxy.daemon restart'

This worked for me with both the basic Kubernetes charm and the "Canonical" variant.
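Since these restarts have to be repeated after every reboot, they could be wrapped in a small script. The following is only a sketch: the application and service names are taken from the commands above, while `DRY_RUN`, `restart_service`, and `restart_all` are hypothetical names introduced here for illustration. By default it only prints the commands, so it can be checked before anything is run against the cluster.

```shell
#!/bin/sh
# Sketch: restart the snap services in the same order as the commands above
# (etcd first, then the master services, then the worker services).
# DRY_RUN is a hypothetical switch: 1 (the default) prints the commands,
# 0 actually runs them through juju.
DRY_RUN="${DRY_RUN:-1}"

# restart_service APP SERVICE
# Restarts one service on every unit of the given juju application.
restart_service() {
    app="$1"
    svc="$2"
    if [ "$DRY_RUN" = "1" ]; then
        echo "juju run --application $app 'service $svc restart'"
    else
        juju run --application "$app" "service $svc restart"
    fi
}

# Runs all six restarts in order.
restart_all() {
    restart_service etcd snap.etcd.etcd
    restart_service kubernetes-master snap.kube-apiserver.daemon
    restart_service kubernetes-master snap.kube-controller-manager.daemon
    restart_service kubernetes-master snap.kube-scheduler.daemon
    restart_service kubernetes-worker snap.kubelet.daemon
    restart_service kubernetes-worker snap.kube-proxy.daemon
}

restart_all
```

Running it with `DRY_RUN=0` executes the restarts; `juju status` afterwards should show whether flannel and kubernetes-master have left the waiting state.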
