OpenStack 14.04 Juju services mostly failing after upgrade

It has been a few months since I first built my OpenStack cloud, and against my better judgement I ran sudo apt-get update ; sudo apt-get upgrade on my nodes. This was a bad idea.

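After the reboot everything seemed fine and nagios reported no problems with any service, but when I started my instances none of them could get an IP. When I began digging into neutron, I saw a large number of errors in Juju. I don't even know where to start.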

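While the nodes were upgrading, I was asked about some configuration changes, and I chose (N) for every modification. I'm guessing that is the problem?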

landscape@juju-machine-0-lxc-1:~$ juju status --format=tabular
[Services]            
NAME                  STATUS  EXPOSED CHARM                                           
base-machine          error   false   cs:trusty/ubuntu-6                              
ceilometer            active  false   cs:trusty/ceilometer-171                        
ceilometer-agent              false   cs:trusty/ceilometer-agent-167                  
ceph-mon              active  false   cs:~openstack-charmers-next/trusty/ceph-mon-137 
ceph-osd              active  false   cs:trusty/ceph-osd-169                          
ceph-radosgw          active  false   cs:trusty/ceph-radosgw-173                      
cinder                error   false   cs:trusty/cinder-188                            
glance                active  false   cs:trusty/glance-185                            
keystone              active  false   cs:trusty/keystone-253                          
landscape-client              false   cs:trusty/landscape-client-12                   
mongodb               unknown false   cs:trusty/mongodb-35                            
mysql                 active  false   cs:trusty/percona-cluster-178                   
nagios                unknown false   cs:trusty/nagios-10                             
neutron-api           active  false   cs:trusty/neutron-api-177                       
neutron-gateway       error   false   cs:trusty/neutron-gateway-163                   
neutron-openvswitch           false   cs:trusty/neutron-openvswitch-169               
nova-cloud-controller active  false   cs:trusty/nova-cloud-controller-220             
nova-compute          error   false   cs:trusty/nova-compute-190                      
nrpe                          false   cs:trusty/nrpe-7                                
ntp                           false   cs:trusty/ntp-15                                
ntpmaster             unknown false   cs:trusty/ntpmaster-2                           
openstack-dashboard   active  false   cs:trusty/openstack-dashboard-175               
rabbitmq-server       error   false   cs:trusty/rabbitmq-server-43                    

[Units]                 
ID                      WORKLOAD-STATE AGENT-STATE VERSION MACHINE PORTS                                   PUBLIC-ADDRESS MESSAGE                       
base-machine/0          error          idle        1.25.6  0                                               node01.maas    hook failed: "leader-elected" 
  landscape-client/0    unknown        idle        1.25.6                                                  node01.maas                                  
  ntp/0                 unknown        idle        1.25.6                                                  node01.maas                                  
base-machine/1          unknown        idle        1.25.6  2                                               node02.maas                                  
  landscape-client/9    unknown        idle        1.25.6                                                  node02.maas                                  
  ntp/1                 error          idle        1.25.6                                                  node02.maas    hook failed: "leader-elected" 
base-machine/2          unknown        idle        1.25.6  1                                               node03.maas                                  
  landscape-client/10   unknown        idle        1.25.6                                                  node03.maas                                  
  ntp/2                 unknown        idle        1.25.6                                                  node03.maas                                  
ceilometer/0            active         idle        1.25.6  0/lxc/2 8777/tcp                                10.14.0.47     Unit is ready                 
  landscape-client/5    unknown        idle        1.25.6                                                  10.14.0.47                                   
  nrpe/4                unknown        idle        1.25.6                                                  10.14.0.47                                   
ceph-mon/0              active         idle        1.25.6  0/lxc/4                                         10.14.0.53     Unit is ready and clustered   
  landscape-client/2    unknown        idle        1.25.6                                                  10.14.0.53                                   
  nrpe/1                unknown        idle        1.25.6                                                  10.14.0.53                                   
ceph-mon/1              active         idle        1.25.6  2/lxc/4                                         10.14.0.60     Unit is ready and clustered   
  landscape-client/14   unknown        idle        1.25.6                                                  10.14.0.60                                   
  nrpe/10               unknown        idle        1.25.6                                                  10.14.0.60                                   
ceph-mon/2              active         idle        1.25.6  1/lxc/0                                         10.14.0.62     Unit is ready and clustered   
  landscape-client/19   unknown        idle        1.25.6                                                  10.14.0.62                                   
  nrpe/13               unknown        idle        1.25.6                                                  10.14.0.62                                   
ceph-osd/0              active         idle        1.25.6  0                                               node01.maas    Unit is ready (2 OSD)         
  landscape-client/1    unknown        idle        1.25.6                                                  node01.maas                                  
  nrpe/0                unknown        idle        1.25.6                                                  node01.maas                                  
ceph-osd/1              active         idle        1.25.6  2                                               node02.maas    Unit is ready (5 OSD)         
  landscape-client/11   unknown        idle        1.25.6                                                  node02.maas                                  
  nrpe/8                unknown        idle        1.25.6                                                  node02.maas                                  
ceph-osd/2              active         idle        1.25.6  1                                               node03.maas    Unit is ready (5 OSD)         
  landscape-client/12   unknown        idle        1.25.6                                                  node03.maas                                  
  nrpe/9                error          idle        1.25.6                                                  node03.maas    hook failed: "config-changed" 
ceph-radosgw/0          active         idle        1.25.6  2/lxc/0 80/tcp                                  10.14.0.56     Unit is ready                 
  landscape-client/16   unknown        idle        1.25.6                                                  10.14.0.56                                   
cinder/0                error          idle        1.25.6  1/lxc/2                                         10.14.0.64     hook failed: "update-status"  
  landscape-client/22   unknown        idle        1.25.6                                                  10.14.0.64                                   
  nrpe/16               unknown        idle        1.25.6                                                  10.14.0.64                                   
glance/0                active         idle        1.25.6  0/lxc/5 9292/tcp                                10.14.0.54     Unit is ready                 
  landscape-client/4    unknown        idle        1.25.6                                                  10.14.0.54                                   
  nrpe/3                unknown        idle        1.25.6                                                  10.14.0.54                                   
keystone/0              active         idle        1.25.6  2/lxc/2                                         10.14.0.58     Unit is ready                 
  landscape-client/18   unknown        idle        1.25.6                                                  10.14.0.58                                   
  nrpe/12               unknown        idle        1.25.6                                                  10.14.0.58                                   
mongodb/0               unknown        idle        1.25.6  1/lxc/3 27017/tcp,27019/tcp,27021/tcp,28017/tcp 10.14.0.65                                   
  landscape-client/20   unknown        idle        1.25.6                                                  10.14.0.65                                   
  nrpe/14               unknown        idle        1.25.6                                                  10.14.0.65                                   
mysql/0                 active         idle        1.25.6  0/lxc/1                                         10.14.0.50     Unit is ready                 
  landscape-client/7    unknown        idle        1.25.6                                                  10.14.0.50                                   
  nrpe/6                unknown        idle        1.25.6                                                  10.14.0.50                                   
nagios/0                unknown        idle        1.25.6  2/lxc/3 80/tcp                                  10.14.0.59                                   
  landscape-client/15   unknown        idle        1.25.6                                                  10.14.0.59                                   
neutron-api/0           active         idle        1.25.6  1/lxc/4 9696/tcp                                10.14.0.66     Unit is ready                 
  landscape-client/23   unknown        idle        1.25.6                                                  10.14.0.66                                   
  nrpe/17               unknown        idle        1.25.6                                                  10.14.0.66                                   
neutron-gateway/0       error          idle        1.25.6  0                                               node01.maas    hook failed: "config-changed" 
  landscape-client/6    unknown        idle        1.25.6                                                  node01.maas                                  
  nrpe/5                unknown        idle        1.25.6                                                  node01.maas                                  
nova-cloud-controller/0 active         idle        1.25.6  0/lxc/0 3333/tcp,8773/tcp,8774/tcp,9696/tcp     10.14.0.49     Unit is ready                 
  landscape-client/8    unknown        idle        1.25.6                                                  10.14.0.49                                   
  nrpe/7                unknown        idle        1.25.6                                                  10.14.0.49                                   
nova-compute/0          error          idle        1.25.6  2                                               node02.maas    hook failed: "update-status"  
  ceilometer-agent/0    active         idle        1.25.6                                                  node02.maas    Unit is ready                 
  landscape-client/17   unknown        idle        1.25.6                                                  node02.maas                                  
  neutron-openvswitch/0 active         idle        1.25.6                                                  node02.maas    Unit is ready                 
  nrpe/11               unknown        idle        1.25.6                                                  node02.maas                                  
nova-compute/1          error          idle        1.25.6  1                                               node03.maas    hook failed: "update-status"  
  ceilometer-agent/1    active         idle        1.25.6                                                  node03.maas    Unit is ready                 
  landscape-client/21   unknown        idle        1.25.6                                                  node03.maas                                  
  neutron-openvswitch/1 active         idle        1.25.6                                                  node03.maas    Unit is ready                 
  nrpe/15               unknown        idle        1.25.6                                                  node03.maas                                  
ntpmaster/0             unknown        idle        1.25.6  2/lxc/1 123/udp                                 10.14.0.57                                   
  landscape-client/13   unknown        idle        1.25.6                                                  10.14.0.57                                   
openstack-dashboard/0   active         idle        1.25.6  1/lxc/1 80/tcp,443/tcp                          10.14.0.63     Unit is ready                 
  landscape-client/24   unknown        idle        1.25.6                                                  10.14.0.63                                   
  nrpe/18               unknown        idle        1.25.6                                                  10.14.0.63                                   
rabbitmq-server/0       error          idle        1.25.6  0/lxc/3 5672/tcp                                10.14.0.52     hook failed: "update-status"  
  landscape-client/3    unknown        idle        1.25.6                                                  10.14.0.52                                   
  nrpe/2                unknown        idle        1.25.6                                                  10.14.0.52                                   

[Machines] 
ID         STATE   VERSION DNS         INS-ID                                                         SERIES HARDWARE                          
0          started 1.25.6  node01.maas /MAAS/api/1.0/nodes/node-be8673ca-1d31-11e6-a83b-0015c5efa6ff/ trusty arch=amd64 cpu-cores=8 mem=32768M 
1          started 1.25.6  node03.maas /MAAS/api/1.0/nodes/node-b672c22e-1d31-11e6-82b6-0015c5efa6ff/ trusty arch=amd64 cpu-cores=8 mem=32768M 
2          started 1.25.6  node02.maas /MAAS/api/1.0/nodes/node-ba12aac0-1d31-11e6-89e9-0015c5efa6ff/ trusty arch=amd64 cpu-cores=8 mem=32768M 

Answer 1

If the OpenStack cluster goes down and several charm units show an error or blocked state, work through the following steps to bring the cluster back up.

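  1. Make sure the nodes have connectivity to each other and to the internet.
    • Network connectivity is required because hooks are re-run when units are resolved. Most charm hooks run commands such as apt-get update, which need internet access.
    • If juju commands hang, reboot the Juju controller/bootstrap node or restart the juju-* services on that node.
    • If you see any "agent lost" errors, restart the jujud-unit-charm-name-unit service (substituting the charm name and unit number) inside the affected nodes/containers.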
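  2. Resolve the charm units that are in an error state.

    • $ juju resolved --retry charm-name/unit

      This re-runs the hook that originally failed. Resolve the charm units in the following order:

      1. mysql
      2. keystone
      3. rabbitmq-server
      4. ceph
      5. swift
      6. nova-cloud-controller
      7. cinder
      8. glance
      9. neutron-api
      10. neutron-gateway
      11. nova-compute
      12. openstack-dashboard
    • If resolving a unit does not help, check the juju logs to see what the error is, then try to fix it manually. Make sure every unit ends up in the active state.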

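  3. Verify that the cluster is back up

    • Log in to Horizon and check that all services are active.
    • Start all OpenStack instances and make sure volumes and networking are set up correctly.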
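
The ordering in step 2 is easy to get wrong by hand when many units have failed, so here is a minimal sketch of how it could be scripted. It is not part of Juju. The helper names failed_units, rank_of and plan_resolve are hypothetical, and matching a service to the list by prefix (so that ceph-mon and ceph-osd count as "ceph") is my assumption rather than something the steps above state. The script only prints the juju resolved --retry commands and does not run them.

```shell
# Sketch only: helper names are hypothetical, not part of Juju.

# Service order taken from step 2 of the answer.
ORDER="mysql keystone rabbitmq-server ceph swift nova-cloud-controller cinder glance neutron-api neutron-gateway nova-compute openstack-dashboard"

# failed_units: read `juju status --format=tabular` output on stdin and
# print the ID of every unit in the [Units] section whose workload state
# is "error" (second column). Subordinate units are included.
failed_units() {
  awk '/^\[/ { in_units = ($1 == "[Units]"); next }
       in_units && $2 == "error" { print $1 }'
}

# rank_of SERVICE: print the position of SERVICE in ORDER (0-based).
# Assumption: "ceph" also covers ceph-mon, ceph-osd and so on, so a
# service matches an entry if it equals it or starts with "<entry>-".
# Services that are not in the list get rank 99 and are handled last.
rank_of() {
  local svc=$1 i=0 prefix
  for prefix in $ORDER; do
    case "$svc" in
      "$prefix"|"$prefix"-*) echo "$i"; return ;;
    esac
    i=$((i + 1))
  done
  echo 99
}

# plan_resolve UNIT...: print one `juju resolved --retry` command per
# unit, sorted by the rank of the unit's service. The sort is stable,
# so units of the same service keep the order they were given in.
plan_resolve() {
  local unit rank
  for unit in "$@"; do
    # A unit ID is "<service>/<number>", for example cinder/0.
    printf '%s %s\n' "$(rank_of "${unit%%/*}")" "$unit"
  done | sort -s -n -k1,1 | while read -r rank unit; do
    echo "juju resolved --retry $unit"
  done
}
```

To use it, one could run plan_resolve $(juju status --format=tabular | failed_units), review the printed commands, and then run them one at a time, checking juju status between steps. Printing instead of executing is deliberate: a retried hook can fail again, and it is easier to stop and read the logs when each command is issued by hand.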
