Juju deployment of RabbitMQ on LXD stuck in pending state

I am new to Juju and LXD. I am trying to install RabbitMQ on LXD using Juju. LXD is installed and its containers run as expected.

$ juju --version
2.9.42-ubuntu-amd64

I followed these steps.

Bootstrap the controller

juju bootstrap localhost rabbitmq

After this, lxc list shows one container in LXD.

Create the model

juju add-model messaging

Deploy RabbitMQ

$ juju deploy rabbitmq-server --debug -n3 --config min-cluster-size=3 rabbitmq
20:11:56 INFO  juju.cmd supercommand.go:56 running juju [2.9.42 7b871e782195bdac9c90f8a8f01723cc3e08ab92 gc go1.18.10]
20:11:56 DEBUG juju.cmd supercommand.go:57   args: []string{"/snap/juju/22345/bin/juju", "deploy", "rabbitmq-server", "--debug", "-n3", "--config", "min-cluster-size=3", "rabbitmq"}
20:11:56 INFO  juju.juju api.go:86 connecting to API addresses: [10.37.130.2:17070]
20:11:56 DEBUG juju.api apiclient.go:1152 successfully dialed "wss://10.37.130.2:17070/api"
20:11:56 INFO  juju.api apiclient.go:687 connection established to "wss://10.37.130.2:17070/api"
20:11:56 INFO  juju.juju api.go:86 connecting to API addresses: [10.37.130.2:17070]
20:11:56 DEBUG juju.api apiclient.go:1152 successfully dialed "wss://10.37.130.2:17070/model/b3ce1455-e650-461d-8a69-6496b6ce573e/api"
20:11:56 INFO  juju.api apiclient.go:687 connection established to "wss://10.37.130.2:17070/model/b3ce1455-e650-461d-8a69-6496b6ce573e/api"
20:11:56 DEBUG juju.cmd.juju.application.deployer deployer.go:396 cannot interpret as local charm: file does not exist
20:11:56 DEBUG juju.cmd.juju.application.deployer deployer.go:208 cannot interpret as a redeployment of a local charm from the controller
20:12:00 DEBUG juju.cmd.juju.application.store charmadapter.go:142 cannot interpret as charmstore bundle: xenial (series) != "bundle"
20:12:00 INFO  cmd charm.go:452 Preparing to deploy "rabbitmq-server" from the charmhub
20:12:15 INFO  cmd charm.go:550 Located charm "rabbitmq-server" in charm-hub, revision 123
20:12:15 INFO  cmd charm.go:236 Deploying "rabbitmq" from charm-hub charm "rabbitmq-server", revision 123 in channel stable on xenial
20:12:16 DEBUG juju.api monitor.go:35 RPC connection died
20:12:16 DEBUG juju.api monitor.go:35 RPC connection died
20:12:16 INFO  cmd supercommand.go:544 command finished

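Status of the LXD containers

The 3 new containers have no IP address. When they first appear in lxc list they are in the STOPPED state; after some time they move to RUNNING.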

$ lxc list
+---------------+---------+--------------------+------+-----------+-----------+
|     NAME      |  STATE  |        IPV4        | IPV6 |   TYPE    | SNAPSHOTS |
+---------------+---------+--------------------+------+-----------+-----------+
| juju-ce573e-0 | RUNNING |                    |      | CONTAINER | 0         |
+---------------+---------+--------------------+------+-----------+-----------+
| juju-ce573e-1 | RUNNING |                    |      | CONTAINER | 0         |
+---------------+---------+--------------------+------+-----------+-----------+
| juju-ce573e-2 | RUNNING |                    |      | CONTAINER | 0         |
+---------------+---------+--------------------+------+-----------+-----------+
| juju-dac435-0 | RUNNING | 10.37.130.2 (eth0) |      | CONTAINER | 0         |
+---------------+---------+--------------------+------+-----------+-----------+
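To pick out the affected containers without reading the table by eye, something like the sketch below could be used. It assumes the CSV output of lxc list with only the name and IPv4 columns, with at most one address per container. The function name containers_without_ipv4 is only illustrative.

```shell
#!/bin/sh
# Print the names of containers whose IPv4 column is empty.
# Reads "name,ipv4" CSV lines on stdin (assumed format, one address at most).
containers_without_ipv4() {
    awk -F, '$2 == "" { print $1 }'
}

# Usage against a live LXD host (not executed here):
#   lxc list --format csv -c n4 | containers_without_ipv4
```

With the listing above, this should name the three juju-ce573e-* containers and skip the controller container.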

Juju status

$ juju status
Model      Controller  Cloud/Region         Version  SLA          Timestamp
messaging  rabbit1     localhost/localhost  2.9.42   unsupported  20:33:35+05:30

App       Version  Status   Scale  Charm            Channel  Rev  Exposed  Message
rabbitmq           waiting    0/3  rabbitmq-server  stable   123  no       waiting for machine

Unit        Workload  Agent       Machine  Public address  Ports  Message
rabbitmq/0  waiting   allocating  0                               waiting for machine
rabbitmq/1  waiting   allocating  1                               waiting for machine
rabbitmq/2  waiting   allocating  2                               waiting for machine

Machine  State    Address  Inst id        Series  AZ  Message
0        pending           juju-ce573e-0  xenial      Running
1        pending           juju-ce573e-1  xenial      Running
2        pending           juju-ce573e-2  xenial      Running

Debug log

$ juju debug-log --replay
controller-0: 20:09:51 INFO juju.worker.apicaller [b3ce14] "machine-0" successfully connected to "localhost:17070"
controller-0: 20:09:51 INFO juju.worker.logforwarder config change - log forwarding not enabled
controller-0: 20:09:51 INFO juju.worker.logger logger worker started
controller-0: 20:09:51 INFO juju.worker.machineundertaker setting up machine undertaker
controller-0: 20:09:51 INFO juju.worker.pruner.action status history config: max age: 336h0m0s, max collection size 5120M for messaging (b3ce1455-e650-461d-8a69-6496b6ce573e)
controller-0: 20:09:51 INFO juju.worker.pruner.statushistory status history config: max age: 336h0m0s, max collection size 5120M for messaging (b3ce1455-e650-461d-8a69-6496b6ce573e)
controller-0: 20:09:51 INFO juju.worker.provisioner entering provisioner task loop; using provisioner pool with 16 workers
controller-0: 20:09:51 INFO juju.worker.provisioner provisioning in zones: [pskp]
controller-0: 20:12:18 INFO juju.worker.provisioner provisioning in zones: [pskp]
controller-0: 20:12:18 INFO juju.worker.provisioner found machine pending provisioning id:0, details:0
controller-0: 20:12:23 INFO juju.worker.provisioner provisioning in zones: [pskp]
controller-0: 20:12:23 INFO juju.worker.provisioner found machine pending provisioning id:1, details:1
controller-0: 20:12:23 INFO juju.worker.provisioner trying machine 0 StartInstance in availability zone pskp
controller-0: 20:12:23 INFO juju.worker.provisioner provisioning in zones: [pskp]
controller-0: 20:12:23 INFO juju.worker.provisioner found machine pending provisioning id:2, details:2
controller-0: 20:12:23 INFO juju.worker.provisioner trying machine 1 StartInstance in availability zone pskp
controller-0: 20:12:23 INFO juju.worker.provisioner trying machine 2 StartInstance in availability zone pskp
controller-0: 20:14:42 INFO juju.worker.provisioner started machine 2 as instance juju-ce573e-2 with hardware "arch=amd64 cores=0 mem=0M", network config [], volumes [], volume attachments map[], subnets to zones [], lxd profiles []
controller-0: 20:14:42 INFO juju.worker.provisioner started machine 0 as instance juju-ce573e-0 with hardware "arch=amd64 cores=0 mem=0M", network config [], volumes [], volume attachments map[], subnets to zones [], lxd profiles []
controller-0: 20:14:42 INFO juju.worker.provisioner started machine 1 as instance juju-ce573e-1 with hardware "arch=amd64 cores=0 mem=0M", network config [], volumes [], volume attachments map[], subnets to zones [], lxd profiles []
controller-0: 20:14:46 INFO juju.worker.instancemutater.environ no changes necessary to machine-1 lxd profiles ([default juju-messaging])
controller-0: 20:14:46 INFO juju.worker.instancemutater.environ no changes necessary to machine-0 lxd profiles ([default juju-messaging])
controller-0: 20:14:46 INFO juju.worker.instancemutater.environ no changes necessary to machine-2 lxd profiles ([default juju-messaging])
controller-0: 20:15:05 INFO juju.worker.instancepoller machine "1" (instance ID "juju-ce573e-1") instance status changed from {"running" "Container started"} to {"running" "Running"}
controller-0: 20:15:05 INFO juju.worker.instancepoller machine "2" (instance ID "juju-ce573e-2") instance status changed from {"running" "Container started"} to {"running" "Running"}
controller-0: 20:15:05 INFO juju.worker.instancepoller machine "0" (instance ID "juju-ce573e-0") instance status changed from {"running" "Container started"} to {"running" "Running"}

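Several questions

  1. Is the deployment stuck?
  2. How do I check the controller logs?
  3. How do I check the Juju agent logs?
  4. The 3 RabbitMQ replica containers have no IP address. Is that expected?

Edit

Another observation: inside the containers created for rabbitmq, DNS resolution fails, while the controller container works fine.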
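The DNS check can be repeated from the host with something like the sketch below. It assumes lxc exec works against the containers and that getent is available inside them. check_container_dns is a hypothetical helper name, and archive.ubuntu.com is only an example hostname.

```shell
#!/bin/sh
# Try to resolve a hostname from inside a container.
# getent exits non-zero when the lookup fails, and that status is returned.
# The default hostname is an arbitrary example, not something from the logs.
check_container_dns() {
    container="$1"
    host="${2:-archive.ubuntu.com}"
    lxc exec "$container" -- getent hosts "$host"
}

# Usage (not executed here): compare a rabbitmq container with the controller.
#   for c in juju-ce573e-0 juju-dac435-0; do
#       check_container_dns "$c" || echo "$c: DNS lookup failed"
#   done
```

Since the rabbitmq containers also show no IPv4 address in lxc list, a failed lookup there may simply reflect the missing network address rather than a separate DNS problem.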
