Why does upstart fail to start my juju service with "Unknown job: juju-ubuntu-0"?

I ran into a problem bringing up juju on LXC containers with the usual juju deploy commands:

juju bootstrap
juju deploy ubuntu

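juju status keeps showing the ubuntu unit as pending. The LXC container starts, and I can even ssh into it once I look up its IP address. However, the juju service never starts inside it. After some debugging, I found that the upstart job fails to start.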

What is going on?

Output from the upstart service:

root@dpb-local-ubuntu-0:/home/ubuntu# ll /etc/init/juju*
-rw-r--r-- 1 root root 495 Feb  1 18:41 /etc/init/juju-ubuntu-0.conf
root@dpb-local-ubuntu-0:/home/ubuntu# service juju-ubuntu-0 start
start: Unknown job: juju-ubuntu-0
root@dpb-local-ubuntu-0:/home/ubuntu#

Starting the agent by hand works (it even updates juju status correctly):

root@dpb-local-ubuntu-0:/home/ubuntu# export JUJU_ZOOKEEPER="10.0.3.1:39983"
root@dpb-local-ubuntu-0:/home/ubuntu# export JUJU_UNIT_NAME="ubuntu/0"
root@dpb-local-ubuntu-0:/home/ubuntu#  export JUJU_MACHINE_ID="None"
root@dpb-local-ubuntu-0:/home/ubuntu# export JUJU_HOME="/var/lib/juju"
root@dpb-local-ubuntu-0:/home/ubuntu# export JUJU_ENV_UUID="None"
root@dpb-local-ubuntu-0:/home/ubuntu# /usr/bin/python -m juju.agents.unit --nodaemon --logfile /var/log/juju/unit-ubuntu-0.log --session-file /var/run/juju/unit-ubuntu-0-agent.zksession
^C

Here is the upstart config file:

description "Juju unit agent for ubuntu/0"
author "Juju Team <[email protected]>"

start on runlevel [2345]
stop on runlevel [!2345]
respawn

env JUJU_ENV_UUID="None"
env JUJU_HOME="/var/lib/juju"
env JUJU_MACHINE_ID="None"
env JUJU_UNIT_NAME="ubuntu/0"
env JUJU_ZOOKEEPER="10.0.3.1:39983"

exec /usr/bin/python -m juju.agents.unit --nodaemon --logfile /var/log/juju/unit-ubuntu-0.log --session-file /var/run/juju/unit-ubuntu-0-agent.zksession >> /var/log/juju/unit-ubuntu-0-output.log 2>&1

The following is from /var/log/cloud-init-output.log:

Setting up juju (0.6.0.1+bzr608-0juju2~precise1) ...
Processing triggers for libc-bin ...
ldconfig deferred processing now taking place
Processing triggers for python-support ...
start: Unknown job: juju-ubuntu-0
failed: /var/lib/cloud/instance/scripts/runcmd [1]
2013-02-01 18:41:37,603 - cc_scripts_user.py[WARNING]: failed to run-parts in /var/lib/cloud/instance/scripts
2013-02-01 18:41:37,604 - __init__.py[WARNING]: Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/cloudinit/CloudConfig/__init__.py", line 117, in run_cc_modules
    cc.handle(name, run_args, freq=freq)
  File "/usr/lib/python2.7/dist-packages/cloudinit/CloudConfig/__init__.py", line 78, in handle
    [name, self.cfg, self.cloud, cloudinit.log, args])
  File "/usr/lib/python2.7/dist-packages/cloudinit/__init__.py", line 326, in sem_and_run
    func(*args)
  File "/usr/lib/python2.7/dist-packages/cloudinit/CloudConfig/cc_scripts_user.py", line 31, in handle
    util.runparts(runparts_path)
  File "/usr/lib/python2.7/dist-packages/cloudinit/util.py", line 227, in runparts
    raise RuntimeError('runparts: %i failures' % failed)
RuntimeError: runparts: 1 failures

2013-02-01 18:41:37,605 - __init__.py[ERROR]: config handling of scripts-user, None, [] failed

Full console log for reference: http://pastebin.ubuntu.com/1598256/

Requested output:

root@dpb-local-ubuntu-0:/var/log# date
Mon Feb  4 15:41:38 UTC 2013
root@dpb-local-ubuntu-0:/var/log# sudo start juju-ubuntu-0
start: Unknown job: juju-ubuntu-0
root@dpb-local-ubuntu-0:/var/log# ll /var/log/init
ls: cannot access /var/log/init: No such file or directory
root@dpb-local-ubuntu-0:/var/log# dmesg |grep init
[    0.000000] initial memory mapped: [mem 0x00000000-0x1fffffff]
[    0.000000] init_memory_mapping: [mem 0x00000000-0xbf7fffff]
[    0.000000] init_memory_mapping: [mem 0x100000000-0x33f7fffff]
[    0.000000] Memory: 12261276k/13623296k available (6718k kernel code, 1060604k absent, 301416k reserved, 6452k data, 932k init)
[    0.000124] Security Framework initialized
[    0.000167] AppArmor: AppArmor initialized
[    0.263902] devtmpfs: initialized
[    0.265782] Trying to unpack rootfs image as initramfs...
[    0.322530] SCSI subsystem initialized
[    0.335211] pnp: PnP ACPI init
[    0.599361] audit: initializing netlink socket (disabled)
[    0.599408] type=2000 audit(1359415390.492:1): initialized
[    0.616085] fuse init (API version 7.19)
[    0.723931] Freeing initrd memory: 21968k freed
[    0.816927] device-mapper: ioctl: 4.22.0-ioctl (2011-10-19) initialised: [email protected]
[    0.817336] ashmem: initialized
[    4.468141] device-mapper: dm-raid45: initialized v0.2594b
[    4.472486] async_tx: api initialized (async)
[   26.988510] init: eucalyptus-network (lo) main process (1170) killed by TERM signal
[   27.437140] init: failsafe main process (1166) killed by TERM signal
[   27.668253] Bluetooth: HCI device and connection manager initialized
[   27.668254] Bluetooth: HCI socket layer initialized
[   27.668254] Bluetooth: L2CAP socket layer initialized
[   27.668264] Bluetooth: SCO socket layer initialized
[   28.105061] Bluetooth: RFCOMM TTY layer initialized
[   28.105065] Bluetooth: RFCOMM socket layer initialized
[   56.880220] init: udev-fallback-graphics main process (2295) terminated with status 1
[   56.953337] init: gdm main process (2333) killed by TERM signal
[  128.387028] init: plymouth-stop pre-start process (4158) terminated with status 1

Further debugging:

root@dpb-local-ubuntu-2:~# ps -ef |grep juju
root     11444 11381  0 16:42 pts/0    00:00:00 grep --color=auto juju
root@dpb-local-ubuntu-2:~# initctl start juju-ubuntu-2
initctl: Unknown job: juju-ubuntu-2
root@dpb-local-ubuntu-2:~# initctl reload-configuration
root@dpb-local-ubuntu-2:~# initctl start juju-ubuntu-2
juju-ubuntu-2 start/running, process 11448
root@dpb-local-ubuntu-2:~#
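
The manual sequence above can be wrapped in a small helper. This is only a sketch, assuming an upstart-based system with initctl on the PATH; the function name ensure_job_started is hypothetical. It asks upstart whether it knows the job and, only if it does not, forces a re-read of the job definitions before starting it.

```shell
#!/bin/sh
# Hypothetical helper: start an upstart job, reloading upstart's job
# definitions first if init does not know the job yet.
ensure_job_started() {
    job="$1"
    # `initctl status` exits non-zero for an unknown job.
    if ! initctl status "$job" >/dev/null 2>&1; then
        # Force upstart to re-read its job files instead of relying on
        # inotify to notice the new .conf file.
        initctl reload-configuration || return 1
    fi
    initctl start "$job"
}

# Example (run as root inside the container):
#   ensure_job_started juju-ubuntu-2
```

Note that initctl start reports an error if the job is already running, so the helper is meant for the "job file exists but init does not know it" case shown here. It works around the symptom and does not address why upstart missed the new file.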

Answer 1

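The problem turned out to be that the host had run out of inotify resources, so upstart could not watch the local filesystem for new init job files. That would also explain why a manual initctl reload-configuration made the job known. On my system, I believe CrashPlan (backup software) was the culprit.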

Raising the amount of resources dedicated to inotify fixed the problem, like so:

echo 1048576 > /proc/sys/fs/inotify/max_user_watches

# or in /etc/sysctl.conf to make it survive reboots:
fs.inotify.max_user_watches=1048576
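
Before raising the limit, it can help to see the current values and which processes hold inotify instances. The following is a minimal sketch, assuming a Linux host with procfs and GNU find; the function names show_inotify_limits and count_inotify_instances are hypothetical, and the optional directory arguments exist only so the helpers can be tried against a scratch directory.

```shell
#!/bin/sh
# Print the current inotify limits (hypothetical helper).
# $1: directory holding the limit files; defaults to the real procfs path.
show_inotify_limits() {
    dir="${1:-/proc/sys/fs/inotify}"
    for name in max_user_watches max_user_instances max_queued_events; do
        if [ -r "$dir/$name" ]; then
            printf '%s=%s\n' "$name" "$(cat "$dir/$name")"
        fi
    done
}

# Count open inotify instances per PID, busiest first (hypothetical helper).
# Each inotify instance shows up as an fd symlink to "anon_inode:inotify".
# Run as root to see processes owned by other users.
count_inotify_instances() {
    proc="${1:-/proc}"
    find "$proc"/[0-9]*/fd -lname 'anon_inode:inotify' 2>/dev/null |
        awk -F/ '{ n[$(NF-2)]++ } END { for (p in n) print n[p], p }' |
        sort -rn
}

# Example (on the host, as root):
#   show_inotify_limits
#   count_inotify_instances | head
```

The second helper counts inotify instances, not individual watches, so it only gives a rough idea of which processes (for example a backup daemon) are heavy inotify users. After editing /etc/sysctl.conf, running sysctl -p applies the setting without a reboot.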

Thanks to everyone who helped answer this question!
