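I installed slurm-llnl and slurm-llnl-slurmdbd on an Ubuntu 14.04 workstation.
I configured SLURM successfully, but I am having trouble with the slurmdbd service and the MySQL database.
After the system boots, service --status-all shows: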
[ + ] slurm-llnl
[ - ] slurm-llnl-slurmdbd
and /var/log/slurm-llnl/slurmdbd.log contains:
[2015-11-03T14:52:30.179] debug3: Trying to load plugin /usr/lib/slurm/auth_munge.so
[2015-11-03T14:52:30.222] auth plugin for Munge (http://code.google.com/p/munge/) loaded
[2015-11-03T14:52:30.223] debug3: Success.
[2015-11-03T14:52:30.223] debug3: Trying to load plugin /usr/lib/slurm/accounting_storage_mysql.so
[2015-11-03T14:52:30.581] debug2: mysql_connect() called for db slurm_acct_db
[2015-11-03T14:52:30.643] error: mysql_real_connect failed: 2002 Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)
[2015-11-03T14:52:30.643] fatal: The database must be up when starting the MYSQL plugin.
If I now start the service manually
$ sudo service slurm-llnl-slurmdbd start
everything seems to work fine:
[2015-11-03T14:54:08.324] debug3: Trying to load plugin /usr/lib/slurm/auth_munge.so
[2015-11-03T14:54:08.324] auth plugin for Munge (http://code.google.com/p/munge/) loaded
[2015-11-03T14:54:08.324] debug3: Success.
[2015-11-03T14:54:08.324] debug3: Trying to load plugin /usr/lib/slurm/accounting_storage_mysql.so
[2015-11-03T14:54:08.326] debug2: mysql_connect() called for db slurm_acct_db
[2015-11-03T14:54:08.367] debug4: (accounting_storage_mysql.c:1069) query
show tables like 'user_table';
[2015-11-03T14:54:08.367] debug4: (accounting_storage_mysql.c:1089) query
show tables like 'localhost_assoc_table';
[2015-11-03T14:54:08.367] debug4: (accounting_storage_mysql.c:1106) query
show columns from "localhost_assoc_table" where Field='is_def';
[2015-11-03T14:54:08.392] debug4: (accounting_storage_mysql.c:1069) query
show tables like 'user_table';
[2015-11-03T14:54:08.392] debug4: (accounting_storage_mysql.c:1089) query
show tables like 'qtech_assoc_table';
[2015-11-03T14:54:08.392] debug4: (accounting_storage_mysql.c:1106) query
show columns from "qtech_assoc_table" where Field='is_def';
[2015-11-03T14:54:08.412] debug4: (as_mysql_convert.c:788) query
show tables like 'assoc_table';
[2015-11-03T14:54:08.412] debug4: (as_mysql_convert.c:829) query
show tables like 'cluster_event_table';
[2015-11-03T14:54:08.412] debug4: (as_mysql_convert.c:852) query
show tables like 'job_table';
[2015-11-03T14:54:08.412] debug4: (as_mysql_convert.c:876) query
show tables like 'last_ran_table';
[2015-11-03T14:54:08.412] debug4: (as_mysql_convert.c:897) query
show tables like 'resv_table';
[2015-11-03T14:54:08.413] debug4: (as_mysql_convert.c:920) query
show tables like 'step_table';
[2015-11-03T14:54:08.413] debug4: (as_mysql_convert.c:942) query
show tables like 'suspend_table';
[2015-11-03T14:54:08.413] debug4: (as_mysql_convert.c:964) query
show tables like 'cluster_hour_usage_table';
[2015-11-03T14:54:08.413] debug4: (as_mysql_convert.c:1004) query
show tables like 'wckey_table';
[2015-11-03T14:54:08.449] Accounting storage MYSQL plugin loaded
[2015-11-03T14:54:08.449] debug3: Success.
...
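I am not an expert Ubuntu user, but it looks to me as though the MySQL service is not yet ready when slurm-llnl-slurmdbd starts. However, mysql is listed among the script's dependencies (under Should-Start). Here is the beginning of /etc/init.d/slurm-llnl-slurmdbd: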
#!/bin/sh
#
# chkconfig: 345 90 10
# description: SLURMDBD is a database server interface for \
# SLURM (Simple Linux Utility for Resource Management).
#
# processname: /usr/sbin/slurmdbd
# pidfile: /var/run/slurm-llnl/slurmdbd.pid
#
# config: /etc/sysconfig/slurm
#
### BEGIN INIT INFO
# Provides: slurm-llnl-slurmdbd
# Required-Start: $remote_fs $syslog $network munge
# Required-Stop: $remote_fs $syslog $network munge
# Should-Start: $named mysql
# Should-Stop: $named mysql
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: SLURM database daemon
# Description: Start slurm to provide database server for SLURM
### END INIT INFO
SBINDIR=/usr/sbin
LIBDIR=/usr/lib
CONFFILE="/etc/slurm-llnl/slurmdbd.conf"
DESCRIPTION="slurm-llnl database server interface"
NAME="slurmdbd"
Answer 1
I ran into a similar problem and found that mysql is controlled by initctl (Upstart), not by the old System V init.
Are you sure mysqld is running? Check with "initctl status mysql".
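To see at a glance which init system owns a given service, a small check on the job file locations may help. This is only a sketch: init_system_for is a hypothetical helper (not part of Upstart or sysvinit), and it assumes the Ubuntu 14.04 layout where Upstart jobs live in /etc/init/NAME.conf and SysV scripts in /etc/init.d/NAME.

```shell
#!/bin/sh
# Hypothetical helper: report which init system appears to manage a service.
# Usage: init_system_for SERVICE [ROOT]
# ROOT defaults to the real filesystem root; pass another directory for testing.
init_system_for() {
    service_name=$1
    root=${2:-}
    if [ -f "$root/etc/init/$service_name.conf" ]; then
        # An Upstart job definition exists, so Upstart starts this service.
        echo upstart
    elif [ -e "$root/etc/init.d/$service_name" ]; then
        # Only a SysV-style script exists.
        echo sysv
    else
        echo unknown
    fi
}

# Example calls on the affected machine:
#   init_system_for mysql
#   init_system_for slurm-llnl-slurmdbd
```

If the first call reports upstart and the second reports sysv, the two services are started by different mechanisms, which matches the situation described in this answer.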
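Answer 2
In Ubuntu 14.04 Trusty, MySQL is started by an Upstart job, whereas SLURM is still started by SysV-init scripts, which are all started by the "rc" Upstart job. By default there is no dependency between the two, so they can start in parallel.
After I changed the MySQL job to start before the rc job (rather than simply on the runlevel), it seems to work reliably in my setup. That is, I replaced the line "start on runlevel [2345]" in /etc/init/mysql.conf with "start on starting rc RUNLEVEL=[2345]".
If you want to automate this, use these two lines, e.g. in a provisioning script: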
sudo dpkg-divert --local --add /etc/init/mysql.conf
sudo sed -i "s/^\(start on\) runlevel \[2345\]/\1 starting rc RUNLEVEL=[2345]/" /etc/init/mysql.conf
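To preview what the sed expression does before touching the real file, a dry run on a scratch copy may help. The sketch below assumes GNU sed (for -i) and uses an illustrative three-line stanza, not the full packaged /etc/init/mysql.conf.

```shell
#!/bin/sh
# Dry run of the substitution on a scratch file; /etc/init/mysql.conf is not touched.
tmp_conf=$(mktemp)
cat > "$tmp_conf" <<'EOF'
description "MySQL Server"
start on runlevel [2345]
stop on starting rc RUNLEVEL=[016]
EOF

# Same expression as in the answer, run without sudo against the scratch file.
sed -i "s/^\(start on\) runlevel \[2345\]/\1 starting rc RUNLEVEL=[2345]/" "$tmp_conf"

# Show the rewritten start condition, then clean up.
new_start_line=$(grep '^start on' "$tmp_conf")
echo "$new_start_line"
rm -f "$tmp_conf"
```

Only the line beginning with "start on" should change; the "stop on" line is left alone because the pattern is anchored with ^.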