I am trying to submit a job to my school's server (HPC) with the following script:
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -o ./out_$JOB_ID.txt
#$ -e ./err_$JOB_ID.txt
#$ -notify
#$ -pe orte 1
date
pwd
##################################
RESULT_DIR=~/Results
SCRIPT_FILE=sample_job
##################################
. /etc/profile
. /etc/bashrc
module load packages/comsol/4.4
module load packages/matlab/r2012b
comsol server matlab "sample_job, exit" -nodesktop -mlnosplash
/bin/uname -a
mkdir $RESULT_DIR/$name
cp *.csv $RESULT_DIR/$name
The job aborts with the following output:
Sun Jun 8 14:20:21 EDT 2014
COMSOL 4.4 (Build: 150) started listening on port 2036
Use the console command 'close' to exit the program
/usr/bin/xterm Xt error: Can't open display:
/usr/bin/xterm: DISPLAY is not set
Program_did_not_exit_normally
Exception:
com.comsol.util.exceptions.FlException: Program did not exit normally
Messages:
Program did not exit normally
Stack trace:
at com.comsol.mli.application.a.a(Unknown Source)
at com.comsol.mli.application.MatlabApplication.doStart(Unknown Source)
at com.comsol.util.application.ComsolApplication.doStart(Unknown Source)
at com.comsol.util.application.ComsolApplication.doRun(Unknown Source)
at com.comsol.bridge.Bridge$2.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
ERROR: Could not start COMSOL Application. See log file: /home/.comsol/v44/logs/server2.log
java.lang.IllegalStateException: Shutdown in progress
at java.lang.ApplicationShutdownHooks.add(Unknown Source)
at java.lang.Runtime.addShutdownHook(Unknown Source)
at org.apache.catalina.startup.Catalina.start(Catalina.java:699)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:322)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:451)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at com.comsol.util.application.ServerApplication.a(Unknown Source)
at com.comsol.util.application.ServerApplication.a(Unknown Source)
at com.comsol.util.application.ServerApplication.a(Unknown Source)
at com.comsol.util.application.ServerApplication.main(Unknown Source)
What could be causing this, and how should I fix it?
Answer 1
I'm assuming you're using GridEngine as the cluster software and submitting this script with something like:
$ qsub myscript.sh
You can have qsub pass environment variables into the shell that it spawns on the HPC cluster node, like so:
$ qsub -v DISPLAY=$(hostname):0.0 myscript.sh
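This should "inject" the hostname of the system you're submitting from as the system on which any remote GUI should be displayed.
You may also need to allow your local system to "receive" windows from this remote display. The simplest, and least secure, way to do that is: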
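The log in the question shows xterm failing with "DISPLAY is not set", so it may help to check the variable inside the job script before COMSOL starts. The following is a minimal sketch, not part of the original script; the function name check_display is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: report whether DISPLAY reached the job's shell.
# Prints the value and returns 0 if it is set; otherwise prints a hint
# to stderr and returns 1.
check_display() {
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set; GUI programs such as xterm will fail" >&2
        return 1
    fi
    echo "DISPLAY=${DISPLAY}"
}
```

Calling check_display near the top of the job script (for example right after the date and pwd lines) would record in the job's output file whether the qsub -v setting actually arrived on the compute node.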
$ xhost +
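If this works and you're concerned about making it "more secure", you can make the xhost + call more explicit, for example by naming only the hosts that need access, but this probably isn't necessary. Let us know how it goes, and we can adjust further if needed.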
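As a sketch of the "more explicit" variant: xhost accepts +name and -name to add or remove a single host from the access list. The wrapper names below and the node name used in the usage note are hypothetical, and this assumes you know which compute node the job will run on.

```shell
#!/bin/bash
# Hypothetical wrappers around xhost, to be run on the submit host.

# Allow X connections from one named node instead of from every host.
allow_node() {
    local node="$1"
    xhost "+${node}"
}

# Remove that node from the access list again once the job has finished.
revoke_node() {
    local node="$1"
    xhost "-${node}"
}
```

For example, allow_node node01.example.edu before submitting and revoke_node node01.example.edu afterwards. If you previously ran a bare xhost +, running xhost - turns access control back on.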
What if the above doesn't work?
Newer versions of qsub now include a switch, -X, which supposedly passes the $DISPLAY environment variable through correctly, like so:
$ qsub -X myscript.sh
You can also try using the submit host's IP address instead of its hostname. It may be that the HPC nodes don't have DNS set up correctly.
$ qsub -v DISPLAY="$(hostname -i):0.0" myscript.sh
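One caveat with the command above: on some systems hostname -i prints several addresses, or a loopback address such as 127.0.1.1, neither of which would be usable as a display host. The following is a minimal sketch of a filter, assuming an IPv4 address is wanted; the function name first_non_loopback is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the first argument that is neither an IPv4
# loopback address (127.x.x.x) nor an IPv6 address (anything containing ':').
# Returns 1 if no suitable address is found.
first_non_loopback() {
    local addr
    for addr in "$@"; do
        case "$addr" in
            127.*|*:*) continue ;;
            *) printf '%s\n' "$addr"; return 0 ;;
        esac
    done
    return 1
}
```

It could be used as ip="$(first_non_loopback $(hostname -i))" followed by qsub -v DISPLAY="${ip}:0.0" myscript.sh, leaving the $(hostname -i) substitution unquoted on purpose so that each address becomes a separate argument.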