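I'm experimenting with Docker and would like to launch an MPI application inside a container.
I use ubuntu:latest as the base image and have installed the tools needed to compile my program and link it against MPI.
When I launch the program with mpirun, I get the following warning: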
[c1dab84c3fac:10417] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file ess_hnp_module.c at line 170
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
orte_plm_base_select failed
--> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[c1dab84c3fac:10417] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file runtime/orte_init.c at line 128
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
orte_ess_set_name failed
--> Returned value Not found (-13) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
[c1dab84c3fac:10417] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file orterun.c at line 694
If I run the same program on a regular Ubuntu installation (same version), it runs fine.
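Answer 1
A Docker container is not the same as a regular Ubuntu installation. The default Ubuntu container is missing a lot of basic pieces (for example: init, an ssh daemon, cron).
I usually use the Phusion baseimage-docker image instead. That project describes some of the basic problems with the default Ubuntu Docker image and how to work around them.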
Answer 2
I ran into this problem recently. Installing the ssh package fixes it.
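
A likely explanation for why installing ssh helps (this is an assumption, not something confirmed in this thread): mpirun's default process launcher looks for an ssh or rsh client on PATH, and orte_plm_base_select can fail with "Not found" when neither is present. Below is a minimal Python sketch for checking that inside the container. The function name find_remote_launcher is made up for illustration and is not part of Open MPI.

```python
import shutil


def find_remote_launcher(candidates=("ssh", "rsh"), path=None):
    """Return the full path of the first candidate found on PATH, else None.

    Assumption: mpirun needs one of these clients to be installed, even
    when all ranks run on the local machine.
    """
    for name in candidates:
        found = shutil.which(name, path=path)
        if found:
            return found
    return None


if __name__ == "__main__":
    launcher = find_remote_launcher()
    if launcher is None:
        print("No ssh or rsh client on PATH; mpirun may fail in orte_init.")
    else:
        print(f"Found remote launcher: {launcher}")
```

If this reports that no client was found, installing the ssh package in the image (as suggested above) and rebuilding should make one available.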