InfiniBand fabric with 3 nodes - newbie

I am trying to connect 3 HP Z840 workstations using the following hardware:

Mellanox ConnectX-3 VPI 40/56GbE dual-port QSFP adapter, MCX354A-FCBT
Mellanox SX6005 12-port non-blocking unmanaged 56Gb/s switch

Machines to be connected: oak-rd0-linux (the main node, from which I will run programs and where opensm is running), oak-rd1-linux, oak-rd2-linux

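I have installed the latest firmware on the cards, and the latest MLNX_OFED driver that supports my cards (MLNX_OFED_LINUX-4.9-4.1.7.0-ubuntu20.04-x86_64). The machines run Ubuntu 20.04 (Linux 5.4.0-26-generic kernel, which the mlnx_ofed driver requires).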

How I installed MLNX OFED:

sudo touch /etc/apt/sources.list.d/mlnx_ofed.list
sudo nano /etc/apt/sources.list.d/mlnx_ofed.list
deb file:/home/user/infiniband/MLNX_OFED_LINUX-4.9-4.1.7.0-ubuntu20.04-x86_64/DEBS/UPSTREAM_LIBS ./
wget -qO - http://www.mellanox.com/downloads/ofed/RPM-GPG-KEY-Mellanox | sudo apt-key add -
apt-key list
sudo apt-get update
sudo apt-get install mlnx-ofed-all
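
As a quick sanity check after an install like the one above, something along these lines could confirm that the usual tools ended up on PATH. This is only a sketch: check_tools is a hypothetical helper name, and the tool names (ofed_info, ibstat, ibv_devinfo) are assumed to be what this package provides.

```shell
#!/bin/sh
# Hypothetical helper: report whether each named tool is on PATH.
check_tools() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
    fi
  done
}

# Assumed tool names; adjust to whatever the install actually provides.
check_tools ofed_info ibstat ibv_devinfo

# If ofed_info is present, print the short version string of the installed stack.
if command -v ofed_info >/dev/null 2>&1; then
  ofed_info -s
fi
```

Running the same check on all three nodes would show whether any of them is missing part of the stack.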

I also have hpcx-v2.6.0-gcc-MLNX_OFED_LINUX-4.7-1.0.0.1

I started opensm in daemon mode with:

/etc/init.d/opensmd start

I ran sudo ibdiagnet and got a clean summary (note: ibdiagnet will not run without sudo)

Running: ibdiagnet -r
----------
Load Plugins from:
/usr/share/ibdiagnet2.1.1/plugins/
(You can specify more paths to be looked in with "IBDIAGNET_PLUGINS_PATH" env variable)

Plugin Name Result Comment
libibdiagnet_cable_diag_plugin-2.1.1 Succeeded Plugin loaded
libibdiagnet_phy_diag_plugin-2.1.1 Succeeded Plugin loaded

---------------------------------------------
Discovery
-I- Discovering ... 4 nodes (1 Switches & 3 CA-s) discovered.
-I- Fabric Discover finished successfully

-I- Discovered 4 nodes (1 Switches & 3 CA-s).

-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- VS Capability GMP finished successfully

-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- VS Capability SMP finished successfully

-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- VS ExtendedPortInfo finished successfully

-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Port Info Extended finished successfully

-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Switch Info retrieving finished successfully

-I- Duplicated GUIDs detection finished successfully

-I- Duplicated Node Description detection finished successfully

---------------------------------------------
Lids Check
-I- Lids Check finished successfully

---------------------------------------------
Links Check
-I- Links Check finished successfully

---------------------------------------------
Subnet Manager
-I- SM Info retrieving finished successfully

-I- Subnet Manager Check finished successfully

---------------------------------------------
Port Counters
-I- Retrieving PMClassPortInfo ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Retrieving PMPortSampleControl ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Ports counters retrieving finished successfully

-I- Going to sleep for 1 seconds until next counters sample
-I- Time left to sleep ... 1 seconds.

-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Ports counters retrieving (second time) finished successfully

-I- Ports counters value Check finished successfully

-I- Ports counters Difference Check (during run) finished successfully

---------------------------------------------
Nodes Information
-I- Devid: 4099(0x1003), PSID: MT_1090120019, Latest FW Version:2.42.5000
-I- Devid: 51000(0xc738), PSID: EMC1260110021, Latest FW Version:9.3.8000
-I- FW Check finished successfully

---------------------------------------------
Speed / Width checks
-I- Link Speed Check (Compare to supported link speed)
-I- Links Speed Check finished successfully

-I- Link Width Check (Compare to supported link width)
-I- Links Width Check finished successfully

---------------------------------------------
Alias GUIDs
-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Alias GUIDs retrieving finished successfully

-I- Alias GUIDs finished successfully

---------------------------------------------
Virtualization
-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Virtualization finished successfully

-I- Virtual ports retrieving finished successfully

-I- Virtual ports retrieving finished successfully

---------------------------------------------
Partition Keys
-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Partition Keys retrieving finished successfully

-I- Partition Keys finished successfully

---------------------------------------------
Temperature Sensing
-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Temperature Sensing finished successfully

---------------------------------------------
Routing

-I- EXT switch info retrieving finished successfully

-I- PLFT is enabled on 0 switches.
-I- PLFT data retrieving finished successfully

-I- Adaptive Routing is enabled on 0 switches.
-I- AR data retrieving finished successfully

-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Unicast FDBS Info retrieving finished successfully

-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Multicast FDBS Info retrieving finished successfully

-I- Retrieving ... 4/4 nodes (1/1 Switches & 3/3 CA-s) retrieved.
-I- Dump SLVL Table finished successfully

-I- Load SLVL file.
---------------------------------------------
Summary
-I- Stage Warnings Errors Comment
-I- Discovery 0 0
-I- Lids Check 0 0
-I- Links Check 0 0
-I- Subnet Manager 0 0
-I- Port Counters 0 0
-I- Nodes Information 0 0
-I- Speed / Width checks 0 0
-I- Alias GUIDs 0 0
-I- Virtualization 0 0
-I- Partition Keys 0 0
-I- Temperature Sensing 0 0
-I- Routing 0 0

-I- You can find detailed errors/warnings in: /var/tmp/ibdiagnet2/ibdiagnet2.log


-I- ibdiagnet database file : /var/tmp/ibdiagnet2/ibdiagnet2.db_csv
-I- LST file : /var/tmp/ibdiagnet2/ibdiagnet2.lst
-I- Network dump file : /var/tmp/ibdiagnet2/ibdiagnet2.net_dump
-I- Subnet Manager file : /var/tmp/ibdiagnet2/ibdiagnet2.sm
-I- Ports Counters file : /var/tmp/ibdiagnet2/ibdiagnet2.pm
-I- Nodes Information file : /var/tmp/ibdiagnet2/ibdiagnet2.nodes_info
-I- Alias guids file : /var/tmp/ibdiagnet2/ibdiagnet2.aguid
-I- VPorts file : /var/tmp/ibdiagnet2/ibdiagnet2.vports
-I- VPorts Pkey file : /var/tmp/ibdiagnet2/ibdiagnet2.vports_pkey
-I- Partition keys file : /var/tmp/ibdiagnet2/ibdiagnet2.pkey
-I- VL2VL file : /var/tmp/ibdiagnet2/ibdiagnet2.vl2vl
-I- PLFT file : /var/tmp/ibdiagnet2/ibdiagnet2.plft
-I- AR file : /var/tmp/ibdiagnet2/ibdiagnet2.ar
-I- Full AR file : /var/tmp/ibdiagnet2/ibdiagnet2.far
-I- Unicast FDBS file : /var/tmp/ibdiagnet2/ibdiagnet2.fdbs
-I- Multicast FDBS file : /var/tmp/ibdiagnet2/ibdiagnet2.mcfdbs
-I- SLVL Table file : /var/tmp/ibdiagnet2/ibdiagnet2.slvl

ibping seems to work fine, although I'm not sure whether these are good performance values

ibstat | egrep "Port|Base"

(base) baird@oak-rd0-linux:~$ ibstat | egrep "Port|Base"
Port 1:
Base lid: 0
Port GUID: 0x0010e00001885689
Port 2:
Base lid: 1
Port GUID: 0x0010e0000188568a

server ( oak-rd0-linux )
ibping -S -P 2 -d (I know that Port2 is the active one)

I can then ibping from host1 and host2 with:
ibping -P 1 1
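
Rather than knowing by hand which port is the active one, the port could be picked out of the ibstat output. The sketch below assumes the usual ibstat layout, where each "Port N:" header is followed by a "State:" line; active_ports is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read ibstat-style text on stdin and print the
# number of every port whose logical state is Active.
active_ports() {
  awk '
    /Port [0-9]+:/ { gsub(":", "", $2); port = $2 }   # remember current port
    /^[[:space:]]*State: Active/ { print port }        # skips "Physical state:"
  '
}

# Only call ibstat when it is actually installed.
if command -v ibstat >/dev/null 2>&1; then
  ibstat | active_ports
fi
```

The printed number is what would go after -P on the ibping server side.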

Host1 ( oak-rd1-linux )
baird@oak-rd1-linux:~$ sudo ibping -P 1 1
Pong from oak-rd0-linux.(none) (Lid 1): time 0.027 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.037 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.042 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.044 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.038 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.029 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.042 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.029 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.038 ms
^C
--- oak-rd0-linux.(none) (Lid 1) ibping statistics ---
9 packets transmitted, 9 received, 0% packet loss, time 8028 ms
rtt min/avg/max = 0.027/0.036/0.044 ms

Host2 ( oak-rd2-linux )
(base) baird@oak-rd2-linux:~$ sudo ibping -P 1 1
Pong from oak-rd0-linux.(none) (Lid 1): time 0.029 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.015 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.041 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.043 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.044 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.037 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.042 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.039 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.038 ms
Pong from oak-rd0-linux.(none) (Lid 1): time 0.040 ms
^C
--- oak-rd0-linux.(none) (Lid 1) ibping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9055 ms
rtt min/avg/max = 0.015/0.036/0.044 ms

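This seems to work. Assuming everything is fine on the InfiniBand side, here is my problem:

I can run the tests in the ompi that ships with hpcx on a single machine: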

mpicc $HPCX_MPI_TESTS_DIR/examples/hello_c.c -o $HPCX_MPI_TESTS_DIR/examples/hello_c
mpirun -np 2 $HPCX_MPI_TESTS_DIR/examples/hello_c

Hello, world, I am 1 of 2, (Open MPI v4.0.3rc4, package: Open MPI root@0e5a40994726 Distribution, ident: 4.0.3rc4, repo rev: v4.0.3rc4-6-g8b4a8cd34c, Unreleased developer copy, 148)
Hello, world, I am 0 of 2, (Open MPI v4.0.3rc4, package: Open MPI root@0e5a40994726 Distribution, ident: 4.0.3rc4, repo rev: v4.0.3rc4-6-g8b4a8cd34c, Unreleased developer copy, 148)

However, when I try to run it across two hosts:

mpirun -x LD_LIBRARY_PATH -np 2 -H oak-rd0-linux,oak-rd1-linux $HPCX_MPI_TESTS_DIR/examples/hello_c

I get no feedback at all: no errors, no output. It just seems to hang.
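
One thing that may be worth ruling out with a silent multi-host hang is the launch step itself. Without a resource manager, Open MPI normally starts remote processes over ssh, so a password prompt that never reaches the terminal can look like a hang. The sketch below assumes ssh is the launcher in use here; check_ssh and SSH_CMD are hypothetical names introduced only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: test non-interactive login to each host.
# SSH_CMD is an illustrative override so the function can be dry-run.
check_ssh() {
  for host in "$@"; do
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" true >/dev/null 2>&1; then
      echo "ok: $host"
    else
      echo "FAILED: $host"
    fi
  done
}

# Example (hostnames taken from the post):
#   check_ssh oak-rd0-linux oak-rd1-linux
# If every host reports ok, a next step could be launching a trivial command:
#   mpirun -np 2 -H oak-rd0-linux,oak-rd1-linux hostname
```

If a host reports FAILED, key-based login from the main node to that host is probably the first thing to set up before looking at the InfiniBand side.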

- Can someone guide me on how to connect to and use the CPUs of the other hosts?
- Which utilities do I need to debug this problem?

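I'm a complete newbie in this area and would greatly appreciate any help or suggestions. I'm happy to provide any further information or run any tests you suggest. Cheers!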
