InfiniBand P_Keys and the Linux kernel


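I am trying to configure InfiniBand partitions between two Debian-based Linux hosts (running a 4.15 kernel) and a Mellanox SX6036 switch. On the switch I set up a "DMZ" partition with PKey 0x0001 and added the Port GUIDs of the active IB connections of both Linux hosts (which happen to be ib1 on both hosts).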


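As far as I understand from here, and here, I should now run echo PKEY_VALUE > /sys/class/net/ib1/create_child on both hosts, which should give me a new interface named ib1.PKEY_VALUE. I can then assign a private IP address to the new interface and communicate between the hosts that are members of the MY_PKEY partition. Is this how it is supposed to work?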

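The example in the kernel.org link uses 0x8001, which works fine on the Linux side and creates an interface named ib1.8001. However, the Mellanox switch does not let me set the PKey to that value. I get an error saying that Pkey 0x8001 is invalid and that the value must be between 0x1 and 0x7fff. I have tried different PKey values on the switch (for example 0x0001), but Linux always creates the interface with a 0x8... prefix, which I cannot use as the PKey on the switch. Am I misunderstanding something?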


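Update: Hopefully some additional information helps. Below are the link information for host1 and host2 and the output of ibnodes (identical on both hosts), ibstat (identical except for the GUIDs), and ibdiagnet. When I assign an IP address to the interface ib1.8001, dmesg on host1 shows the following: ib1.8001: P_Key 0x8001 is not found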

I am adding a new screenshot of the current partition, because I have changed it so that all Port GUIDs have full membership. [screenshot]

host1# ip link sho
24: ib1.8001@ib1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 4092 qdisc pfifo_fast state LOWERLAYERDOWN mode DEFAULT group default qlen 256
    link/infiniband 80:00:02:1f:fe:80:00:00:00:00:00:00:00:02:c9:03:00:10:df:5a brd 00:ff:ff:ff:ff:12:40:1b:80:01:00:00:00:00:00:00:ff:ff:ff:ff

host2# ip link sho
16: ib1.8001@ib1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 4092 qdisc pfifo_fast state LOWERLAYERDOWN mode DEFAULT group default qlen 256
    link/infiniband 80:00:02:1e:fe:80:00:00:00:00:00:00:e4:1d:2d:03:00:e0:88:02 brd 00:ff:ff:ff:ff:12:40:1b:80:01:00:00:00:00:00:00:ff:ff:ff:ff

host2# ibnodes
Ca      : 0xe41d2d0300e08800 ports 2 "MT25408 ConnectX Mellanox Technologies"
Ca      : 0x0002c9030010df58 ports 2 "MT25408 ConnectX Mellanox Technologies"
Switch  : 0xf452140300823b60 ports 36 "MF0;msx6036:SX6036/U1" enhanced port 0 lid 1 lmc 0

host2# ibstat
CA 'mlx4_0'
    CA type: MT4099
    Number of ports: 2
    Firmware version: 2.34.5000
    Hardware version: 0
    Node GUID: 0xe41d2d0300e08800
    System image GUID: 0xe41d2d0300e08803
    Port 1:
            State: Down
            Physical state: Polling
            Rate: 10
            Base lid: 6
            LMC: 0
            SM lid: 1
            Capability mask: 0x0251486a
            Port GUID: 0xe41d2d0300e08801
            Link layer: InfiniBand
    Port 2:
            State: Active
            Physical state: LinkUp
            Rate: 40 (FDR10)
            Base lid: 2
            LMC: 0
            SM lid: 5
            Capability mask: 0x0251486a
            Port GUID: 0xe41d2d0300e08802
            Link layer: InfiniBand



host2# ibdiagnet
Loading IBDIAGNET from: /usr/lib/x86_64-linux-gnu/ibdiagnet1.5.7
-W- Topology file is not specified.
    Reports regarding cluster links will use direct routes.
Loading IBDM from: /usr/lib/x86_64-linux-gnu/ibdm1.5.7
-I- Using port 2 as the local port.
-I- Discovering ... 3 nodes (1 Switches & 2 CA-s) discovered.


-I---------------------------------------------------
-I- Bad Guids/LIDs Info
-I---------------------------------------------------
-I- No bad Guids were found

-I---------------------------------------------------
-I- Links With Logical State = INIT
-I---------------------------------------------------
-I- No bad Links (with logical state = INIT) were found

-I---------------------------------------------------
-I- General Device Info
-I---------------------------------------------------

-I---------------------------------------------------
-I- PM Counters Info
-I---------------------------------------------------
-I- No illegal PM counters values were found

-I---------------------------------------------------
-I- Fabric Partitions Report (see ibdiagnet.pkey for a full hosts list)
-I---------------------------------------------------
-I-    PKey:0x7fff Hosts:2 full:2 limited:0

-I---------------------------------------------------
-I- IPoIB Subnets Check
-I---------------------------------------------------
-I- Subnet: IPv4 PKey:0x7fff QKey:0x00000b1b MTU:2048Byte rate:10Gbps SL:0x00
-W- Suboptimal rate for group. Lowest member rate:40Gbps > group-rate:10Gbps

-I---------------------------------------------------
-I- Bad Links Info
-I- No bad link were found
-I---------------------------------------------------
----------------------------------------------------------------
-I- Stages Status Report:
    STAGE                                    Errors Warnings
    Bad GUIDs/LIDs Check                     0      0     
    Link State Active Check                  0      0     
    General Devices Info Report              0      0     
    Performance Counters Report              0      0     
    Partitions Check                         0      0     
    IPoIB Subnets Check                      0      1     

Answer 1

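InfiniBand partition keys use the highest bit (0x8000) to indicate whether a host is a full member of the partition. Limited membership means the host can communicate only with full members of the partition, while full members can communicate with both limited and full members.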
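
The bit layout and the membership rule described above can be written out as a few lines of arithmetic. This is only a minimal Python sketch for illustration, not code from any InfiniBand library; the helper names split_pkey and can_communicate are made up here.

```python
FULL_MEMBER_BIT = 0x8000  # top bit of the 16-bit P_Key: set means full member
BASE_MASK = 0x7FFF        # low 15 bits identify the partition itself


def split_pkey(pkey):
    """Return (partition_base, is_full_member) for a 16-bit P_Key."""
    return pkey & BASE_MASK, bool(pkey & FULL_MEMBER_BIT)


def can_communicate(pkey_a, pkey_b):
    """Apply the rule from the text: the two keys must name the same
    partition, and at least one side must be a full member."""
    base_a, full_a = split_pkey(pkey_a)
    base_b, full_b = split_pkey(pkey_b)
    return base_a == base_b and (full_a or full_b)
```

With these helpers, split_pkey(0x8001) yields partition 0x0001 with full membership, and two limited members of the same partition (0x0001 and 0x0001) are reported as unable to talk to each other.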

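In your case, try setting the partition key on the switch to 0x1 and giving the hosts full membership. The hosts then see the key as 0x8001, which is 0x1 with the full-membership bit set.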
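
To check this on a host, it may help to compare the expected child interface name with the P_Key table that the subnet manager has programmed into the port. The sketch below is hedged: child_interface_name and read_pkey_table are hypothetical helper names, the naming rule (full-membership bit always set, four hex digits) is inferred from the ib1.8001 interface in the question, and the sysfs layout is assumed to be the usual /sys/class/infiniband/DEVICE/ports/PORT/pkeys directory with one file per table slot.

```python
import os


def child_interface_name(parent, pkey):
    """Expected IPoIB child interface name for `pkey` on `parent`.

    Assumption: the full-membership bit is always set and the key is
    printed as four lowercase hex digits, as seen with ib1.8001.
    """
    if pkey > 0xFFFF or (pkey & 0x7FFF) == 0:
        raise ValueError("P_Key must have a non-zero 15-bit partition number")
    full_key = pkey | 0x8000
    return "{}.{:04x}".format(parent, full_key)


def read_pkey_table(pkeys_dir):
    """Return the non-zero P_Keys found in a sysfs pkeys directory.

    Each file in the directory is named after its table index and
    holds one hex value such as 0xffff.
    """
    keys = []
    for entry in sorted(os.listdir(pkeys_dir), key=int):
        with open(os.path.join(pkeys_dir, entry)) as fh:
            value = int(fh.read().strip(), 16)
        if value:  # 0x0000 marks an unused table slot
            keys.append(value)
    return keys
```

On host2 the directory would presumably be /sys/class/infiniband/mlx4_0/ports/2/pkeys, going by the ibstat output (device mlx4_0, active port 2). If 0x8001 is missing from the list returned by read_pkey_table, that would be consistent with the "P_Key 0x8001 is not found" message in dmesg.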
