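In short: while setting up an InfiniBand connection between two servers, I cannot complete the RDMA latency test. It crashes, and even the ssh connection drops.
The long story. The first server runs Xen 4.4 with Ubuntu 14.04 as dom0 (hostname xen). The second is a plain Ubuntu 14.04 server (hostname node3). Both have a Mellanox MT25208 HCA connected through an IB switch. Both have all the kernel modules loaded and OpenSM installed. IPoIB works fine. A bare ibping works in both directions, xen -> node3 and node3 -> xen. The problem appears when I try to run the ib_rdma_lat test. Here are the steps that lead to the crash of ib_rdma_lat and sshd on xen.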
- On xen I run ib_rdma_lat.
- On node3 I run ib_rdma_lat xen.
- The ssh connection to xen closes.
- Here is the output just before the ssh connection closed:
root@xen:~/tmp/22# ib_rdma_lat
   local address: LID 0x03 QPN 0x10406 PSN 0x9f903b RKey 0x40004000 VAddr 0x000000017e4001
  remote address: LID 0x01 QPN 0x10406 PSN 0xd8c16e RKey 0x20004000 VAddr 0x000000013fd001
Connection to xen closed by remote host.
Connection to xen closed.
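I googled it, and the only thing I found to try was tuning the ib_mthca module parameters num_mtt and log_mtts_per_seg, as described in the article http://community.mellanox.com/docs/DOC-1120. I set them to num_mtt=4194304 and log_mtts_per_seg=4 on both servers, after experimenting with the values until the ib_mthca module loaded correctly. It did not help: ib_rdma_lat still crashes on xen. Here is the log: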
Aug 4 00:12:52 localhost kernel: [ 4011.170180] ib_rdma_lat invoked oom-killer: gfp_mask=0x0, order=0, oom_score_adj=0
Aug 4 00:12:52 localhost kernel: [ 4011.170189] ib_rdma_lat cpuset=/ mems_allowed=0
Aug 4 00:12:52 localhost kernel: [ 4011.170195] CPU: 0 PID: 2889 Comm: ib_rdma_lat Tainted: G B W 3.13.0-32-generic #57-Ubuntu
Aug 4 00:12:52 localhost kernel: [ 4011.170198] Hardware name: Supermicro X9DRFF-iG+/-7G+/-iTG+/-7TG+/X9DRFF-iG+/-7G+/-iTG+/-7TG+, BIOS 3.0 07/29/2013
Aug 4 00:12:52 localhost kernel: [ 4011.170202] 0000000000000000 ffff880f175ebc68 ffffffff8171bcb4 ffff880f1ae02fe0
Aug 4 00:12:52 localhost kernel: [ 4011.170209] ffff880f175ebcf0 ffffffff817165ef ffff880f1a96afe0 0000000000000000
Aug 4 00:12:52 localhost kernel: [ 4011.170213] 00000000016ad5c1 ffff880f1a96afe0 ffffffff817246aa ffffffff8172417b
Aug 4 00:12:52 localhost kernel: [ 4011.170217] Call Trace:
Aug 4 00:12:52 localhost kernel: [ 4011.170236] [<ffffffff8171bcb4>] dump_stack+0x45/0x56
Aug 4 00:12:52 localhost kernel: [ 4011.170242] [<ffffffff817165ef>] dump_header+0x7f/0x1f1
Aug 4 00:12:52 localhost kernel: [ 4011.170248] [<ffffffff817246aa>] ? error_exit+0x2a/0x60
Aug 4 00:12:52 localhost kernel: [ 4011.170253] [<ffffffff8172417b>] ? retint_restore_args+0x5/0x6
Aug 4 00:12:52 localhost kernel: [ 4011.170260] [<ffffffff81151bfe>] oom_kill_process+0x1ce/0x330
Aug 4 00:12:52 localhost kernel: [ 4011.170269] [<ffffffff812d3ac5>] ? security_capable_noaudit+0x15/0x20
Aug 4 00:12:52 localhost kernel: [ 4011.170273] [<ffffffff81152334>] out_of_memory+0x414/0x450
Aug 4 00:12:52 localhost kernel: [ 4011.170278] [<ffffffff811523df>] pagefault_out_of_memory+0x6f/0x80
Aug 4 00:12:52 localhost kernel: [ 4011.170284] [<ffffffff81714c38>] mm_fault_error+0x8e/0x180
Aug 4 00:12:52 localhost kernel: [ 4011.170289] [<ffffffff81727f01>] __do_page_fault+0x4a1/0x560
Aug 4 00:12:52 localhost kernel: [ 4011.170299] [<ffffffff81111116>] ? __acct_update_integrals+0x76/0xe0
Aug 4 00:12:52 localhost kernel: [ 4011.170305] [<ffffffff8111155c>] ? acct_account_cputime+0x1c/0x20
Aug 4 00:12:52 localhost kernel: [ 4011.170312] [<ffffffff8109d7db>] ? account_user_time+0x8b/0xa0
Aug 4 00:12:52 localhost kernel: [ 4011.170316] [<ffffffff8109ddf4>] ? vtime_account_user+0x54/0x60
Aug 4 00:12:52 localhost kernel: [ 4011.170320] [<ffffffff81727fda>] do_page_fault+0x1a/0x70
Aug 4 00:12:52 localhost kernel: [ 4011.170324] [<ffffffff81724448>] page_fault+0x28/0x30
Aug 4 00:12:52 localhost kernel: [ 4011.170326] Mem-Info:
Aug 4 00:12:52 localhost kernel: [ 4011.170329] Node 0 DMA per-cpu:
Aug 4 00:12:52 localhost kernel: [ 4011.170334] CPU 0: hi: 0, btch: 1 usd: 0
Aug 4 00:12:52 localhost kernel: [ 4011.170336] Node 0 DMA32 per-cpu:
Aug 4 00:12:52 localhost kernel: [ 4011.170339] CPU 0: hi: 186, btch: 31 usd: 135
Aug 4 00:12:52 localhost kernel: [ 4011.170341] Node 0 Normal per-cpu:
Aug 4 00:12:52 localhost kernel: [ 4011.170344] CPU 0: hi: 186, btch: 31 usd: 124
Aug 4 00:12:52 localhost kernel: [ 4011.170351] active_anon:7920 inactive_anon:23 isolated_anon:0
Aug 4 00:12:52 localhost kernel: [ 4011.170351] active_file:20177 inactive_file:37521 isolated_file:0
Aug 4 00:12:52 localhost kernel: [ 4011.170351] unevictable:8 dirty:0 writeback:0 unstable:0
Aug 4 00:12:52 localhost kernel: [ 4011.170351] free:15211440 slab_reclaimable:4583 slab_unreclaimable:8427
Aug 4 00:12:52 localhost kernel: [ 4011.170351] mapped:4644 shmem:408 pagetables:993 bounce:0
Aug 4 00:12:52 localhost kernel: [ 4011.170351] free_cma:0
Aug 4 00:12:52 localhost kernel: [ 4011.170358] Node 0 DMA free:15888kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15972kB managed:15888kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Aug 4 00:12:52 localhost kernel: [ 4011.170367] lowmem_reserve[]: 0 1980 60135 60135
Aug 4 00:12:52 localhost kernel: [ 4011.170372] Node 0 DMA32 free:2017364kB min:1032kB low:1288kB high:1548kB active_anon:992kB inactive_anon:4kB active_file:2596kB inactive_file:5756kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:2045472kB managed:2031128kB mlocked:0kB dirty:0kB writeback:0kB mapped:692kB shmem:32kB slab_reclaimable:428kB slab_unreclaimable:472kB kernel_stack:40kB pagetables:132kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Aug 4 00:12:52 localhost kernel: [ 4011.170381] lowmem_reserve[]: 0 0 58154 58154
Aug 4 00:12:52 localhost kernel: [ 4011.170386] Node 0 Normal free:58812508kB min:30348kB low:37932kB high:45520kB active_anon:30688kB inactive_anon:88kB active_file:78112kB inactive_file:144328kB unevictable:32kB isolated(anon):0kB isolated(file):0kB present:60853112kB managed:59550432kB mlocked:32kB dirty:0kB writeback:0kB mapped:17884kB shmem:1600kB slab_reclaimable:17904kB slab_unreclaimable:33236kB kernel_stack:1704kB pagetables:3840kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Aug 4 00:12:52 localhost kernel: [ 4011.170394] lowmem_reserve[]: 0 0 0 0
Aug 4 00:12:52 localhost kernel: [ 4011.170398] Node 0 DMA: 0*4kB 0*8kB 1*16kB (U) 0*32kB 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15888kB
Aug 4 00:12:52 localhost kernel: [ 4011.170416] Node 0 DMA32: 1*4kB (M) 12*8kB (UEM) 7*16kB (UE) 2*32kB (UM) 1*64kB (U) 2*128kB (UM) 0*256kB 1*512kB (E) 1*1024kB (E) 2*2048kB (ER) 491*4096kB (M) = 2017364kB
Aug 4 00:12:52 localhost kernel: [ 4011.170434] Node 0 Normal: 67*4kB (UM) 34*8kB (UEM) 16*16kB (UEM) 38*32kB (UM) 26*64kB (UM) 22*128kB (UEM) 15*256kB (UEM) 2*512kB (M) 1*1024kB (M) 3*2048kB (UEM) 14354*4096kB (MR) = 58812508kB
Aug 4 00:12:52 localhost kernel: [ 4011.170468] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Aug 4 00:12:52 localhost kernel: [ 4011.170470] 58105 total pagecache pages
Aug 4 00:12:52 localhost kernel: [ 4011.170473] 0 pages in swap cache
Aug 4 00:12:52 localhost kernel: [ 4011.170476] Swap cache stats: add 0, delete 0, find 0/189
Aug 4 00:12:52 localhost kernel: [ 4011.170478] Free swap = 33517564kB
Aug 4 00:12:52 localhost kernel: [ 4011.170480] Total swap = 33517564kB
Aug 4 00:12:52 localhost kernel: [ 4011.170482] 15728639 pages RAM
Aug 4 00:12:52 localhost kernel: [ 4011.170483] 0 pages HighMem/MovableOnly
Aug 4 00:12:52 localhost kernel: [ 4011.170485] 325670 pages reserved
Aug 4 00:12:52 localhost kernel: [ 4011.170487] [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
Aug 4 00:12:52 localhost kernel: [ 4011.170496] [ 375] 0 375 4935 228 14 0 0 upstart-udev-br
Aug 4 00:12:52 localhost kernel: [ 4011.170501] [ 384] 0 384 12927 485 28 0 -1000 systemd-udevd
Aug 4 00:12:52 localhost kernel: [ 4011.170505] [ 571] 102 571 9887 391 23 0 0 dbus-daemon
Aug 4 00:12:52 localhost kernel: [ 4011.170509] [ 590] 101 590 63961 318 27 0 0 rsyslogd
Aug 4 00:12:52 localhost kernel: [ 4011.170513] [ 596] 0 596 4823 373 14 0 0 bluetoothd
Aug 4 00:12:52 localhost kernel: [ 4011.170516] [ 606] 0 606 18680 893 40 0 0 cupsd
Aug 4 00:12:52 localhost kernel: [ 4011.170520] [ 614] 0 614 5870 106 16 0 0 rpc.idmapd
Aug 4 00:12:52 localhost kernel: [ 4011.170523] [ 622] 0 622 10863 454 26 0 0 systemd-logind
Aug 4 00:12:52 localhost kernel: [ 4011.170528] [ 702] 0 702 3984 308 13 0 0 upstart-file-br
Aug 4 00:12:52 localhost kernel: [ 4011.170531] [ 877] 0 877 5855 275 18 0 0 rpcbind
Aug 4 00:12:52 localhost kernel: [ 4011.170534] [ 898] 111 898 5386 347 15 0 0 rpc.statd
Aug 4 00:12:52 localhost kernel: [ 4011.170538] [ 901] 0 901 3848 184 13 0 0 upstart-socket-
Aug 4 00:12:52 localhost kernel: [ 4011.170541] [ 1300] 105 1300 7861 513 21 0 0 ntpd
Aug 4 00:12:52 localhost kernel: [ 4011.170545] [ 1374] 0 1374 5268 237 13 0 0 getty
Aug 4 00:12:52 localhost kernel: [ 4011.170548] [ 1378] 0 1378 5268 235 13 0 0 getty
Aug 4 00:12:52 localhost kernel: [ 4011.170551] [ 1384] 0 1384 5268 237 13 0 0 getty
Aug 4 00:12:52 localhost kernel: [ 4011.170555] [ 1385] 0 1385 5268 238 13 0 0 getty
Aug 4 00:12:52 localhost kernel: [ 4011.170558] [ 1388] 0 1388 5268 238 13 0 0 getty
Aug 4 00:12:52 localhost kernel: [ 4011.170561] [ 1427] 0 1427 15341 762 33 0 -1000 sshd
Aug 4 00:12:52 localhost kernel: [ 4011.170564] [ 1443] 0 1443 5914 257 17 0 0 cron
Aug 4 00:12:52 localhost kernel: [ 4011.170568] [ 1554] 0 1554 2750 242 11 0 0 xenstored
Aug 4 00:12:52 localhost kernel: [ 4011.170571] [ 1566] 0 1566 22752 261 19 0 0 xenconsoled
Aug 4 00:12:52 localhost kernel: [ 4011.170575] [ 1613] 0 1613 73631 1045 48 0 0 polkitd
Aug 4 00:12:52 localhost kernel: [ 4011.170578] [ 1885] 113 1885 7052 249 18 0 0 dnsmasq
Aug 4 00:12:52 localhost kernel: [ 4011.170581] [ 2004] 0 2004 148275 997 39 0 0 console-kit-dae
Aug 4 00:12:52 localhost kernel: [ 4011.170585] [ 2166] 0 2166 23985 237 21 0 0 xl
Aug 4 00:12:52 localhost kernel: [ 4011.170589] [ 2303] 0 2303 5268 237 13 0 0 getty
Aug 4 00:12:52 localhost kernel: [ 4011.170592] [ 2378] 0 2378 82712 784 23 0 0 opensm
Aug 4 00:12:52 localhost kernel: [ 4011.170595] [ 2379] 0 2379 65942 358 22 0 0 opensm
Aug 4 00:12:52 localhost kernel: [ 4011.170598] [ 2450] 106 2450 91259 1269 74 0 0 whoopsie
Aug 4 00:12:52 localhost kernel: [ 4011.170602] [ 2453] 0 2453 93762 3220 114 0 0 libvirtd
Aug 4 00:12:52 localhost kernel: [ 4011.170605] [ 2634] 0 2634 26407 1058 54 0 0 sshd
Aug 4 00:12:52 localhost kernel: [ 4011.170608] [ 2671] 1000 2671 26407 501 52 0 0 sshd
Aug 4 00:12:52 localhost kernel: [ 4011.170612] [ 2672] 1000 2672 7041 1040 17 0 0 bash
Aug 4 00:12:52 localhost kernel: [ 4011.170615] [ 2749] 0 2749 17566 547 36 0 0 sudo
Aug 4 00:12:52 localhost kernel: [ 4011.170618] [ 2750] 0 2750 7063 1074 16 0 0 bash
Aug 4 00:12:52 localhost kernel: [ 4011.170622] [ 2889] 0 2889 3732 213 12 0 0 ib_rdma_lat
Aug 4 00:12:52 localhost kernel: [ 4011.170625] Out of memory: Kill process 2453 (libvirtd) score 0 or sacrifice child
Aug 4 00:12:52 localhost kernel: [ 4011.170729] Killed process 2453 (libvirtd) total-vm:375048kB, anon-rss:4748kB, file-rss:8132kB
xen (dom0) has 60GB of RAM. node3 has 180GB of RAM. Here are some logs and command outputs I collected to diagnose the problem.
- dmesg on xen: https://dl.dropboxusercontent.com/u/8057759/ib_mthca/dmesg.xen.log
- xl dmesg on xen: https://dl.dropboxusercontent.com/u/8057759/ib_mthca/xl-dmesg.xen.log
- ib_mthca loaded parameters on xen: https://dl.dropboxusercontent.com/u/8057759/ib_mthca/ib_mthca.xen.log
- ibhosts on xen: https://dl.dropboxusercontent.com/u/8057759/ib_mthca/ibhosts.xen.log
- ibstat on xen: https://dl.dropboxusercontent.com/u/8057759/ib_mthca/ibstat.xen.log
- ibstatus on xen: https://dl.dropboxusercontent.com/u/8057759/ib_mthca/ibstatus.xen.log
- lsmod | grep rdma on xen: https://dl.dropboxusercontent.com/u/8057759/ib_mthca/lsmod-rdma.xen.log
- lspci -s 04:00.0 -k on xen: https://dl.dropboxusercontent.com/u/8057759/ib_mthca/lspci.xen.log
- /var/log/syslog on xen, captured after the ib_rdma_lat crash: https://dl.dropboxusercontent.com/u/8057759/ib_mthca/syslog.xen
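For anyone reproducing this, the loaded module parameters can usually be read straight from sysfs. This is a minimal sketch, assuming ib_mthca exposes its parameters as readable files under /sys/module/ib_mthca/parameters; the helper name read_module_params is made up for illustration.

```python
from pathlib import Path


def read_module_params(module, sysfs_root="/sys/module"):
    """Return {parameter: value} for a loaded kernel module.

    Returns an empty dict if the module is not loaded or exposes
    no parameters through sysfs.
    """
    params_dir = Path(sysfs_root) / module / "parameters"
    if not params_dir.is_dir():
        return {}
    params = {}
    for entry in sorted(params_dir.iterdir()):
        try:
            params[entry.name] = entry.read_text().strip()
        except OSError:
            # Some parameters are not readable without root.
            params[entry.name] = None
    return params


if __name__ == "__main__":
    for name, value in read_module_params("ib_mthca").items():
        print(f"{name}={value}")
```

Running it on both hosts makes it easy to confirm that num_mtt and log_mtts_per_seg really took effect after the module was reloaded.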
Can anyone give me any advice?
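Answer 1
A few observations:
- You are running a very old device/driver (mthca) on a relatively new OS (Ubuntu 14.04). Unless someone corrects me, I don't believe anyone really tests this combination. Where did you get the driver from? Which OFED release is it?
- The test is terminated by the oom-killer, which means you are running out of memory. The MTT settings control how much memory can be registered/pinned; they matter when you have more physical memory than the default driver configuration lets you pin. That is not your problem. Please revert to the defaults.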
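To put numbers on the MTT point, here is a rough sketch of the usual rule of thumb for how much memory the HCA can register. It assumes num_mtt counts MTT segments, that each segment holds 2^log_mtts_per_seg entries, and that one entry maps one 4 KiB page; the function name is hypothetical and the formula is an approximation, not taken from the driver source.

```python
GIB = 1 << 30


def max_registrable_bytes(num_mtt, log_mtts_per_seg, page_size=4096):
    """Estimate the registrable memory implied by the MTT settings.

    Assumption: num_mtt segments, each with 2**log_mtts_per_seg
    entries, and one MTT entry per page of page_size bytes.
    """
    return num_mtt * (1 << log_mtts_per_seg) * page_size


if __name__ == "__main__":
    # Values from the question: num_mtt=4194304, log_mtts_per_seg=4
    limit = max_registrable_bytes(4194304, 4)
    print(limit / GIB)  # 256.0
```

Under these assumptions the values from the question allow about 256 GiB of registered memory, far more than the 60GB in dom0, so the MTT limit is unlikely to be what the test is hitting.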
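Now, I'm only guessing here, but it could be that THP (Transparent Huge Pages) is enabled on the machine and the test does not really know how to handle it, which causes it to allocate lots of huge pages as if they were normal pages.
Please disable THP and try again.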
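A small sketch for checking the current THP mode before and after changing it. It assumes the standard sysfs file /sys/kernel/mm/transparent_hugepage/enabled, where the active mode is the one shown in square brackets; the helper names are made up for illustration.

```python
from pathlib import Path

THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def parse_thp_mode(text):
    """Return the bracketed (active) mode, e.g. 'always' from '[always] madvise never'."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def current_thp_mode(path=THP_ENABLED):
    """Return the active THP mode, or None if the file cannot be read."""
    try:
        return parse_thp_mode(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print("THP mode:", current_thp_mode())
    # To disable THP until the next reboot, as root:
    #   echo never > /sys/kernel/mm/transparent_hugepage/enabled
```

If the reported mode is not "never", write "never" to that file as root and rerun ib_rdma_lat.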