Dell M620 XenServer network problem

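I am running into a very strange problem with XenServer 7.0 (I also tried 7.1) on a Dell M620 with BCM57810 NICs.

The whole setup is fine and runs perfectly while there is no traffic. I have a Windows Server 2016 VM running and can reach it over RDC through a VyOS firewall and so on. On another VM I wanted to run an ownCloud instance, so I added another IP to the network interface and forwarded the traffic to it. As soon as I access the ownCloud HTTP interface, the whole server crashes with a kernel panic and error messages related to the Broadcom network driver.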

device tap13.0 left promiscuous mode
device vif13.0 left promiscuous mode
------------[ cut here ]------------
WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0x1a4/0x280()
NETDEV WATCHDOG: eth0 (bnx2x): transmit queue 0 timed out
Modules linked in: btrfs zlib_deflate raid6_pq xor xfs tun nfsv3 nfs fscache bnx2fc(O) cnic(O) uio fcoe libfcoe libfc scsi_transport_fc scsi_tgt openvswitch(O) gre 8021q garp mrp stp llc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp xt_multiport dm_multipath xt_conntrack nf_conntrack iptable_filter ipmi_devintf coretemp crc32_pclmul aesni_intel aes_x86_64 ablk_helper cryptd lrw lpc_ich mfd_core sg ipmi_si ipmi_msghandler wmi sb_edac edac_core hed shpchp nfsd auth_rpcgss oid_registry nfs_acl lockd nls_utf8 isofs sunrpc ip_tables x_tables hid_generic usbhid hid sd_mod ahci libahci libata bnx2x(O) ehci_pci ehci_hcd mdio libcrc32c ptp megaraid_sas(O) pps_core scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc scsi_dh_alua scsi_dh scsi_mod ipv6 autofs4
CPU: 6 PID: 0 Comm: swapper/6 Tainted: G           O 3.10.0+10 #1
Hardware name: Dell Inc. PowerEdge M620/0VHRN7, BIOS 2.5.4 01/27/2016
 0000000000000009 ffff8801354c3d58 ffffffff815427c7 ffff8801354c3d90
 ffffffff81054da1 ffff88012e210000 0000000000000000 0000000000000006
 ffff88012efe7100 ffff88012efe7080 ffff8801354c3df0 ffffffff81054e0c
Call Trace:
 <IRQ>  [<ffffffff815427c7>] dump_stack+0x19/0x1b
 [<ffffffff81054da1>] warn_slowpath_common+0x61/0x80
 [<ffffffff81054e0c>] warn_slowpath_fmt+0x4c/0x50
 [<ffffffff8149cd44>] dev_watchdog+0x1a4/0x280
 [<ffffffff8149cba0>] ? dev_deactivate_queue.constprop.29+0x60/0x60
 [<ffffffff81063cd3>] call_timer_fn+0x53/0x130
 [<ffffffff8149cba0>] ? dev_deactivate_queue.constprop.29+0x60/0x60
 [<ffffffff810658fd>] run_timer_softirq+0x22d/0x290
 [<ffffffff8105d48b>] __do_softirq+0xfb/0x240
 [<ffffffff8155255c>] call_softirq+0x1c/0x30
 [<ffffffff81014203>] do_softirq+0x43/0x80
 [<ffffffff8105d6d9>] irq_exit+0x49/0xa0
 [<ffffffff81384b55>] xen_evtchn_do_upcall+0x35/0x50
 [<ffffffff815525be>] xen_do_hypervisor_callback+0x1e/0xa0
 <EOI>  [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
 [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
 [<ffffffff8100a340>] ? xen_safe_halt+0x10/0x30
 [<ffffffff8101a844>] ? default_idle+0x44/0xd0
 [<ffffffff8101b038>] ? arch_cpu_idle+0x18/0x30
 [<ffffffff810a3532>] ? cpu_startup_entry+0x1c2/0x280
 [<ffffffff8152e11d>] ? cpu_bringup_and_idle+0x13/0x15
---[ end trace 3267d319304e6e4c ]---
ULP_STOP
bnx2fc: ERROR:bnx2fc_destroy_timer - Destroy compl not received!!
bnx2x: [bnx2x_stats_comp:211(eth0)]timeout waiting for stats finished
bnx2x: [bnx2x_stats_comp:211(eth0)]timeout waiting for stats finished
[bnx2x_clean_tx_queue:1624(eth0)]timeout waiting for queue[0]: txdata->tx_pkt_prod(17962) != txdata->tx_pkt_cons(17955)
[bnx2x_clean_tx_queue:1624(eth0)]timeout waiting for queue[24]: txdata->tx_pkt_prod(49476) != txdata->tx_pkt_cons(49474)
[bnx2x_clean_tx_queue:1624(eth0)]timeout waiting for queue[0]: txdata->tx_pkt_prod(17962) != txdata->tx_pkt_cons(17955)
[bnx2x_clean_tx_queue:1624(eth0)]timeout waiting for queue[24]: txdata->tx_pkt_prod(49476) != txdata->tx_pkt_cons(49474)
[bnx2x_state_wait:329(eth0)]timeout waiting for state 0
bnx2x: [bnx2x_del_all_macs:9335(eth0)]Failed to delete MACs: -16
bnx2x: [bnx2x_chip_cleanup:10164(eth0)]Failed to schedule DEL commands for UC MACs list: -16
[bnx2x_state_wait:329(eth0)]timeout waiting for state 9
[bnx2x_state_wait:329(eth0)]timeout waiting for state 2
bnx2x: [bnx2x_func_stop:9935(eth0)]FUNC_STOP ramrod failed. Running a dry transaction
bnx2x: [bnx2x_issue_dmae_with_comp:757(eth0)]DMAE timeout!
bnx2x: [bnx2x_write_dmae:806(eth0)]DMAE returned failure -1
bnx2x: [bnx2x_issue_dmae_with_comp:757(eth0)]DMAE timeout!
bnx2x: [bnx2x_write_dmae:806(eth0)]DMAE returned failure -1
bnx2x: [bnx2x_issue_dmae_with_comp:757(eth0)]DMAE timeout!
bnx2x: [bnx2x_write_dmae:806(eth0)]DMAE returned failure -1
bnx2x: [bnx2x_issue_dmae_with_comp:757(eth0)]DMAE timeout!
bnx2x: [bnx2x_write_dmae:806(eth0)]DMAE returned failure -1

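The network diagram is as follows: (image not available)

Unfortunately, I cannot install the vendor driver, because I do not have the kernel headers needed to compile it manually.

I tried disabling the virtual interfaces in the NIC configuration, without success. Setting disable_tpa or other module parameters did not help either.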
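For anyone retracing this step, a module parameter such as disable_tpa is usually passed to bnx2x through a modprobe options file. The sketch below is an illustration only: the file name bnx2x.conf and the helper bnx2x_option_line are assumptions, not something taken from the post, and the option only takes effect the next time the module is loaded. It prints the line instead of writing it.

```shell
#!/bin/sh
# Sketch: build a modprobe.d options line for the bnx2x driver.
# Assumptions (not from the original post): module options are read from
# /etc/modprobe.d/*.conf, and the file name bnx2x.conf is arbitrary.
CONF="${CONF:-/etc/modprobe.d/bnx2x.conf}"

bnx2x_option_line() {
    # Builds the options line for one parameter, e.g. disable_tpa=1.
    printf 'options bnx2x %s=%s\n' "$1" "$2"
}

# List the parameters that the installed driver build actually accepts.
modinfo -p bnx2x 2>/dev/null || echo "bnx2x module not found on this system"

# Print the line that would go into $CONF; redirect it there as root to apply.
echo "# would be written to $CONF:"
bnx2x_option_line disable_tpa 1
```

Checking the parameter list first is worthwhile because the set of accepted parameters can differ between driver builds.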

I hope someone has an idea.

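Answer 1

I recently ran into the same problem with XenServer 7.1 and an Ubuntu VM.

Server: Dell R730

NIC: Broadcom Limited NetXtreme II BCM57800 1/10 Gigabit Ethernet (rev 10)

In my case, the problem was the VLAN handling.

When I handled the VLANs on Xen and attached 4 virtual NICs, each on a selected VLAN in XenServer, to the VM, the whole physical server crashed repeatedly within 7-10 minutes of starting that VM.

One workaround is to pass the whole eth0 interface to the VM and then handle the VLANs inside the VM (eth0.100, eth0.200, etc.).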
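The in-guest VLAN handling could look roughly like the sketch below. It assumes a Linux guest with iproute2 and the 8021q module, that the passed-through interface shows up as eth0 in the guest, and that the VLAN IDs are 100 and 200 (taken from the interface names above). The variables PARENT, VLANS and DRY_RUN and the helper names are made up for illustration. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: create VLAN sub-interfaces inside the guest.
# Assumptions (not from the original answer): Linux guest with iproute2,
# parent interface named eth0, VLAN IDs 100 and 200.
PARENT="${PARENT:-eth0}"
VLANS="${VLANS:-100 200}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them (needs root)

run() {
    # Prints the command in dry-run mode, executes it otherwise.
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

vlan_setup() {
    # Load 802.1Q support, then add and bring up one sub-interface per VLAN ID.
    run modprobe 8021q
    for id in $VLANS; do
        run ip link add link "$PARENT" name "${PARENT}.${id}" type vlan id "$id"
        run ip link set dev "${PARENT}.${id}" up
    done
}

vlan_setup
```

Run it with DRY_RUN=0 as root to actually create the interfaces. IP addressing on eth0.100 and eth0.200 is left out because the answer does not describe it.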
