I have an Ubuntu 16.04 server that sits on a LAN where a few dozen machines need read/write access to it over a Samba share. It had been running on a single gigabit card, but I decided to try bonding to improve the server's overall transfer rate. I installed four 1 Gb NICs and have successfully configured the bond0 interface, as shown below:
#> cat /etc/network/interfaces
# The loopback network interface
auto lo
iface lo inet loopback
auto enp1s6f0
iface enp1s6f0 inet manual
bond-master bond0
auto enp1s6f1
iface enp1s6f1 inet manual
bond-master bond0
auto enp1s7f0
iface enp1s7f0 inet manual
bond-master bond0
auto enp1s7f1
iface enp1s7f1 inet manual
bond-master bond0
# The primary network interface
auto bond0
iface bond0 inet static
address 192.168.111.8
netmask 255.255.255.0
network 192.168.111.0
broadcast 192.168.111.255
gateway 192.168.111.1
dns-nameservers 192.168.111.11
bond-mode 6
bond-miimon 100
bond-lacp-rate 1
bond-slaves enp1s6f0 enp1s6f1 enp1s7f0 enp1s7f1
#> ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: enp1s6f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP group default qlen 1000
link/ether 00:09:6b:1a:03:6c brd ff:ff:ff:ff:ff:ff
3: enp1s6f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP group default qlen 1000
link/ether 00:09:6b:1a:03:6d brd ff:ff:ff:ff:ff:ff
4: enp1s7f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP group default qlen 1000
link/ether 00:09:6b:1a:01:ba brd ff:ff:ff:ff:ff:ff
5: enp1s7f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master bond0 state UP group default qlen 1000
link/ether 00:09:6b:1a:01:bb brd ff:ff:ff:ff:ff:ff
6: bond0: <BROADCAST,MULTICAST,MASTER,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 00:09:6b:1a:03:6d brd ff:ff:ff:ff:ff:ff
inet 192.168.111.8/24 brd 192.168.111.255 scope global bond0
valid_lft forever preferred_lft forever
inet6 fe80::209:6bff:fe1a:36d/64 scope link
valid_lft forever preferred_lft forever
#> ifconfig
bond0 Link encap:Ethernet HWaddr 00:09:6b:1a:03:6d
inet addr:192.168.111.8 Bcast:192.168.111.255 Mask:255.255.255.0
inet6 addr: fe80::209:6bff:fe1a:36d/64 Scope:Link
UP BROADCAST RUNNING MASTER MULTICAST MTU:1500 Metric:1
RX packets:30848499 errors:0 dropped:45514 overruns:0 frame:0
TX packets:145615150 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3344795597 (3.3 GB) TX bytes:407934338759 (407.9 GB)
enp1s6f0 Link encap:Ethernet HWaddr 00:09:6b:1a:03:6c
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:7260526 errors:0 dropped:15171 overruns:0 frame:0
TX packets:36216191 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:453705851 (453.7 MB) TX bytes:101299060589 (101.2 GB)
enp1s6f1 Link encap:Ethernet HWaddr 00:09:6b:1a:03:6d
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:8355652 errors:0 dropped:0 overruns:0 frame:0
TX packets:38404078 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:513634676 (513.6 MB) TX bytes:107762014012 (107.7 GB)
enp1s7f0 Link encap:Ethernet HWaddr 00:09:6b:1a:01:ba
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:6140007 errors:0 dropped:15171 overruns:0 frame:0
TX packets:36550756 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:382222165 (382.2 MB) TX bytes:102450666514 (102.4 GB)
enp1s7f1 Link encap:Ethernet HWaddr 00:09:6b:1a:01:bb
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:9092314 errors:0 dropped:15171 overruns:0 frame:0
TX packets:34444125 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:1995232905 (1.9 GB) TX bytes:96422597644 (96.4 GB)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:35 errors:0 dropped:0 overruns:0 frame:0
TX packets:35 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1
RX bytes:2640 (2.6 KB) TX bytes:2640 (2.6 KB)
To test the transfer rate, I had 8 Windows machines copying 2 TB of files.
#> iftop -B -i bond0
25.5MB 50.9MB 76.4MB 102MB 127MB
+-------------------------------------------------------------------------
192.168.111.8 => 192.168.111.186 11.8MB 12.4MB 14.7MB
<= 126KB 124KB 102KB
192.168.111.8 => 192.168.111.181 12.4MB 12.1MB 7.83MB
<= 121KB 105KB 55.1KB
192.168.111.8 => 192.168.111.130 11.5MB 11.0MB 12.6MB
<= 106KB 88.5KB 77.1KB
192.168.111.8 => 192.168.111.172 10.4MB 10.9MB 14.2MB
<= 105KB 100KB 92.2KB
192.168.111.8 => 192.168.111.179 9.76MB 9.86MB 4.20MB
<= 101KB 77.0KB 28.8KB
192.168.111.8 => 192.168.111.182 9.57MB 9.72MB 5.97MB
<= 91.4KB 72.4KB 37.9KB
192.168.111.8 => 192.168.111.161 8.01MB 9.51MB 12.9MB
<= 71.5KB 60.6KB 72.7KB
192.168.111.8 => 192.168.111.165 9.46MB 5.29MB 1.32MB
<= 100.0KB 58.2KB 14.6KB
192.168.111.8 => 192.168.111.11 73B 136B 56B
<= 112B 198B 86B
192.168.111.255 => 192.168.111.132 0B 0B 0B
<= 291B 291B 291B
--------------------------------------------------------------------------
TX:      cum:   3.61GB   peak:   85       rates:   83.0MB  80.7MB  73.7MB
RX:              22.0MB            823KB             823KB   687KB   481KB
TOTAL:           3.63GB            86.0MB            83.8MB  81.4MB  74.2MB
As you can see from iftop, my transfer rate is only around 80 MB/s, roughly what I get with a single NIC. My CPU is about 90% idle and the data is being read from and written to a 14-drive ZFS pool, so I don't think I have a drive bottleneck. I don't have a fancy switch, just a basic Netgear ProSafe, this one: http://www.newegg.com/Product/Product.aspx?Item=N82E16833122058. Everything I have read about modes 5 and 6, however, says that no special switch is required.
I don't need any single connection to exceed 1 Gbit/s, but I would like the total across all connections to be able to exceed 1 Gbit/s. Am I missing some other configuration setting, or does Samba impose some limit here? If bonding can't do what I want, is there another solution? Is SMB3 multichannel ready for production use?
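For context on that last question: if I were to try SMB3 multichannel instead of bonding, my understanding is that the server-side change would look roughly like the snippet below. This is only a sketch; it assumes a Samba build new enough to carry the (still experimental) multichannel option, and I have not tested it.
[global]
# assumption: needs a Samba version that ships multichannel support
server multi channel support = yes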
Edit:
Here is the output of the commands Tom asked for.
#> iostat -dx 5
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 0.00 0.00 489.00 11.80 6400.00 45.60 25.74 0.25 0.49 0.46 1.81 0.30 14.94
sdc 0.00 0.00 476.40 11.40 6432.80 44.00 26.56 0.28 0.57 0.55 1.61 0.32 15.76
sda 0.00 0.00 486.00 11.20 6374.40 43.20 25.81 0.26 0.53 0.50 1.84 0.31 15.36
sdh 0.00 0.00 489.60 13.00 6406.40 50.40 25.69 0.26 0.52 0.48 1.72 0.31 15.38
sdf 0.00 0.00 494.00 12.60 6376.00 48.80 25.36 0.26 0.52 0.49 1.67 0.31 15.88
sdd 0.00 0.00 481.60 12.00 6379.20 46.40 26.04 0.29 0.60 0.57 1.75 0.34 16.68
sde 0.00 0.00 489.80 12.20 6388.00 47.20 25.64 0.30 0.59 0.56 1.82 0.34 16.88
sdg 0.00 0.00 487.40 13.00 6400.80 50.40 25.78 0.27 0.53 0.50 1.75 0.32 16.24
sdj 0.00 0.00 481.40 11.40 6427.20 44.00 26.26 0.28 0.56 0.54 1.74 0.33 16.10
sdi 0.00 0.00 483.80 11.60 6424.00 44.80 26.12 0.26 0.52 0.49 1.67 0.31 15.14
sdk 0.00 0.00 492.60 8.60 6402.40 32.80 25.68 0.25 0.49 0.46 2.28 0.31 15.42
sdm 0.00 0.00 489.80 10.40 6421.60 40.00 25.84 0.25 0.51 0.47 2.23 0.32 16.18
sdn 0.00 0.00 489.60 10.00 6404.80 39.20 25.80 0.24 0.49 0.46 1.92 0.29 14.38
sdl 0.00 0.00 498.40 8.40 6392.00 32.00 25.35 0.25 0.50 0.47 1.93 0.31 15.48
sdo 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
#> zpool iostat -v 5
capacity operations bandwidth
pool alloc free read write read write
---------------------------------------------- ----- ----- ----- ----- ----- -----
backup 28.9T 9.13T 534 0 65.9M 0
raidz2 28.9T 9.13T 534 0 65.9M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHT17HA - - 422 0 4.77M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHSRD6A - - 413 0 4.79M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHRZWYA - - 415 0 4.78M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHSRS2A - - 417 0 4.77M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHR2DPA - - 397 0 4.83M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHN0P0A - - 418 0 4.78M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHU34LA - - 419 0 4.76M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHRHUEA - - 417 0 4.78M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHM0HBA - - 413 0 4.78M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHJG4LA - - 410 0 4.79M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHST58A - - 417 0 4.78M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHS0G5A - - 418 0 4.78M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHN2D4A - - 414 0 4.80M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHR2G5A - - 417 0 4.79M 0
---------------------------------------------- ----- ----- ----- ----- ----- -----
So I do in fact have several switches in my office, but at the moment all four network ports of this machine are plugged into the same 24-port switch that the client Windows machines are connected to, so all of this traffic should stay within that one switch. Traffic to the internet and to our internal DNS does have to cross a link to another switch, but I don't think that affects this issue.
Edit #2, adding some additional information:
#> cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)
Bonding Mode: adaptive load balancing
Primary Slave: None
Currently Active Slave: enp1s6f1
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Slave Interface: enp1s6f1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:09:6b:1a:03:6d
Slave queue ID: 0
Slave Interface: enp1s6f0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:09:6b:1a:03:6c
Slave queue ID: 0
Slave Interface: enp1s7f0
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:09:6b:1a:01:ba
Slave queue ID: 0
Slave Interface: enp1s7f1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:09:6b:1a:01:bb
Slave queue ID: 0
Edit #3:
#> zfs list -o name,recsize,compress
NAME RECSIZE COMPRESS
backup 128K off
backup/Accounting 128K off
backup/Archive 128K off
backup/Documents 128K off
backup/Library 128K off
backup/Media 128K off
backup/photos 128K off
backup/Projects 128K off
backup/Temp 128K off
backup/Video 128K off
backup/Zip 128K off
Disk read test, reading a single file:
#>dd if=MasterDynamic_Spray_F1332.tpc of=/dev/null
9708959+1 records in
9708959+1 records out
4970987388 bytes (5.0 GB, 4.6 GiB) copied, 77.755 s, 63.9 MB/s
While the dd test above was running, I grabbed a zpool iostat:
#>zpool iostat -v 5
capacity operations bandwidth
pool alloc free read write read write
---------------------------------------------- ----- ----- ----- ----- ----- -----
backup 28.9T 9.07T 515 0 64.0M 0
raidz2 28.9T 9.07T 515 0 64.0M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHT17HA - - 413 0 4.62M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHSRD6A - - 429 0 4.60M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHRZWYA - - 431 0 4.59M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHSRS2A - - 430 0 4.59M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHR2DPA - - 432 0 4.60M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHN0P0A - - 427 0 4.60M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHU34LA - - 405 0 4.65M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHRHUEA - - 430 0 4.58M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHM0HBA - - 431 0 4.58M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHJG4LA - - 427 0 4.60M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHST58A - - 429 0 4.59M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHS0G5A - - 428 0 4.59M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHN2D4A - - 427 0 4.60M 0
ata-Hitachi_HUA723030ALA640_MK0371YVHR2G5A - - 428 0 4.59M 0
---------------------------------------------- ----- ----- ----- ----- ----- -----
Answer 1
The ifconfig output shows that transmitted bytes are balanced quite evenly across all four interfaces, so in that sense the bonding is working.
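If you want to double-check that balance while a copy is running, the per-interface kernel counters are enough. A quick sketch using the interface names from your output:
#> for i in enp1s6f0 enp1s6f1 enp1s7f0 enp1s7f1; do printf '%s ' "$i"; cat /sys/class/net/$i/statistics/tx_bytes; done
Run it twice a few seconds apart and compare the deltas; with balance-alb they should all grow at roughly the same rate.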
Based on the iostat output, this looks to me like a disk IOPS (I/Os per second) bottleneck. Each disk is averaging around 400-500 IOPS at a request size of 12-16 kB. If those I/Os are not sequential, you may be hitting the drives' random-I/O limit: on traditional spinning disks that limit comes from the combination of rotational speed and the time spent moving the read head, and a purely random workload tops out at roughly 100 IOPS on drives like these.
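As a rough sanity check (approximate numbers taken from the iostat output above):
~485 reads/s x ~13 kB per request ≈ 6.3 MB/s per disk (which matches the rkB/s column)
14 disks x ~6.4 MB/s ≈ 90 MB/s of raw reads
That is the same ballpark as the ~80 MB/s you see leaving the box in iftop, which points at the disks rather than the network as the ceiling.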
The way ZFS handles striping makes this worse. Unlike traditional RAID-5 or RAID-6, the ZFS equivalents raidz and raidz2 force the drives to work in lock-step, so in practice, even with 14 drives in the pool, you only get the random IOPS of a single drive.
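Purely to illustrate the lock-step point (not a recommendation to rebuild this particular pool): a pool built from striped mirrors gains a vdev's worth of random IOPS for every mirror pair, so a layout like
#> zpool create tank mirror sda sdb mirror sdc sdd mirror sde sdf
scales random IOPS with the number of vdevs, whereas one wide 14-disk raidz2 vdev behaves like a single drive for random I/O.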
You should test again to isolate disk performance: either run reads on their own (for example, a few instances of dd if=bigfile of=/dev/null at the same time), or try a pure network load test such as iperf. A sketch of both is below.
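Something like this (the file names are placeholders; the iperf flags are the classic iperf2 ones):
# several concurrent sequential reads, no network involved
#> for f in big1 big2 big3 big4; do dd if="$f" of=/dev/null bs=1M & done; wait
# pure network test: iperf server on the Ubuntu box, then several clients at once
#> iperf -s
# on each client, in parallel:
#> iperf -c 192.168.111.8 -t 30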
Answer 2
Mode 0 (round-robin), mode 3 (broadcast), mode 5 (balance-tlb) and mode 6 (balance-alb) are poor bonding modes for TCP streams such as Samba/CIFS, NFS and iSCSI, because these modes do not guarantee in-order delivery, which TCP relies on to avoid triggering its congestion control.
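If you want to see whether out-of-order delivery is actually being observed on the server side, the kernel keeps counters for it; a rough check (the exact wording of the output varies by kernel version):
#> netstat -s | grep -i reorder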
For single TCP streams like yours, you should use mode 1 (active-backup), mode 2 (balance-xor) or mode 4 (802.3ad). Any one stream will be limited to the speed of a single slave; there is no good way to balance a single large TCP stream across multiple physical interfaces.
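For reference, moving the existing stanza to 802.3ad would be a small edit along these lines (a sketch based on the interfaces file in the question, not tested here). Keep in mind that mode 4 needs a switch configured for LACP, so it will not work behind an unmanaged switch:
auto bond0
iface bond0 inet static
address 192.168.111.8
netmask 255.255.255.0
gateway 192.168.111.1
dns-nameservers 192.168.111.11
bond-mode 4
bond-miimon 100
bond-lacp-rate 1
bond-xmit-hash-policy layer3+4
bond-slaves enp1s6f0 enp1s6f1 enp1s7f0 enp1s7f1
Even with the layer3+4 hash policy, each individual conversation still rides a single slave, which matches the limitation described above.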
If you need more than a single slave's worth of speed, get faster network infrastructure.