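We just received two brand-new Supermicro 1028U-TN10RT+ servers. Each has 10 NVMe slots, two of which hold Intel DC P3600 800GB drives.
We were eager to test these drives because the spec sheet promises very good read (up to 2.6 GB/s) and write (up to 1 GB/s) performance. We put the two drives in a software RAID 1 configuration, since that is what we want to run in production. We tested with fio, and the results are somewhat confusing.
The full results are below, but in summary: the two drives in a RAID 1 array reach about 550 MB/s of random writes (and this was one of the better runs), whereas a single drive (no RAID) reaches about 920 MB/s.
Is the overhead of software RAID really that high? Is there any further tuning we can do?
The system has 128 GB of RAM and runs CentOS 7.1 with the kernel upgraded to 4.2.4.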
fio --name=randwrite --ioengine=libaio --iodepth=64 --rw=randwrite \
--bs=64k --direct=1 --size=32G --numjobs=8 --runtime=240 \
--group_reporting
Results on a single drive, mounted directly, with an XFS filesystem:
randwrite: (groupid=0, jobs=8): err= 0: pid=9307: Tue Oct 27 14:36:35 2015
write: io=217971MB, bw=929843KB/s, iops=14528, runt=240043msec
slat (usec): min=5, max=933, avg=24.10, stdev= 9.29
clat (usec): min=32, max=135283, avg=35212.65, stdev=27746.71
lat (usec): min=49, max=135300, avg=35237.02, stdev=27746.76
clat percentiles (usec):
| 1.00th=[ 215], 5.00th=[ 2224], 10.00th=[ 5600], 20.00th=[12992],
| 30.00th=[16768], 40.00th=[19328], 50.00th=[23168], 60.00th=[33536],
| 70.00th=[47872], 80.00th=[63232], 90.00th=[79360], 95.00th=[88576],
| 99.00th=[102912], 99.50th=[107008], 99.90th=[116224], 99.95th=[119296],
| 99.99th=[125440]
bw (KB /s): min=42411, max=298624, per=12.51%, avg=116326.24, stdev=24050.53
lat (usec) : 50=0.01%, 100=0.27%, 250=0.87%, 500=0.77%, 750=0.55%
lat (usec) : 1000=0.47%
lat (msec) : 2=1.67%, 4=3.43%, 10=7.17%, 20=27.37%, 50=28.86%
lat (msec) : 100=26.99%, 250=1.55%
cpu : usr=1.75%, sys=4.98%, ctx=3056950, majf=0, minf=56673
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=0/w=3487535/d=0, short=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
WRITE: io=217971MB, aggrb=929842KB/s, minb=929842KB/s, maxb=929842KB/s, mint=240043msec, maxt=240043msec
Disk stats (read/write):
nvme2n1: ios=0/4691372, merge=0/0, ticks=0/154695600, in_queue=155446639, util=100.00%
Results with md RAID 1:
randwrite: (groupid=0, jobs=8): err= 0: pid=8553: Tue Oct 27 14:32:03 2015
write: io=130141MB, bw=555110KB/s, iops=8673, runt=240069msec
slat (usec): min=20, max=349051, avg=130.51, stdev=2000.03
clat (usec): min=59, max=912669, avg=58782.87, stdev=50750.42
lat (usec): min=95, max=927440, avg=58913.81, stdev=51010.14
clat percentiles (usec):
| 1.00th=[ 668], 5.00th=[ 3472], 10.00th=[ 8512], 20.00th=[21888],
| 30.00th=[32640], 40.00th=[41728], 50.00th=[48896], 60.00th=[58112],
| 70.00th=[71168], 80.00th=[86528], 90.00th=[114176], 95.00th=[142336],
| 99.00th=[216064], 99.50th=[250880], 99.90th=[577536], 99.95th=[716800],
| 99.99th=[872448]
bw (KB /s): min= 70, max=175104, per=12.56%, avg=69708.68, stdev=20589.85
lat (usec) : 100=0.02%, 250=0.29%, 500=0.43%, 750=0.38%, 1000=0.36%
lat (msec) : 2=1.22%, 4=2.98%, 10=5.56%, 20=7.47%, 50=32.45%
lat (msec) : 100=34.50%, 250=13.81%, 500=0.39%, 750=0.08%, 1000=0.05%
cpu : usr=1.28%, sys=6.46%, ctx=1727469, majf=0, minf=69488
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=0/w=2082262/d=0, short=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
WRITE: io=130141MB, aggrb=555110KB/s, minb=555110KB/s, maxb=555110KB/s, mint=240069msec, maxt=240069msec
Disk stats (read/write):
md0: ios=0/2615652, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=11136/2630386, aggrmerge=0/0, aggrticks=10763/72152582, aggrin_queue=72527830, aggrutil=99.40%
nvme0n1: ios=22273/2619265, merge=0/0, ticks=21526/14920779, in_queue=14979917, util=49.15%
nvme1n1: ios=0/2641508, merge=0/0, ticks=0/129384385, in_queue=130075743, util=99.40%
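To put a number on the gap, the two fio summaries above can be compared with a bit of arithmetic. This is only a sketch: the figures are copied from the `write:` lines of the two runs, and the helper names `pct_of` and `implied_bw` are made up for illustration.

```shell
#!/bin/sh
# Figures copied from the two fio "write:" summary lines (KB/s and IOPS).
SINGLE_BW=929843; SINGLE_IOPS=14528
RAID_BW=555110;   RAID_IOPS=8673
BS_KB=64          # block size in KB, from --bs=64k

# Percentage of a baseline (second argument) that a value (first argument) reaches.
pct_of() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", 100 * a / b }'; }

# Bandwidth implied by IOPS x block size, in KB/s.
implied_bw() { echo $(( $1 * $2 )); }

echo "RAID 1 reaches $(pct_of "$RAID_BW" "$SINGLE_BW")% of single-drive bandwidth"
echo "single: ${SINGLE_IOPS} IOPS x ${BS_KB}k = $(implied_bw "$SINGLE_IOPS" "$BS_KB") KB/s"
echo "raid1:  ${RAID_IOPS} IOPS x ${BS_KB}k = $(implied_bw "$RAID_IOPS" "$BS_KB") KB/s"
```

The RAID 1 run comes out at roughly 60% of the single-drive bandwidth, and in both runs IOPS times the 64k block size lands within rounding of the reported bandwidth, so the two summaries are internally consistent.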
mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Tue Oct 27 13:12:34 2015
Raid Level : raid1
Array Size : 781278208 (745.08 GiB 800.03 GB)
Used Dev Size : 781278208 (745.08 GiB 800.03 GB)
Raid Devices : 2
Total Devices : 2
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Tue Oct 27 14:54:24 2015
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Name : localhost.localdomain:0 (local to host localhost.localdomain)
UUID : cf2ce291:0c52f361:bc40dffa:918595d9
Events : 706
Number Major Minor RaidDevice State
0 259 3 0 active sync /dev/nvme0n1p1
1 259 1 1 active sync /dev/nvme1n1p1
Answer 1
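This is probably a side effect of the internal write-intent bitmap the array uses. Remove it with mdadm <dev> --grow --bitmap=none and then rerun fio.
Either way, I strongly advise against going into production with an array that has no bitmap enabled, because a crash or power loss will force the array through a full byte-by-byte scan/compare. A write-intent bitmap guarantees a much faster recovery.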
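The experiment suggested above could be scripted roughly as follows. This is a cautious sketch, not a tested procedure: it assumes the array is /dev/md0 as in the question, it is meant to be run from the filesystem mounted on the array, and the `EXEC_PREFIX` variable and the function names are invented here so that the script only prints the commands unless told otherwise.

```shell
#!/bin/sh
# Dry run by default: every step is printed instead of executed.
# To run for real, export EXEC_PREFIX="" and run as root.
MD_DEV="${MD_DEV:-/dev/md0}"          # array name taken from the question
EXEC_PREFIX="${EXEC_PREFIX-echo}"     # hypothetical switch, not an mdadm feature

# Drop the internal write-intent bitmap (for the benchmark only).
bitmap_off() { $EXEC_PREFIX mdadm --grow "$MD_DEV" --bitmap=none; }

# Put the internal bitmap back before the array goes into production.
bitmap_on()  { $EXEC_PREFIX mdadm --grow "$MD_DEV" --bitmap=internal; }

# Same fio job as in the question.
run_fio() {
    $EXEC_PREFIX fio --name=randwrite --ioengine=libaio --iodepth=64 --rw=randwrite \
        --bs=64k --direct=1 --size=32G --numjobs=8 --runtime=240 \
        --group_reporting
}

bitmap_off   # 1. remove the bitmap
run_fio      # 2. repeat the benchmark and compare against the earlier runs
bitmap_on    # 3. restore the bitmap
```

If the run without the bitmap gets close to the single-drive figure, the bitmap updates are the likely cost; restoring the bitmap afterwards keeps the faster recovery after a crash or power loss.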