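On a CentOS 7 server with 4 SSDs, I created two RAID 0 arrays with mdadm. Both are formatted with ext4 and mounted in different directories.
I benchmarked them with fio and got the following results for random writes: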
randomwrites: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.2.8
Starting 1 process
Jobs: 1 (f=1)
randomwrites: (groupid=0, jobs=1): err= 0: pid=21814: Sat May 21 15:47:19 2016
write: io=1024.0MB, bw=696266KB/s, iops=174066, runt= 1506msec
cpu : usr=9.04%, sys=89.37%, ctx=3803, majf=0, minf=27
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=0/w=262144/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
WRITE: io=1024.0MB, aggrb=696265KB/s, minb=696265KB/s, maxb=696265KB/s, mint=1506msec, maxt=1506msec
Disk stats (read/write):
dm-0: ios=0/243409, merge=0/0, ticks=0/81182, in_queue=81298, util=93.32%, aggrios=0/262144, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00%
md67: ios=0/262144, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=0/131068, aggrmerge=0/4, aggrticks=0/43552, aggrin_queue=43553, aggrutil=92.14%
sda: ios=0/130872, merge=0/2, ticks=0/43363, in_queue=43381, util=92.02%
sdb: ios=0/131264, merge=0/6, ticks=0/43741, in_queue=43725, util=92.14%
Then I created a RAID 1 array out of these two RAID 0 arrays and ran the test again.
randomwrites: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=libaio, iodepth=64
fio-2.2.8
Starting 1 process
Jobs: 1 (f=1): [w(1)] [-.-% done] [0KB/473.1MB/0KB /s] [0/121K/0 iops] [eta 00m:00s]
randomwrites: (groupid=0, jobs=1): err= 0: pid=22598: Sat May 21 16:00:55 2016
write: io=1024.0MB, bw=482770KB/s, iops=120692, runt= 2172msec
cpu : usr=8.66%, sys=61.40%, ctx=73489, majf=0, minf=28
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued : total=r=0/w=262144/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
WRITE: io=1024.0MB, aggrb=482769KB/s, minb=482769KB/s, maxb=482769KB/s, mint=2172msec, maxt=2172msec
Disk stats (read/write):
dm-0: ios=0/259433, merge=0/0, ticks=0/134614, in_queue=135499, util=95.95%, aggrios=0/262144, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00%
md69: ios=0/262144, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=0/262145, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00%
md67: ios=0/262145, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=0/131075, aggrmerge=0/0, aggrticks=0/66948, aggrin_queue=66958, aggrutil=94.64%
sda: ios=0/130878, merge=0/0, ticks=0/66753, in_queue=66744, util=94.64%
sdb: ios=0/131273, merge=0/0, ticks=0/67143, in_queue=67172, util=94.59%
md68: ios=0/262145, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=0/131075, aggrmerge=0/0, aggrticks=0/68108, aggrin_queue=68114, aggrutil=94.68%
sdc: ios=0/130878, merge=0/0, ticks=0/67942, in_queue=67928, util=94.68%
sdd: ios=0/131273, merge=0/0, ticks=0/68274, in_queue=68300, util=94.68%
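As you can see, the RAID 0 array achieves 174066 IOPS, while the RAID 1 built on top of the two RAID 0 arrays delivers only 120692 IOPS. What causes this drop in write performance?
The I/O scheduler is set to noop for all 4 SSDs.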
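For scale, the size of the drop can be worked out from the two fio summaries. The sketch below only redoes that arithmetic; the numbers are copied from the `write:` lines of the output above, and the variable names are made up for illustration.

```python
# IOPS and bandwidth copied from the "write:" lines of the two fio runs.
raid0_iops = 174066          # single RAID 0 array (md67)
mirror_iops = 120692         # RAID 1 (md69) over the two RAID 0 arrays
raid0_bw_kb = 696266         # KB/s
mirror_bw_kb = 482770        # KB/s

# Relative drop, as a percentage of the RAID 0 result.
iops_drop_pct = (raid0_iops - mirror_iops) / raid0_iops * 100
bw_drop_pct = (raid0_bw_kb - mirror_bw_kb) / raid0_bw_kb * 100

print(f"IOPS drop:      {iops_drop_pct:.1f}%")
print(f"bandwidth drop: {bw_drop_pct:.1f}%")
```

Both figures come out at a little over 30%, which is expected since every request in both runs is a 4K write.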
Answer 1
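Software RAID 1 has to duplicate every data block, so each block actually travels over the SB and SATA links twice. This means that, with high-performance drives (SSDs in your case), bus congestion can sometimes cause a noticeable drop in IOPS.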
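The duplication is visible in the disk stats of the two runs in the question. As a rough check, the sketch below sums the per-device write counts reported by fio and compares them with the 262144 writes fio issued; the counts are copied from the output, and the dictionary and variable names are only illustrative.

```python
# Per-device write counts from the "Disk stats" sections of the two fio runs.
raid0_run = {"sda": 130872, "sdb": 131264}
mirror_run = {"sda": 130878, "sdb": 131273, "sdc": 130878, "sdd": 131273}

issued = 262144  # 4K writes issued by fio in both runs ("issued" line)

raid0_total = sum(raid0_run.values())
mirror_total = sum(mirror_run.values())

# Physical writes per write issued by the benchmark.
raid0_ratio = raid0_total / issued
mirror_ratio = mirror_total / issued

print(f"RAID 0 only:      {raid0_total} device writes, ratio {raid0_ratio:.2f}")
print(f"RAID 1 over RAID0: {mirror_total} device writes, ratio {mirror_ratio:.2f}")
```

With the mirror in place the drives receive about twice as many writes for the same workload, so the same links carry roughly double the traffic.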
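Try increasing the I/O queue length and/or switching to the deadline scheduler, since both changes increase the chance of I/O merging and thus reduce bus congestion.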
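A minimal sketch of how both changes could be applied through sysfs, assuming that "I/O queue length" refers to the block layer's `nr_requests` setting and that the drives are `sda` through `sdd` as in the fio output. The value 256 is an arbitrary example, not a recommendation, and the function names are hypothetical helpers.

```python
import os


def plan_queue_tuning(devices, scheduler="deadline", nr_requests=256,
                      sysfs_root="/sys/block"):
    """Return (path, value) pairs for the sysfs writes, without touching anything."""
    plan = []
    for dev in devices:
        queue_dir = os.path.join(sysfs_root, dev, "queue")
        # Selects the I/O scheduler for this block device.
        plan.append((os.path.join(queue_dir, "scheduler"), scheduler))
        # Assumed meaning of "queue length": requests allowed in the queue.
        plan.append((os.path.join(queue_dir, "nr_requests"), str(nr_requests)))
    return plan


def apply_plan(plan):
    """Write each value to its sysfs file; needs root on a real system."""
    for path, value in plan:
        with open(path, "w") as f:
            f.write(value)


if __name__ == "__main__":
    # Print the equivalent shell commands instead of changing anything.
    for path, value in plan_queue_tuning(["sda", "sdb", "sdc", "sdd"]):
        print(f"echo {value} > {path}")
```

Settings written this way do not survive a reboot, so it is worth rerunning the same fio job after each change and only making permanent whatever actually helps.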