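On my host hardware I get 1G of speed.
On the virtual machine I created with KVM, it drops to about 20MB.
My host runs Ubuntu 22.04 LTS.
How can I optimize this?
I am using file-based VMs. I created disks of both raw and qcow2 type; the only difference I noticed was in how the disk file is created when the type is specified.
I tried setting nocache on the disk through virt-manager.
I also tried the cache modes none and writeback.
Neither had any effect on speed.
Here are some further tests I ran:
Single 4 KiB random write process (the worst-case test):
Host hardware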
$ fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k --size=4g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
random-write: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
fio-3.28
Starting 1 process
random-write: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [w(1)][100.0%][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=114493: Tue Jan 24 12:42:44 2023
write: IOPS=10.1k, BW=39.5MiB/s (41.5MB/s)(4096MiB/103604msec); 0 zone resets
slat (nsec): min=1920, max=587633, avg=3761.73, stdev=3026.96
clat (usec): min=11, max=2551.6k, avg=26.49, stdev=2593.73
lat (usec): min=13, max=2551.7k, avg=30.25, stdev=2593.74
clat percentiles (usec):
| 1.00th=[ 20], 5.00th=[ 22], 10.00th=[ 22], 20.00th=[ 22],
| 30.00th=[ 22], 40.00th=[ 23], 50.00th=[ 23], 60.00th=[ 23],
| 70.00th=[ 23], 80.00th=[ 24], 90.00th=[ 25], 95.00th=[ 26],
| 99.00th=[ 32], 99.50th=[ 34], 99.90th=[ 44], 99.95th=[ 165],
| 99.99th=[ 545]
bw ( KiB/s): min=24864, max=152592, per=100.00%, avg=135295.44, stdev=25421.57, samples=62
iops : min= 6216, max=38148, avg=33823.85, stdev=6355.39, samples=62
lat (usec) : 20=1.13%, 50=98.80%, 100=0.01%, 250=0.05%, 500=0.01%
lat (usec) : 750=0.02%
lat (msec) : 2=0.01%, 500=0.01%, 750=0.01%, >=2000=0.01%
cpu : usr=5.71%, sys=7.64%, ctx=1063940, majf=0, minf=366
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,1048577,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=39.5MiB/s (41.5MB/s), 39.5MiB/s-39.5MiB/s (41.5MB/s-41.5MB/s), io=4096MiB (4295MB), run=103604-103604msec
Disk stats (read/write):
dm-0: ios=0/240696, merge=0/0, ticks=0/16578288, in_queue=16578288, util=85.10%, aggrios=0/242596, aggrmerge=0/3006, aggrticks=0/20300771, aggrin_queue=20300770, aggrutil=89.20%
sda: ios=0/242596, merge=0/3006, ticks=0/20300771, in_queue=20300770, util=89.20%
$ fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k --size=4g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
random-write: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
fio-3.28
Starting 1 process
Jobs: 1 (f=1): [w(1)][100.0%][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=114600: Tue Jan 24 12:45:29 2023
write: IOPS=11.2k, BW=43.7MiB/s (45.8MB/s)(4096MiB/93810msec); 0 zone resets
slat (nsec): min=1800, max=637861, avg=3705.65, stdev=2443.65
clat (usec): min=10, max=582234, avg=22.74, stdev=706.46
lat (usec): min=12, max=582238, avg=26.45, stdev=706.47
clat percentiles (usec):
| 1.00th=[ 17], 5.00th=[ 20], 10.00th=[ 21], 20.00th=[ 21],
| 30.00th=[ 21], 40.00th=[ 21], 50.00th=[ 22], 60.00th=[ 22],
| 70.00th=[ 22], 80.00th=[ 22], 90.00th=[ 24], 95.00th=[ 25],
| 99.00th=[ 31], 99.50th=[ 33], 99.90th=[ 44], 99.95th=[ 151],
| 99.99th=[ 537]
bw ( KiB/s): min=44784, max=185360, per=100.00%, avg=147168.42, stdev=18660.88, samples=57
iops : min=11196, max=46340, avg=36792.07, stdev=4665.22, samples=57
lat (usec) : 20=6.13%, 50=93.79%, 100=0.01%, 250=0.05%, 500=0.01%
lat (usec) : 750=0.02%
lat (msec) : 2=0.01%, 250=0.01%, 500=0.01%, 750=0.01%
cpu : usr=6.33%, sys=7.47%, ctx=1079749, majf=0, minf=327
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,1048577,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=43.7MiB/s (45.8MB/s), 43.7MiB/s-43.7MiB/s (45.8MB/s-45.8MB/s), io=4096MiB (4295MB), run=93810-93810msec
Disk stats (read/write):
dm-0: ios=0/257987, merge=0/0, ticks=0/14471372, in_queue=14471372, util=80.94%, aggrios=0/259380, aggrmerge=0/3269, aggrticks=0/20576252, aggrin_queue=20576252, aggrutil=88.06%
sda: ios=0/259380, merge=0/3269, ticks=0/20576252, in_queue=20576252, util=88.06%
$ fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k --size=4g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
random-write: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
fio-3.28
Starting 1 process
Jobs: 1 (f=1): [w(1)][100.0%][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=114700: Tue Jan 24 12:48:03 2023
write: IOPS=10.5k, BW=41.0MiB/s (43.0MB/s)(4096MiB/99783msec); 0 zone resets
slat (nsec): min=1931, max=543062, avg=3706.35, stdev=3369.72
clat (usec): min=11, max=659263, avg=22.63, stdev=643.97
lat (usec): min=14, max=659267, avg=26.33, stdev=643.98
clat percentiles (usec):
| 1.00th=[ 19], 5.00th=[ 21], 10.00th=[ 21], 20.00th=[ 21],
| 30.00th=[ 22], 40.00th=[ 22], 50.00th=[ 22], 60.00th=[ 22],
| 70.00th=[ 22], 80.00th=[ 23], 90.00th=[ 24], 95.00th=[ 25],
| 99.00th=[ 29], 99.50th=[ 33], 99.90th=[ 43], 99.95th=[ 139],
| 99.99th=[ 537]
bw ( KiB/s): min= 5648, max=166179, per=100.00%, avg=144625.43, stdev=22760.25, samples=58
iops : min= 1412, max=41544, avg=36156.28, stdev=5690.11, samples=58
lat (usec) : 20=3.87%, 50=96.05%, 100=0.01%, 250=0.05%, 500=0.01%
lat (usec) : 750=0.02%, 1000=0.01%
lat (msec) : 20=0.01%, 750=0.01%
cpu : usr=5.86%, sys=7.61%, ctx=1080511, majf=0, minf=359
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,1048577,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=41.0MiB/s (43.0MB/s), 41.0MiB/s-41.0MiB/s (43.0MB/s-43.0MB/s), io=4096MiB (4295MB), run=99783-99783msec
Disk stats (read/write):
dm-0: ios=0/245070, merge=0/0, ticks=0/17235960, in_queue=17235960, util=83.79%, aggrios=0/246419, aggrmerge=0/3660, aggrticks=0/22057670, aggrin_queue=22057670, aggrutil=88.55%
sda: ios=0/246419, merge=0/3660, ticks=0/22057670, in_queue=22057670, util=88.55%
This test was run on the KVM virtual machine that hosts OpenStack (controller 2). OpenStack contains 3 bare VMs, and no applications are running on KVM.
$ fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k --size=4g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
random-write: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
fio-3.28
Starting 1 process
random-write: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [F(1)][100.0%][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=451129: Tue Jan 24 13:04:09 2023
write: IOPS=250, BW=1001KiB/s (1026kB/s)(826MiB/844616msec); 0 zone resets
slat (nsec): min=604, max=487941, avg=3069.50, stdev=3227.61
clat (usec): min=2, max=116745k, avg=576.78, stdev=253872.83
lat (usec): min=9, max=116745k, avg=579.85, stdev=253872.85
clat percentiles (usec):
| 1.00th=[ 11], 5.00th=[ 13], 10.00th=[ 14], 20.00th=[ 15],
| 30.00th=[ 15], 40.00th=[ 19], 50.00th=[ 22], 60.00th=[ 24],
| 70.00th=[ 26], 80.00th=[ 31], 90.00th=[ 40], 95.00th=[ 49],
| 99.00th=[ 76], 99.50th=[ 91], 99.90th=[ 359], 99.95th=[ 685],
| 99.99th=[ 873]
bw ( KiB/s): min=13680, max=195824, per=100.00%, avg=130092.46, stdev=52846.56, samples=13
iops : min= 3420, max=48956, avg=32523.08, stdev=13211.60, samples=13
lat (usec) : 4=0.01%, 10=0.96%, 20=46.60%, 50=48.11%, 100=3.99%
lat (usec) : 250=0.23%, 500=0.03%, 750=0.06%, 1000=0.02%
lat (msec) : 2=0.01%, 4=0.01%, 10=0.01%, >=2000=0.01%
cpu : usr=0.10%, sys=0.13%, ctx=264372, majf=0, minf=29
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,211466,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=1001KiB/s (1026kB/s), 1001KiB/s-1001KiB/s (1026kB/s-1026kB/s), io=826MiB (866MB), run=844616-844616msec
Disk stats (read/write):
dm-0: ios=232/163901, merge=0/0, ticks=144/7660152, in_queue=7660296, util=17.91%, aggrios=221/160213, aggrmerge=11/3722, aggrticks=159/1113901, aggrin_queue=1983749, aggrutil=43.00%
vda: ios=221/160213, merge=11/3722, ticks=159/1113901, in_queue=1983749, util=43.00%
$ fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k --size=4g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
random-write: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
fio-3.28
Starting 1 process
Jobs: 1 (f=1): [F(1)][100.0%][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=452551: Tue Jan 24 13:25:06 2023
write: IOPS=286, BW=1145KiB/s (1172kB/s)(973MiB/869962msec); 0 zone resets
slat (nsec): min=1014, max=520262, avg=3532.80, stdev=4003.56
clat (nsec): min=910, max=57218M, avg=259432.63, stdev=114674189.43
lat (usec): min=13, max=57218k, avg=262.97, stdev=114674.22
clat percentiles (usec):
| 1.00th=[ 14], 5.00th=[ 16], 10.00th=[ 18], 20.00th=[ 19],
| 30.00th=[ 21], 40.00th=[ 22], 50.00th=[ 23], 60.00th=[ 24],
| 70.00th=[ 27], 80.00th=[ 29], 90.00th=[ 34], 95.00th=[ 42],
| 99.00th=[ 70], 99.50th=[ 77], 99.90th=[ 172], 99.95th=[ 502],
| 99.99th=[22676]
bw ( KiB/s): min= 5336, max=161784, per=100.00%, avg=110630.83, stdev=54549.81, samples=18
iops : min= 1334, max=40446, avg=27657.67, stdev=13637.43, samples=18
lat (nsec) : 1000=0.01%
lat (usec) : 2=0.01%, 4=0.01%, 20=28.78%, 50=68.68%, 100=2.30%
lat (usec) : 250=0.17%, 500=0.02%, 750=0.02%, 1000=0.01%
lat (msec) : 2=0.01%, 4=0.01%, 20=0.01%, 50=0.01%, >=2000=0.01%
cpu : usr=0.13%, sys=0.17%, ctx=260439, majf=0, minf=30
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,248968,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=1145KiB/s (1172kB/s), 1145KiB/s-1145KiB/s (1172kB/s-1172kB/s), io=973MiB (1020MB), run=869962-869962msec
Disk stats (read/write):
dm-0: ios=124/189939, merge=0/0, ticks=64/6847936, in_queue=6848000, util=72.81%, aggrios=79/179513, aggrmerge=45/10455, aggrticks=26/1126630, aggrin_queue=2028077, aggrutil=90.71%
vda: ios=79/179513, merge=45/10455, ticks=26/1126630, in_queue=2028077, util=90.71%
As you can see, throughput drops from about 43 MB/s on the host to about 1 MB/s in the VM. This is a big problem.
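To put a number on the drop, the bandwidth figures can be pulled from the fio summary lines. This is a minimal sketch that assumes the `WRITE: bw=...` format shown in the output above; `bw_kib` is a hypothetical helper name, not part of fio.

```shell
# Convert the "bw=" figure of a fio summary line (read from stdin) to KiB/s.
bw_kib() {
  awk '{
    if (match($0, /bw=[0-9.]+[KMG]iB/)) {
      s = substr($0, RSTART + 3, RLENGTH - 3)   # e.g. "43.7MiB"
      val = s + 0                                # awk keeps the numeric prefix
      if (s ~ /GiB/) val *= 1024 * 1024
      else if (s ~ /MiB/) val *= 1024
      printf "%.0f\n", val
    }
  }'
}

# Summary figures copied from the second host run and the first KVM guest run.
host=$(echo "WRITE: bw=43.7MiB/s (45.8MB/s)" | bw_kib)
guest=$(echo "WRITE: bw=1001KiB/s (1026kB/s)" | bw_kib)
awk -v h="$host" -v g="$guest" 'BEGIN { printf "slowdown: %.1fx\n", h / g }'
# prints "slowdown: 44.7x"
```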
This test was run on the OpenStack VM controller 2, but with ESXi as the virtualization software.
$ fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k --size=4g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
random-write: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
fio-3.28
Starting 1 process
random-write: Laying out IO file (1 file / 4096MiB)
Jobs: 1 (f=1): [F(1)][100.0%][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=530128: Tue Jan 24 13:18:47 2023
write: IOPS=3149, BW=12.3MiB/s (12.9MB/s)(1722MiB/139918msec); 0 zone resets
slat (nsec): min=1385, max=749909, avg=11219.59, stdev=9674.52
clat (nsec): min=610, max=149012k, avg=122940.18, stdev=866525.51
lat (usec): min=35, max=149020, avg=134.16, stdev=866.28
clat percentiles (usec):
| 1.00th=[ 35], 5.00th=[ 35], 10.00th=[ 46], 20.00th=[ 51],
| 30.00th=[ 60], 40.00th=[ 63], 50.00th=[ 64], 60.00th=[ 68],
| 70.00th=[ 70], 80.00th=[ 72], 90.00th=[ 79], 95.00th=[ 89],
| 99.00th=[ 221], 99.50th=[ 1467], 99.90th=[13829], 99.95th=[16188],
| 99.99th=[19530]
bw ( KiB/s): min= 9672, max=99544, per=100.00%, avg=29553.08, stdev=21110.49, samples=119
iops : min= 2418, max=24886, avg=7388.23, stdev=5277.64, samples=119
lat (nsec) : 750=0.01%, 1000=0.01%
lat (usec) : 2=0.01%, 4=0.01%, 10=0.01%, 50=17.22%, 100=79.51%
lat (usec) : 250=2.37%, 500=0.14%, 750=0.08%, 1000=0.06%
lat (msec) : 2=0.12%, 4=0.01%, 10=0.30%, 20=0.18%, 50=0.01%
lat (msec) : 100=0.01%, 250=0.01%
cpu : usr=3.14%, sys=6.60%, ctx=564104, majf=0, minf=30
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,440722,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=12.3MiB/s (12.9MB/s), 12.3MiB/s-12.3MiB/s (12.9MB/s-12.9MB/s), io=1722MiB (1805MB), run=139918-139918msec
Disk stats (read/write):
dm-0: ios=0/240336, merge=0/0, ticks=0/3124100, in_queue=3124100, util=91.31%, aggrios=0/235436, aggrmerge=0/5071, aggrticks=0/2887407, aggrin_queue=2887407, aggrutil=92.02%
sda: ios=0/235436, merge=0/5071, ticks=0/2887407, in_queue=2887407, util=92.02%
$ fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k --size=4g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
random-write: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
fio-3.28
Starting 1 process
Jobs: 1 (f=1): [F(1)][100.0%][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=530294: Tue Jan 24 13:21:08 2023
write: IOPS=6080, BW=23.8MiB/s (24.9MB/s)(2393MiB/100740msec); 0 zone resets
slat (nsec): min=1367, max=1029.8k, avg=11761.38, stdev=10525.79
clat (nsec): min=915, max=62359k, avg=82333.89, stdev=390799.49
lat (usec): min=35, max=62382, avg=94.10, stdev=391.00
clat percentiles (usec):
| 1.00th=[ 36], 5.00th=[ 37], 10.00th=[ 47], 20.00th=[ 59],
| 30.00th=[ 65], 40.00th=[ 67], 50.00th=[ 69], 60.00th=[ 71],
| 70.00th=[ 72], 80.00th=[ 74], 90.00th=[ 82], 95.00th=[ 98],
| 99.00th=[ 192], 99.50th=[ 253], 99.90th=[ 8356], 99.95th=[ 9372],
| 99.99th=[16057]
bw ( KiB/s): min=23136, max=95208, per=100.00%, avg=41702.67, stdev=13481.11, samples=117
iops : min= 5784, max=23802, avg=10425.62, stdev=3370.29, samples=117
lat (nsec) : 1000=0.01%
lat (usec) : 2=0.01%, 4=0.01%, 50=13.24%, 100=82.03%, 250=4.21%
lat (usec) : 500=0.22%, 750=0.10%, 1000=0.02%
lat (msec) : 2=0.06%, 4=0.01%, 10=0.06%, 20=0.05%, 50=0.01%
lat (msec) : 100=0.01%
cpu : usr=6.24%, sys=13.79%, ctx=755651, majf=0, minf=29
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,612557,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=23.8MiB/s (24.9MB/s), 23.8MiB/s-23.8MiB/s (24.9MB/s-24.9MB/s), io=2393MiB (2509MB), run=100740-100740msec
Disk stats (read/write):
dm-0: ios=0/353311, merge=0/0, ticks=0/2510080, in_queue=2510080, util=93.10%, aggrios=0/325545, aggrmerge=0/28769, aggrticks=0/2168746, aggrin_queue=2168746, aggrutil=93.35%
sda: ios=0/325545, merge=0/28769, ticks=0/2168746, in_queue=2168746, util=93.35%
$ fio --name=random-write --ioengine=posixaio --rw=randwrite --bs=4k --size=4g --numjobs=1 --iodepth=1 --runtime=60 --time_based --end_fsync=1
random-write: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=posixaio, iodepth=1
fio-3.28
Starting 1 process
Jobs: 1 (f=1): [F(1)][100.0%][eta 00m:00s]
random-write: (groupid=0, jobs=1): err= 0: pid=530405: Tue Jan 24 13:23:08 2023
write: IOPS=5930, BW=23.2MiB/s (24.3MB/s)(2308MiB/99631msec); 0 zone resets
slat (nsec): min=1378, max=1395.4k, avg=12724.69, stdev=10859.25
clat (nsec): min=797, max=22413k, avg=83620.52, stdev=356081.74
lat (usec): min=35, max=22415, avg=96.35, stdev=356.19
clat percentiles (usec):
| 1.00th=[ 36], 5.00th=[ 48], 10.00th=[ 57], 20.00th=[ 65],
| 30.00th=[ 69], 40.00th=[ 71], 50.00th=[ 71], 60.00th=[ 72],
| 70.00th=[ 73], 80.00th=[ 76], 90.00th=[ 81], 95.00th=[ 93],
| 99.00th=[ 184], 99.50th=[ 219], 99.90th=[ 8291], 99.95th=[10290],
| 99.99th=[14091]
bw ( KiB/s): min=26568, max=100256, per=100.00%, avg=40559.51, stdev=9507.31, samples=116
iops : min= 6642, max=25064, avg=10139.87, stdev=2376.85, samples=116
lat (nsec) : 1000=0.01%
lat (usec) : 2=0.01%, 4=0.01%, 50=6.37%, 100=89.89%, 250=3.36%
lat (usec) : 500=0.15%, 750=0.09%, 1000=0.02%
lat (msec) : 2=0.01%, 4=0.01%, 10=0.04%, 20=0.06%, 50=0.01%
cpu : usr=6.64%, sys=14.57%, ctx=711625, majf=0, minf=28
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,590890,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
WRITE: bw=23.2MiB/s (24.3MB/s), 23.2MiB/s-23.2MiB/s (24.3MB/s-24.3MB/s), io=2308MiB (2420MB), run=99631-99631msec
Disk stats (read/write):
dm-0: ios=0/302542, merge=0/0, ticks=0/2060836, in_queue=2060836, util=83.71%, aggrios=0/302903, aggrmerge=0/388, aggrticks=0/1961686, aggrin_queue=1961686, aggrutil=83.91%
sda: ios=0/302903, merge=0/388, ticks=0/1961686, in_queue=1961686, util=83.91%
I have Samsung SSD 870 QVO 2TB drives, 4TB in total, running in RAID 0.
Here is my KVM XML:
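Answer 1
I believe the Samsung 870 is a consumer-grade drive. Its performance will degrade, and related failures are quite likely, especially in a multi-node cluster that is most likely running Ceph.
The 2 TB variant of the following model (shown here in its 7.6 TB version) would be a better choice: SAMSUNG MZ7LH7T6HMLA-00005
Pay particular attention to sector alignment. Most operating systems create partitions on 1 MiB boundaries and start the first partition at sector 2048 (assuming an emulated 512-byte sector size).
In the example below, I switch the display unit to sectors. The printout also shows that the emulated (logical) sector size is 512 bytes, while the drive stores data in 4 KiB pages (the physical sector size). Parted also has a built-in command to check the sector alignment of a partition: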
[root@kvm1a ~]# parted /dev/sda
GNU Parted 3.4
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) unit s
(parted) p
Model: ATA SAMSUNG MZ7LH7T6 (scsi)
Disk /dev/sda: 15002931888s
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 2048s 4095s 2048s bbp bios_grub
2 4096s 62918655s 62914560s non-fs raid
3 62918656s 65015807s 2097152s non-fs raid
4 65015808s 65220607s 204800s xfs ceph data
5 65220608s 15002929151s 14937708544s ceph block
(parted) align-check optimal 1
1 aligned
(parted) align-check optimal 2
2 aligned
(parted) align-check optimal 3
3 aligned
(parted) align-check optimal 4
4 aligned
(parted) align-check optimal 5
5 aligned
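Partitions should start at sector 2048, which gives a clean 1 MiB starting boundary: 2048 x 512 bytes (sector size) = 1,048,576 bytes (1 MiB). Many people consider this a waste of space and try to create the first partition starting at sector 1. That causes problems, because the first addressable sector is actually 0, not 1, and sector 0 is reserved for the MBR/GPT partition table and boot jump code. A partition that starts at sector 1 therefore begins 512 bytes into a 4 KiB physical sector, so every 4 KiB block written inside it straddles two physical sectors.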
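The arithmetic can be checked for a few candidate start sectors. This is a minimal sketch, assuming 512-byte logical sectors and 4 KiB physical sectors as in the parted output; `describe_start` is a hypothetical helper name.

```shell
# Print the byte offset of a start sector and whether that offset falls on
# a 4 KiB (physical sector) boundary and on a 1 MiB boundary.
# Assumes 512-byte logical sectors.
describe_start() {
  sector=$1
  offset=$((sector * 512))
  page=no
  mib=no
  if [ $((offset % 4096)) -eq 0 ]; then page=yes; fi
  if [ $((offset % 1048576)) -eq 0 ]; then mib=yes; fi
  echo "sector $sector -> byte $offset, 4KiB-aligned=$page, 1MiB-aligned=$mib"
}

for s in 1 63 2048; do
  describe_start "$s"
done
# sector 1 -> byte 512, 4KiB-aligned=no, 1MiB-aligned=no
# sector 63 -> byte 32256, 4KiB-aligned=no, 1MiB-aligned=no
# sector 2048 -> byte 1048576, 4KiB-aligned=yes, 1MiB-aligned=yes
```

Sector 63 is included because it was the traditional first-partition start on older MBR layouts.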
In case anyone finds it useful, here is a script that verifies the starting sector of the partitions on all Ceph RBD images mapped on a compute node:
rbd showmapped | grep /dev/rbd | awk '{print $3" "$5}' | while read -r disk dev; do
  parted --script "$dev" 'unit s p' | grep -P '^\s+\d' | while read -r partition start info; do
    num=${start::-1};  # strip the trailing "s" unit suffix
    if [ "$num" != $((num/2048*2048)) ]; then
      [ "$(echo "$info" | grep -c 'Microsoft reserved partition')" -lt 1 ] && \
      [ "$(grep -Pc "\s131072\s+${dev#/dev/}$" /proc/partitions)" -lt 1 ] && \
      echo "$disk mounted as $dev has problem with partition $partition";
    fi
  done
done
# 2048 comes from 1024*1024/512 = 2048
# excludes spacer partitions created by Windows
# excludes MikroTik CHR disks of 128 MiB
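For local disks, the same check can be sketched without parted by reading the partition start sectors that Linux exposes in sysfs, where `start` is always given in 512-byte units. This is a minimal sketch under that assumption, not a replacement for the script above; `check_alignment` is a hypothetical helper name.

```shell
# Report partitions whose start sector is not a multiple of 2048 (1 MiB).
# $1: a sysfs disk directory such as /sys/block/sda
#     (assumed layout: <disk>/<partition>/start, in 512-byte sectors)
check_alignment() {
  disk_dir=$1
  for start_file in "$disk_dir"/*/start; do
    [ -e "$start_file" ] || continue          # disk has no partitions
    part=$(basename "$(dirname "$start_file")")
    start=$(cat "$start_file")
    if [ $((start % 2048)) -ne 0 ]; then
      echo "$part starts at sector $start: not on a 1 MiB boundary"
    fi
  done
}
```

Calling `check_alignment /sys/block/sda` prints nothing when every partition on that disk is aligned.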
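Answer 2
With KVM, we have found that hosts running multiple VMs perform best when qemu is configured to use vioscsi with writeback caching.
Some benchmarks show higher read performance with caching disabled, because the system then does not copy the data being read into the cache. Even so, caching greatly benefits reads overall and takes pressure off Ceph/iSCSI storage.
PS: Writeback mode is flush-aware, so it behaves like any well-behaved hardware RAID controller and is transactionally safe.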
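As an illustration of that setup, the fragment below is roughly what a libvirt domain would carry for a file-backed disk on a virtio-scsi controller with writeback caching. It is a sketch under assumptions: the image path and the `sda` target name are placeholders, and `disk_xml` is a hypothetical helper that only prints the XML.

```shell
# Print a libvirt <controller> and <disk> element for a file-backed disk
# attached through virtio-scsi with cache='writeback'.
# $1: image path (placeholder), $2: image format (raw or qcow2)
disk_xml() {
  cat <<EOF
<controller type='scsi' index='0' model='virtio-scsi'/>
<disk type='file' device='disk'>
  <driver name='qemu' type='$2' cache='writeback'/>
  <source file='$1'/>
  <target dev='sda' bus='scsi'/>
</disk>
EOF
}

# Example with a placeholder image path.
disk_xml /var/lib/libvirt/images/controller2.qcow2 qcow2
```

Both elements belong inside the `<devices>` section of the domain definition, for example when editing it with `virsh edit`.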