NVMe vs SSD: Read/Write Performance Issues

We rent some servers from Hetzner.de. Some of them come with NVMe drives, while others come with SSDs. We benchmarked the read/write performance of 4 servers using the following tools:

fio
dd
hdparm

The operating system is CentOS 7, and each server has two drives in a software RAID 1. All servers are located in Hetzner data centers.
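
For completeness, here is how the RAID layout can be confirmed on each box (standard mdadm tooling on CentOS 7; replace md2 with whatever array /proc/mdstat reports, e.g. md127 on one of the NVMe machines):

# cat /proc/mdstat
# mdadm --detail /dev/md2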

Drive models:

SSD:
    Model Family:     Samsung based SSDs
    Device Model:     SAMSUNG MZ7LM240HCGR-00003
NVMe:
    Model Number:                       THNSN5512GPU7 TOSHIBA
    Serial Number:                      Z62S101OTUHV
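
For reference, smartctl from the smartmontools package prints these identifiers (the device paths below are examples and differ per server):

# smartctl -i /dev/sda
# smartctl -i /dev/nvme0n1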

The benchmark results:

Server1(NVMe):

Base Board Information
        Manufacturer: FUJITSU
        Product Name: D3417-B1
        Version: S26361-D3417-B1

# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75

Run status group 0 (all jobs):
   READ: bw=45.9MiB/s (48.1MB/s), 45.9MiB/s-45.9MiB/s (48.1MB/s-48.1MB/s), io=3070MiB (3219MB), run=66884-66884msec
  WRITE: bw=15.3MiB/s (16.1MB/s), 15.3MiB/s-15.3MiB/s (16.1MB/s-16.1MB/s), io=1026MiB (1076MB), run=66884-66884msec

Disk stats (read/write):
    md127: ios=785293/276106, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=393078/273876, aggrmerge=3/9643, aggrticks=330689/2134457, aggrin_queue=2467357, aggrutil=63.84%
  nvme0n1: ios=410663/273879, merge=7/9640, ticks=257384/2054071, in_queue=2311731, util=55.06%
  nvme1n1: ios=375494/273874, merge=0/9647, ticks=403994/2214844, in_queue=2622983, util=63.84%

# dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=dsync
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 14.2603 s, 75.3 MB/s

# hdparm -Tt /dev/nvme0n1

/dev/nvme0n1:
 Timing cached reads:   29320 MB in  1.98 seconds = 14818.11 MB/sec
 Timing buffered disk reads: 2660 MB in  3.00 seconds = 886.22 MB/sec
------------------------------------------------------------
------------------------------------------------------------
Server2(NVMe):

Base Board Information
        Manufacturer: FUJITSU
        Product Name: D3417-B1
        Version: S26361-D3417-B1

# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75

Run status group 0 (all jobs):
   READ: io=3072.2MB, aggrb=40296KB/s, minb=40296KB/s, maxb=40296KB/s, mint=78069msec, maxt=78069msec
  WRITE: io=1023.9MB, aggrb=13429KB/s, minb=13429KB/s, maxb=13429KB/s, mint=78069msec, maxt=78069msec

Disk stats (read/write):
    md1: ios=786339/298554, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=393673/300844, aggrmerge=0/0, aggrticks=543418/2294840, aggrin_queue=2838462, aggrutil=65.25%
  nvme0n1: ios=180052/300844, merge=0/0, ticks=480768/1879827, in_queue=2360788, util=56.22%
  nvme1n1: ios=607294/300844, merge=0/0, ticks=606068/2709853, in_queue=3316136, util=65.25%

# dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=dsync
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 33.2734 s, 32.3 MB/s

# hdparm -Tt /dev/nvme0n1

/dev/nvme0n1:
 Timing cached reads:   33788 MB in  1.99 seconds = 16977.90 MB/sec
 Timing buffered disk reads: 2204 MB in  3.00 seconds = 734.34 MB/sec

------------------------------------------------------------
------------------------------------------------------------
Server3(SSD):
Base Board Information
        Manufacturer: ASUSTeK COMPUTER INC.
        Product Name: Z10PA-U8 Series
        Version: Rev 1.xx

# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75

Run status group 0 (all jobs):
   READ: bw=262MiB/s (275MB/s), 262MiB/s-262MiB/s (275MB/s-275MB/s), io=3070MiB (3219MB), run=11718-11718msec
  WRITE: bw=87.6MiB/s (91.8MB/s), 87.6MiB/s-87.6MiB/s (91.8MB/s-91.8MB/s), io=1026MiB (1076MB), run=11718-11718msec

Disk stats (read/write):
    md2: ios=769518/258504, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=392958/263227, aggrmerge=9/864, aggrticks=219931/33550, aggrin_queue=253441, aggrutil=99.06%
  sda: ios=402306/263220, merge=12/871, ticks=222960/35975, in_queue=258904, util=99.04%
  sdb: ios=383611/263234, merge=7/857, ticks=216902/31125, in_queue=247978, util=99.06%

# dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=dsync
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 5.19855 s, 207 MB/s

# hdparm -Tt /dev/sda

/dev/sda:
 Timing cached reads:   22452 MB in  1.99 seconds = 11258.90 MB/sec
 Timing buffered disk reads: 1546 MB in  3.00 seconds = 514.90 MB/sec

------------------------------------------------------------
------------------------------------------------------------
Server4(SSD):
Base Board Information
        Manufacturer: FUJITSU
        Product Name: D3401-H2
        Version: S26361-D3401-H2

# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75

Run status group 0 (all jobs):
   READ: io=3073.6MB, aggrb=61065KB/s, minb=61065KB/s, maxb=61065KB/s, mint=51539msec, maxt=51539msec
  WRITE: io=1022.5MB, aggrb=20315KB/s, minb=20315KB/s, maxb=20315KB/s, mint=51539msec, maxt=51539msec

Disk stats (read/write):
    md2: ios=784514/278548, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=392439/266239, aggrmerge=1246/13570, aggrticks=822829/716748, aggrin_queue=1539532, aggrutil=91.82%
  sda: ios=421561/266337, merge=1030/13473, ticks=867321/639461, in_queue=1506738, util=91.82%
  sdb: ios=363317/266142, merge=1463/13667, ticks=778338/794035, in_queue=1572326, util=91.76%

# dd if=/dev/zero of=/root/testfile bs=1G count=1 oflag=dsync
1073741824 bytes (1.1 GB) copied, 10.6605 s, 101 MB/s

# hdparm -Tt /dev/sda

/dev/sda:
 Timing cached reads:   33686 MB in  1.98 seconds = 16985.97 MB/sec
 Timing buffered disk reads: 1304 MB in  3.00 seconds = 434.34 MB/sec

Looking at the results, the fio command shows disappointing read/write numbers for server1 (NVMe), though it shows better results for server2 (NVMe) relative to the SSDs. The dd command shows disappointing read/write results for both NVMe servers compared to the SSDs. The hdparm command also shows nearly identical read/write performance across all servers.
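
One check we still plan to run: with a single job at iodepth 64, 4k random I/O is largely latency-bound, and NVMe drives usually need more parallelism before they pull ahead. A sketch based on our original command (--numjobs and --group_reporting are standard fio options; 4 jobs is an arbitrary choice):

# fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=random_read_write.fio --bs=4k --iodepth=64 --size=4G --readwrite=randrw --rwmixread=75 --numjobs=4 --group_reporting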

All tests were performed during off-peak hours, with the servers at a load average of 0.0.

Another strange problem we face on the NVMe servers is high I/O load when restoring account backups or even extracting zip files. For example, if I extract a 150 MB zip file, the server load average rises above 20 until it is fully extracted, caused entirely by I/O wait (as reported by top).
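
To illustrate, this is how we watch it while reproducing the problem (iostat comes from the sysstat package; the 1-second interval and device names are just our setup). The await and %util columns show whether one md member or both are saturating:

# iostat -x 1 nvme0n1 nvme1n1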

We would like to know what causes such disappointing NVMe performance compared to the SSDs. Does putting the drives in software or hardware RAID hurt NVMe performance enough to make its read/write performance worse than the SSDs'? If so, why do the SSDs work almost perfectly under software or hardware RAID?
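
If it helps narrow this down, a non-destructive way to compare a raw member drive against the md array is a read-only fio run (a sketch; --readonly guards against accidental writes, and the device names match server1). If the raw device is fast and the array is slow, the RAID layer is the suspect; if both are slow, it is the drive or the platform:

# fio --readonly --name=rawread --ioengine=libaio --direct=1 --bs=4k --iodepth=64 --readwrite=randread --runtime=30 --time_based --filename=/dev/nvme0n1
# fio --readonly --name=mdread --ioengine=libaio --direct=1 --bs=4k --iodepth=64 --readwrite=randread --runtime=30 --time_based --filename=/dev/md127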
