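I have been setting up raid1 arrays, plus an encrypted medium created with cryptsetup using default options. The raid arrays are meant to use 2 drives each, but for now I have only 1 drive in each raid1 array so that I can compare their performance.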
Performance
Unencrypted array
Write performance
dd if=/dev/zero of=/media/storage/Temp/test.img bs=100M count=10
10+0 records in
10+0 records out
1048576000 bytes (1.0 GB) copied, 7.35153 s, 143 MB/s
top output:
top - 10:30:02 up 2 days, 19:18, 2 users, load average: 0.00, 0.16, 0.72
Tasks: 147 total, 3 running, 144 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 21.4 sy, 0.0 ni, 75.0 id, 0.9 wa, 0.0 hi, 2.7 si, 0.0 st
KiB Mem: 4044256 total, 1135880 used, 2908376 free, 224624 buffers
KiB Swap: 7812496 total, 123488 used, 7689008 free, 470796 cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11591 root 20 0 109m 100m 572 R 98.5 2.5 0:03.12 dd
11592 root 20 0 0 0 0 R 98.5 0.0 0:00.24 flush-9:1
203 root 20 0 0 0 0 S 52.1 0.0 0:15.59 md1_raid1
Everything here seems to be as expected.
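One caveat on the dd figure in this section: the test file is only 1.0 GB on a machine with about 4 GB of RAM, so part of the reported rate may reflect the page cache rather than the disk. Below is a minimal sketch of a variant that makes dd flush the data before it prints its timing. It assumes GNU dd (conv=fdatasync is a GNU coreutils option); the function name and default sizes are hypothetical, and it has not been run on this machine.

```shell
#!/bin/sh
# Hypothetical helper: timed_write TARGET [BLOCK_SIZE] [COUNT]
# Writes zeros to TARGET and has dd call fdatasync() before it prints
# its timing, so the reported rate includes flushing the page cache.
# Assumes GNU dd; conv=fdatasync is not available in every dd.
timed_write() {
    target=$1
    bs=${2:-100M}
    count=${3:-10}
    dd if=/dev/zero of="$target" bs="$bs" count="$count" conv=fdatasync
}

# Example, same target as the test above (commented out on purpose):
# timed_write /media/storage/Temp/test.img 100M 10
```

If the rate with the flush included is clearly lower than 143 MB/s, the original number was partly measuring RAM.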
Read performance
hdparm -t /dev/md1
/dev/md1:
Timing buffered disk reads: 574 MB in 3.01 seconds = 190.95 MB/sec
Encrypted array
Write performance
dd if=/dev/zero of=/dev/mapper/galerkin_storage bs=100M count=100
100+0 records in
100+0 records out
10485760000 bytes (10 GB) copied, 209.058 s, 50.2 MB/s
top output:
top - 10:12:20 up 2 days, 19:00, 2 users, load average: 5.65, 2.92, 1.60
Tasks: 149 total, 6 running, 143 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 21.4 sy, 0.0 ni, 74.9 id, 0.9 wa, 0.0 hi, 2.7 si, 0.0 st
KiB Mem: 4044256 total, 3749816 used, 294440 free, 3155712 buffers
KiB Swap: 7812496 total, 132464 used, 7680032 free, 40892 cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10940 root 20 0 0 0 0 R 99.0 0.0 1:49.99 kworker/2:1
11538 root 20 0 0 0 0 R 94.5 0.0 1:28.32 kworker/3:1
11486 root 20 0 0 0 0 R 63.0 0.0 2:13.37 kworker/1:2
11489 root 20 0 0 0 0 R 27.0 0.0 0:52.80 flush-253:0
10910 root 20 0 0 0 0 R 22.5 0.0 2:06.59 kworker/0:2
1305 root 20 0 0 0 0 S 18.0 0.0 338:40.46 md3_raid1
11490 root 20 0 0 0 0 S 13.5 0.0 1:31.37 kworker/0:1
11539 root 20 0 109m 100m 572 D 13.5 2.5 0:23.25 dd
Read performance
hdparm -t /dev/mapper/galerkin_storage
/dev/mapper/galerkin_storage:
Timing buffered disk reads: 84 MB in 3.03 seconds = 27.73 MB/sec
Using dd
dd if=/dev/mapper/galerkin_storage of=/dev/null bs=100M count=100
100+0 records in
100+0 records out
10485760000 bytes (10 GB) copied, 369.272 s, 28.4 MB/s
top output:
top - 10:29:49 up 3 days, 19:18, 2 users, load average: 2.14, 2.69, 1.69
Tasks: 148 total, 2 running, 146 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 15.8 sy, 0.0 ni, 81.4 id, 0.8 wa, 0.0 hi, 2.0 si, 0.0 st
KiB Mem: 4044256 total, 1586852 used, 2457404 free, 1070080 buffers
KiB Swap: 7812496 total, 115916 used, 7696580 free, 67056 cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13963 root 20 0 0 0 0 R 84.9 0.0 3:55.93 kworker/2:0
13773 root 20 0 0 0 0 S 30.3 0.0 2:38.38 kworker/3:2
14158 root 20 0 109m 100m 572 D 18.2 2.5 0:08.50 dd
14170 robert 20 0 23168 1448 1076 R 6.1 0.0 0:00.02 top
1 root 20 0 10648 708 704 S 0.0 0.0 0:05.26 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.17 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 1:05.31 ksoftirqd/0
5 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kworker/u:0
6 root rt 0 0 0 0 S 0.0 0.0 0:00.14 migration/0
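My conclusions
Write performance seems to be limited by my CPU, since top reports the kworker threads using 60-98% CPU. I can accept that my Intel Atom dual core is not built for performance. What surprises me is that read performance (1) is lower than write performance and (2) does not seem to be limited by the CPU.
My thinking is that read performance should be roughly equal to write performance. Should I simply upgrade to a recent version of Debian instead of doing archaeology? Is the cryptsetup version I am using (1.4.3) not as multi-threaded for reading as it is for writing? Writing seems to use 4 different threads, while reading seems to use only 2.
I have already looked at the question "Very poor performance of LUKS/LVM/RAID combination under Debian Squeeze", but I do not seem to have the same problem, because my top output shows 4 processes for kcryptd, which suggests that my cryptsetup is indeed multi-threaded.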
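To put numbers on the read/write gap described above, here is a small sketch that recomputes the rates from the byte counts and elapsed times dd printed for the encrypted array. The helper names are hypothetical; the inputs are taken straight from the dd output in this post.

```shell
#!/bin/sh
# Hypothetical helper: throughput_mbs BYTES SECONDS
# Prints the rate in MB/s (1 MB = 1000000 bytes, the unit dd reports).
throughput_mbs() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

# Hypothetical helper: ratio A B
# Prints A / B with two decimals.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

throughput_mbs 10485760000 209.058   # encrypted write
throughput_mbs 10485760000 369.272   # encrypted read
ratio 369.272 209.058                # how much longer the read took
```

The first two lines reproduce the 50.2 MB/s and 28.4 MB/s that dd reported, and the ratio shows that reading the same 10 GB took about 1.77 times as long as writing it.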
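Background information
The raid1 arrays currently contain only 1 drive each, because I want to compare them against each other.
luksDump of my encrypted medium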
LUKS header information for /dev/md3
Version: 1
Cipher name: aes
Cipher mode: cbc-essiv:sha256
Hash spec: sha1
Payload offset: 4096
MK bits: 256
MK digest:
MK salt:
MK iterations: 12250
UUID: 022e94a0-9dce-45c1-806b-9fb54cfabf9b
Key Slot 0: ENABLED
Iterations: 49360
Salt:
Key material offset: 8
AF stripes: 4000
Key Slot 1: DISABLED
Key Slot 2: DISABLED
Key Slot 3: DISABLED
Key Slot 4: DISABLED
Key Slot 5: DISABLED
Key Slot 6: DISABLED
Key Slot 7: DISABLED
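For comparing against per-cipher benchmark figures, it helps to reduce the dump above to a single cipher spec. A minimal sketch, assuming the LUKS1 field labels shown above ("Cipher name", "Cipher mode", "MK bits"); the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads `cryptsetup luksDump` output on stdin and
# prints "<cipher>-<mode> <key bits>".
# Assumes the LUKS1 labels "Cipher name", "Cipher mode" and "MK bits".
luks_cipher_spec() {
    awk '
        # Strip everything up to the first colon plus following blanks,
        # so a mode such as "cbc-essiv:sha256" keeps its own colon.
        /^Cipher name:/ { sub(/^[^:]*:[ \t]*/, ""); name = $0 }
        /^Cipher mode:/ { sub(/^[^:]*:[ \t]*/, ""); mode = $0 }
        /^MK bits:/     { sub(/^[^:]*:[ \t]*/, ""); bits = $0 }
        END { print name "-" mode " " bits }
    '
}

# Example (needs root, so it is left commented out):
# cryptsetup luksDump /dev/md3 | luks_cipher_spec
```

For the dump above this gives aes-cbc-essiv:sha256 with a 256-bit key.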
Kernel
uname -ra
Linux galerkin 3.2.0-4-amd64 #1 SMP Debian 3.2.73-2+deb7u2 x86_64 GNU/Linux
Debian version
cat /etc/debian_version
7.9
Cryptsetup version
cryptsetup --version
cryptsetup 1.4.3
The encrypted array was set up with
cryptsetup -v luksFormat /dev/md3 --key-file=/root/key-file
The raid array was set up with
mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/sda4 missing
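Because the array is created with one member given as "missing", it runs as a degraded raid1. A small sketch for confirming that state from /proc/mdstat, assuming the usual layout where the line after the device line contains a "[total/active]" pair such as "[2/1]"; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: md_state NAME [MDSTAT_FILE]
# Prints "degraded" if fewer members are active than the array expects,
# otherwise "complete". MDSTAT_FILE defaults to /proc/mdstat.
md_state() {
    awk -v dev="$1" '
        $1 == dev { found = 1; next }
        found && match($0, /\[[0-9]+\/[0-9]+\]/) {
            # The matched text looks like "[2/1]": expected/active members.
            split(substr($0, RSTART + 1, RLENGTH - 2), n, "/")
            state = (n[2] + 0 < n[1] + 0) ? "degraded" : "complete"
            print state
            exit
        }
    ' "${2:-/proc/mdstat}"
}

# Example (commented out; md3 only exists on the machine in question):
# md_state md3
```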
Processor information
cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 28
model name : Intel(R) Atom(TM) CPU D525 @ 1.80GHz
stepping : 10
microcode : 0x107
cpu MHz : 1800.136
cache size : 512 KB
The CPU is reported as 4 processors like the one above.
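The count of 4 is the number of logical processors (the D525 has two cores with Hyper-Threading). A one-line sketch for checking it without reading the whole file; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count_cpus [FILE]
# Counts the "processor" entries; FILE defaults to /proc/cpuinfo.
count_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Example (commented out; the result depends on the machine):
# count_cpus
```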
Edit: The version given in the title was wrong. The correct version is 7.9 (Wheezy).
Edit: Updated to cryptsetup 1.6.6
cryptsetup benchmark
# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1 204800 iterations per second
PBKDF2-sha256 151703 iterations per second
PBKDF2-sha512 79824 iterations per second
PBKDF2-ripemd160 169562 iterations per second
PBKDF2-whirlpool 30913 iterations per second
# Algorithm | Key | Encryption | Decryption
aes-cbc 128b 39.5 MiB/s 43.5 MiB/s
serpent-cbc 128b 29.3 MiB/s 32.0 MiB/s
twofish-cbc 128b 34.0 MiB/s 46.4 MiB/s
aes-cbc 256b 30.6 MiB/s 32.8 MiB/s
serpent-cbc 256b 29.8 MiB/s 32.0 MiB/s
twofish-cbc 256b 34.4 MiB/s 46.5 MiB/s
aes-xts 256b 43.0 MiB/s 44.2 MiB/s
serpent-xts 256b 31.5 MiB/s 32.3 MiB/s
twofish-xts 256b 33.1 MiB/s 34.2 MiB/s
aes-xts 512b 32.7 MiB/s 33.2 MiB/s
serpent-xts 512b 31.8 MiB/s 32.3 MiB/s
twofish-xts 512b 33.4 MiB/s 34.1 MiB/s
New performance measurements of the encrypted array with cryptsetup 1.6.6
Write performance
dd if=/dev/zero of=/dev/mapper/galerkin_storage bs=100M count=100
100+0 records in
100+0 records out
10485760000 bytes (10 GB) copied, 207.493 s, 50.5 MB/s
top output while writing:
top - 21:42:48 up 22 min, 2 users, load average: 2.96, 1.07, 0.69
Tasks: 142 total, 7 running, 135 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.2 us, 12.6 sy, 0.0 ni, 82.6 id, 4.2 wa, 0.0 hi, 0.4 si, 0.0 st
KiB Mem: 4044256 total, 3252544 used, 791712 free, 2721776 buffers
KiB Swap: 7812496 total, 44 used, 7812452 free, 65520 cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4379 root 20 0 0 0 0 R 93.5 0.0 0:24.72 kworker/1:2
4377 root 20 0 0 0 0 R 82.5 0.0 0:03.55 kworker/2:0
4378 root 20 0 0 0 0 R 82.5 0.0 0:31.93 kworker/3:1
4336 root 20 0 0 0 0 R 55.0 0.0 0:33.53 kworker/0:0
189 root 20 0 0 0 0 S 44.0 0.0 0:13.94 md3_raid1
4380 root 20 0 105m 100m 540 R 11.0 2.5 0:09.26 dd
4396 robert 20 0 23348 1396 1032 R 11.0 0.0 0:00.03 top
1 root 20 0 15468 900 740 S 0.0 0.0 0:01.15 init
Read performance
dd if=/dev/mapper/galerkin_storage of=/dev/null bs=100M count=100
100+0 records in
100+0 records out
10485760000 bytes (10 GB) copied, 368.387 s, 28.5 MB/s
top output while reading:
top - 21:25:17 up 4 min, 2 users, load average: 0.57, 0.20, 0.09
Tasks: 141 total, 2 running, 139 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.8 us, 3.7 sy, 0.0 ni, 91.9 id, 3.6 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem: 4044256 total, 1055628 used, 2988628 free, 611612 buffers
KiB Swap: 7812496 total, 0 used, 7812496 free, 130004 cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11 root 20 0 0 0 0 R 54.5 0.0 0:07.55 kworker/0:1
30 root 20 0 0 0 0 S 30.3 0.0 0:10.40 kworker/2:1
9 root 20 0 0 0 0 S 24.2 0.0 0:02.59 kworker/1:0
4287 root 20 0 105m 100m 540 D 24.2 2.5 0:04.63 dd
4288 root 20 0 0 0 0 S 12.1 0.0 0:04.24 kworker/3:2
4306 robert 20 0 23348 1404 1032 R 6.1 0.0 0:00.02 top
1 root 20 0 15468 900 740 S 0.0 0.0 0:01.13 init
Using hdparm
hdparm -t /dev/mapper/galerkin_storage
/dev/mapper/galerkin_storage:
Timing buffered disk reads: 84 MB in 3.06 seconds = 27.44 MB/sec
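So read performance is still far below write performance. If I interpret luksDump correctly, I have 256-bit aes-cbc. The benchmark command suggests that its read performance should be in the range of what my dd benchmark shows. The write performance, however, is unexpectedly high. One thing stands out to me: I had previously filled the encrypted partition with /dev/zero, so could it be that no actual writes are needed because the data is already zero?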
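One detail when comparing these numbers: cryptsetup benchmark reports MiB/s, while dd reports MB/s. A small sketch of the conversion for the aes-cbc 256b row of the benchmark; the function name is hypothetical, and the inputs are the figures from the benchmark output in this post.

```shell
#!/bin/sh
# Hypothetical helper: mib_to_mb VALUE
# Converts a rate in MiB/s (1048576 bytes) to MB/s (1000000 bytes).
mib_to_mb() {
    awk -v v="$1" 'BEGIN { printf "%.1f\n", v * 1048576 / 1000000 }'
}

mib_to_mb 32.8   # aes-cbc 256b decryption from the benchmark
mib_to_mb 30.6   # aes-cbc 256b encryption from the benchmark
```

In dd's units the benchmark says roughly 34.4 MB/s for decryption and 32.1 MB/s for encryption. The measured 28.5 MB/s read is somewhat below the first figure, and the measured 50.5 MB/s write is well above the second. Whether the benchmark figure is per core or for the whole machine is something I have not verified.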