/dev/mapper/dm_crypt-0 is a LUKS device backed by /dev/sdc3:
$ sudo pv /dev/sdc3 >/dev/null
[ 503MiB/s]
$ sudo pv /dev/mapper/dm_crypt-0 >/dev/null
[72.0MiB/s]
So the encrypted device is about 7 times slower than the raw device (72 MiB/s versus 503 MiB/s). Why?
top says:
top - 20:07:52 up 9 min, 2 users, load average: 2.03, 1.42, 0.83
Tasks: 604 total, 3 running, 601 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 3.6 sy, 0.0 ni, 96.1 id, 0.2 wa, 0.0 hi, 0.0 si, 0.0 st
GiB Mem : 472.4 total, 397.7 free, 0.9 used, 73.8 buff/cache
GiB Swap: 8.0 total, 8.0 free, 0.0 used. 469.3 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4845 root 20 0 0 0 0 R 81.2 0.0 0:21.96 kworker/u101:2+kcryptd/253:0
4846 root 20 0 0 0 0 R 81.2 0.0 0:16.22 kworker/u101:4+kcryptd/253:0
4844 root 20 0 5640 2200 1960 D 13.8 0.0 0:08.04 pv
4725 tange 20 0 9912 4624 3220 R 1.0 0.0 0:06.55 top
So the likely cause is that only 2 kcryptd workers are doing the decryption. The system has 48 cores.
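To make the "only 2 kcryptd workers" observation repeatable, here is a minimal sketch that counts running kcryptd workers in top's batch output. The function name count_busy_kcryptd is hypothetical, and the field positions (state in column 8, command in the last column) assume the default top layout shown above.

```shell
# count_busy_kcryptd: read top-style process lines on stdin and print how many
# kcryptd workers are in state R (running).
# Assumption: default top columns, i.e. field 8 is the state (S) and the last
# field is the command name.
count_busy_kcryptd() {
    awk '$8 == "R" && $NF ~ /kcryptd/ { n++ } END { print n + 0 }'
}
```

While the pv read is running, it could be fed from a single batch-mode snapshot, for example: top -b -n 1 | count_busy_kcryptd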
If performance is limited by these 2 kcryptd workers, how can I get more of the 48 cores to do the work and reach 500 MB/s?
I tested whether it could be a single-thread problem:
parallel --recend '' --pipepart -a /dev/mapper/dm_crypt-0 --block -1 'cat >/dev/null'
According to top, this delivers the full 500 MB/s and activates more kcryptd workers. That is good, because it means LUKS can deliver 500 MB/s.
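For anyone without GNU parallel, the same idea (several readers, each covering a different part of the device) can be sketched with plain dd. This is only an illustration of the technique, not something I have measured on this system. The names BLOCK, chunk_plan and parallel_read are hypothetical, and the job count is assumed to be at least 1.

```shell
BLOCK=1048576   # dd block size in bytes (1 MiB); an arbitrary choice

# chunk_plan SIZE JOBS: split SIZE bytes into at most JOBS contiguous chunks
# and print one "skip count" pair (both in units of BLOCK) per chunk.
chunk_plan() {
    size=$1
    jobs=$2
    blocks=$(( (size + BLOCK - 1) / BLOCK ))      # total blocks, rounded up
    per_job=$(( (blocks + jobs - 1) / jobs ))     # blocks per reader, rounded up
    skip=0
    while [ "$skip" -lt "$blocks" ]; do
        count=$(( blocks - skip ))
        [ "$count" -gt "$per_job" ] && count=$per_job
        echo "$skip $count"
        skip=$(( skip + per_job ))
    done
}

# parallel_read FILE JOBS: read FILE with up to JOBS concurrent dd processes
# and discard the data. FILE may be a block device (needs root) or a regular file.
parallel_read() {
    file=$1
    jobs=$2
    if [ -b "$file" ]; then
        size=$(blockdev --getsize64 "$file")
    else
        size=$(wc -c < "$file")
    fi
    chunk_plan "$size" "$jobs" | {
        while read -r skip count; do
            dd if="$file" of=/dev/null bs="$BLOCK" skip="$skip" count="$count" 2>/dev/null &
        done
        wait    # wait for every background dd started in this subshell
    }
}
```

A run comparable to the test above would be: time sudo sh -c '. ./parallel_read.sh; parallel_read /dev/mapper/dm_crypt-0 48' (the file name parallel_read.sh is just a placeholder for wherever the functions are saved).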
Sequential write is also fine (not 500 MB/s, but around 300 MB/s, probably limited by the SSD's speed).
So the problem seems to be limited to sequential reads.
$ cat /proc/cpuinfo
processor : 47
vendor_id : AuthenticAMD
cpu family : 16
model : 9
model name : AMD Opteron(tm) Processor 6174
stepping : 1
microcode : 0x10000d9
cpu MHz : 2200.035
cache size : 512 KB
physical id : 1
siblings : 12
core id : 5
cpu cores : 12
apicid : 27
initial apicid : 27
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr hw_pstate vmmcall npt lbrv svm_lock nrip_save pausefilter
bugs : tlb_mmatch fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2
bogomips : 4400.20
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate
$ sudo cryptsetup luksDump /dev/sdc3
LUKS header information
Version: 2
Epoch: 18
Metadata area: 16384 [bytes]
Keyslots area: 16744448 [bytes]
UUID: 9498ddbe-9613-4ae3-8fb4-e65913d800c8
Label: (no label)
Subsystem: (no subsystem)
Flags: (no flags)
Data segments:
0: crypt
offset: 16777216 [bytes]
length: (whole device)
cipher: aes-xts-plain64
sector: 512 [bytes]
$ sudo cryptsetup benchmark
# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1 260580 iterations per second for 256-bit key
PBKDF2-sha256 717220 iterations per second for 256-bit key
PBKDF2-sha512 565574 iterations per second for 256-bit key
PBKDF2-ripemd160 403919 iterations per second for 256-bit key
PBKDF2-whirlpool 262669 iterations per second for 256-bit key
argon2i 4 iterations, 256407 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
argon2id 4 iterations, 260616 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
# Algorithm | Key | Encryption | Decryption
aes-cbc 128b 74.8 MiB/s 130.1 MiB/s
serpent-cbc 128b 47.6 MiB/s 144.4 MiB/s
twofish-cbc 128b 116.1 MiB/s 137.4 MiB/s
aes-cbc 256b 33.5 MiB/s 68.5 MiB/s
serpent-cbc 256b 53.4 MiB/s 144.3 MiB/s
twofish-cbc 256b 125.7 MiB/s 150.8 MiB/s
aes-xts 256b 130.3 MiB/s 132.6 MiB/s
serpent-xts 256b 119.3 MiB/s 138.9 MiB/s
twofish-xts 256b 135.4 MiB/s 144.2 MiB/s
aes-xts 512b 99.0 MiB/s 102.7 MiB/s
serpent-xts 512b 136.8 MiB/s 138.9 MiB/s
twofish-xts 512b 144.9 MiB/s 144.3 MiB/s
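As a rough cross-check of the benchmark against the 503 MiB/s raw read speed, here is a small sketch that estimates how many cores would have to decrypt concurrently. It assumes the benchmark figures are per-thread rates and that decryption scales linearly across cores. The luksDump excerpt above does not show the key size, so both aes-xts rows are possible candidates. The function name cores_needed is hypothetical.

```shell
# cores_needed TARGET PER_CORE: smallest whole number of cores whose combined
# rate reaches TARGET, given PER_CORE throughput for one core (same units).
# Assumption: throughput adds up linearly across cores.
cores_needed() {
    awk -v t="$1" -v p="$2" 'BEGIN { n = int(t / p); if (n * p < t) n++; print n }'
}

cores_needed 503 102.7   # aes-xts 512b decryption rate from the table
cores_needed 503 132.6   # aes-xts 256b decryption rate from the table
```

Under those assumptions this prints 5 and 4, so two busy kcryptd workers could not reach the raw device speed on this CPU, while a handful of the 48 cores could.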