LUKS: Speed up decryption

/dev/mapper/dm_crypt-0 is a LUKS device backed by /dev/sdc3.

$ sudo pv /dev/sdc3 >/dev/null
[ 503MiB/s]

$ sudo pv /dev/mapper/dm_crypt-0  >/dev/null
[72.0MiB/s]

So the encrypted device is about 7 times slower than the raw device. Why?

top says:

top - 20:07:52 up 9 min,  2 users,  load average: 2.03, 1.42, 0.83
Tasks: 604 total,   3 running, 601 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  3.6 sy,  0.0 ni, 96.1 id,  0.2 wa,  0.0 hi,  0.0 si,  0.0 st
GiB Mem :    472.4 total,    397.7 free,      0.9 used,     73.8 buff/cache
GiB Swap:      8.0 total,      8.0 free,      0.0 used.    469.3 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                        
   4845 root      20   0       0      0      0 R  81.2   0.0   0:21.96 kworker/u101:2+kcryptd/253:0                   
   4846 root      20   0       0      0      0 R  81.2   0.0   0:16.22 kworker/u101:4+kcryptd/253:0                   
   4844 root      20   0    5640   2200   1960 D  13.8   0.0   0:08.04 pv                                             
   4725 tange     20   0    9912   4624   3220 R   1.0   0.0   0:06.55 top                                            

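So this may be because only 2 kcryptd threads are doing the decryption. The system has 48 cores.

If performance is limited by these 2 kcryptd threads, how can I put more of the 48 cores to work and get 500 MB/s?

I tested whether this might be a single-threading problem: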

parallel --recend '' --pipepart -a /dev/mapper/dm_crypt-0 --block -1 'cat >/dev/null'

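This gives the full 500 MB/s and activates many more kcryptd threads, according to top. That is great, because it means LUKS can deliver 500 MB/s.

Sequential writing is also fine (not 500 MB/s, but around 300 MB/s - probably limited by the speed of the SSD).

So the problem seems to be limited to sequential reading.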
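
For readers without GNU parallel, roughly the same experiment can be sketched with plain dd: several readers run at once, each over its own disjoint range of the device. This is only a sketch, not the command used above. The function name parallel_read and its two arguments are made up for illustration, and it assumes bash plus GNU coreutils (stat -c, dd status=none).

```shell
#!/usr/bin/env bash
# parallel_read TARGET JOBS
# Read TARGET (a regular file or block device) with up to JOBS concurrent
# dd readers, each covering a disjoint range, and discard the data.
# Hypothetical helper for illustration only.
parallel_read() {
    local target=$1 jobs=$2 bs=$((1024 * 1024))
    local size blocks per_job i pid rc=0
    local pids=()

    # Block devices report their size via blockdev, regular files via stat.
    if [ -b "$target" ]; then
        size=$(blockdev --getsize64 "$target") || return 1
    else
        size=$(stat -c %s "$target") || return 1
    fi

    blocks=$(( (size + bs - 1) / bs ))          # total 1 MiB blocks, rounded up
    per_job=$(( (blocks + jobs - 1) / jobs ))   # blocks per reader, rounded up

    for (( i = 0; i < jobs; i++ )); do
        # Stop once the start offset would lie past the end of the target.
        [ $(( i * per_job )) -ge "$blocks" ] && break
        dd if="$target" of=/dev/null bs="$bs" \
           skip=$(( i * per_job )) count="$per_job" status=none &
        pids+=("$!")
    done

    # Wait for every reader and report failure if any of them failed.
    for pid in "${pids[@]}"; do
        wait "$pid" || rc=1
    done
    return "$rc"
}
```

Run as root and wrapped in `time`, for example `time parallel_read /dev/mapper/dm_crypt-0 8`, this should show whether throughput scales with the number of concurrent readers, which is what the parallel test above suggests.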

$ cat /proc/cpuinfo
processor : 47
vendor_id : AuthenticAMD
cpu family : 16
model    : 9
model name : AMD Opteron(tm) Processor 6174
stepping : 1
microcode : 0x10000d9
cpu MHz  : 2200.035
cache size : 512 KB
physical id  : 1
siblings : 12
core id  : 5
cpu cores : 12
apicid   : 27
initial apicid  : 27
fpu     : yes
fpu_exception   : yes
cpuid level  : 5
wp      : yes
flags    : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr hw_pstate vmmcall npt lbrv svm_lock nrip_save pausefilter
bugs    : tlb_mmatch fxsave_leak sysret_ss_attrs null_seg spectre_v1 spectre_v2
bogomips : 4400.20
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate
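
Note that the flags line above does not contain `aes`, which suggests this CPU has no AES-NI instructions and AES runs in software. That would be consistent with the modest aes-xts numbers in the benchmark further down, though it is an inference and not something measured here. A minimal sketch for checking a flag, where the helper name has_cpu_flag is hypothetical:

```shell
# has_cpu_flag FLAG [FILE]
# Succeed if FLAG appears as a whole word in the first "flags" line of FILE
# (default: /proc/cpuinfo). Hypothetical helper for illustration only.
has_cpu_flag() {
    local flag=$1 file=${2:-/proc/cpuinfo}
    grep -m1 '^flags' "$file" \
        | cut -d: -f2 \
        | tr -s ' \t' '\n' \
        | grep -qx -- "$flag"
}

# Example use (not executed here):
#   has_cpu_flag aes && echo "AES-NI available" || echo "no AES-NI"
```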

$ sudo cryptsetup luksDump /dev/sdc3
LUKS header information
Version:        2
Epoch:          18
Metadata area:  16384 [bytes]
Keyslots area:  16744448 [bytes]
UUID:           9498ddbe-9613-4ae3-8fb4-e65913d800c8
Label:          (no label)
Subsystem:      (no subsystem)
Flags:          (no flags)

Data segments:
  0: crypt
     offset: 16777216 [bytes]
     length: (whole device)
     cipher: aes-xts-plain64
     sector: 512 [bytes]

$ sudo cryptsetup benchmark
# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1       260580 iterations per second for 256-bit key
PBKDF2-sha256     717220 iterations per second for 256-bit key
PBKDF2-sha512     565574 iterations per second for 256-bit key
PBKDF2-ripemd160  403919 iterations per second for 256-bit key
PBKDF2-whirlpool  262669 iterations per second for 256-bit key
argon2i       4 iterations, 256407 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
argon2id      4 iterations, 260616 memory, 4 parallel threads (CPUs) for 256-bit key (requested 2000 ms time)
#     Algorithm |       Key |      Encryption |      Decryption
        aes-cbc        128b        74.8 MiB/s       130.1 MiB/s
    serpent-cbc        128b        47.6 MiB/s       144.4 MiB/s
    twofish-cbc        128b       116.1 MiB/s       137.4 MiB/s
        aes-cbc        256b        33.5 MiB/s        68.5 MiB/s
    serpent-cbc        256b        53.4 MiB/s       144.3 MiB/s
    twofish-cbc        256b       125.7 MiB/s       150.8 MiB/s
        aes-xts        256b       130.3 MiB/s       132.6 MiB/s
    serpent-xts        256b       119.3 MiB/s       138.9 MiB/s
    twofish-xts        256b       135.4 MiB/s       144.2 MiB/s
        aes-xts        512b        99.0 MiB/s       102.7 MiB/s
    serpent-xts        512b       136.8 MiB/s       138.9 MiB/s
    twofish-xts        512b       144.9 MiB/s       144.3 MiB/s
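
As a rough sanity check, the benchmark figures can be compared with the 503 MiB/s of the raw device. The sketch below assumes that each benchmark figure is approximately what one core can do and that decryption scales perfectly across cores. Both are idealisations. The key size is also an assumption, since the luksDump excerpt above does not show it, so both aes-xts rows are tried. The function name cores_needed is hypothetical.

```shell
# cores_needed TARGET PER_CORE
# Print the smallest whole number of cores whose combined throughput reaches
# TARGET, assuming every core delivers PER_CORE (same unit) and scaling is
# perfect. Hypothetical helper for illustration only.
cores_needed() {
    awk -v target="$1" -v per_core="$2" 'BEGIN {
        n = int(target / per_core)
        if (n * per_core < target) n++   # round up
        print n
    }'
}

cores_needed 503 132.6   # aes-xts 256b decryption -> prints 4
cores_needed 503 102.7   # aes-xts 512b decryption -> prints 5
```

Under these assumptions, four or five busy kcryptd workers would be enough to keep up with the SSD, far fewer than the 48 cores available.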
