I have been setting up RAID1 arrays and have created an encrypted volume on one of them using cryptsetup with default options. Each array is supposed to use 2 drives, but for the moment each RAID1 array contains only 1 drive, so that I can compare the performance of the encrypted and unencrypted arrays.
Performance
The unencrypted array
Write performance
dd if=/dev/zero of=/media/storage/Temp/test.img bs=100M count=10
10+0 records in
10+0 records out
1048576000 bytes (1.0 GB) copied, 7.35153 s, 143 MB/s
Top output:
top - 10:30:02 up 2 days, 19:18, 2 users, load average: 0.00, 0.16, 0.72
Tasks: 147 total, 3 running, 144 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 21.4 sy, 0.0 ni, 75.0 id, 0.9 wa, 0.0 hi, 2.7 si, 0.0 st
KiB Mem: 4044256 total, 1135880 used, 2908376 free, 224624 buffers
KiB Swap: 7812496 total, 123488 used, 7689008 free, 470796 cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11591 root 20 0 109m 100m 572 R 98.5 2.5 0:03.12 dd
11592 root 20 0 0 0 0 R 98.5 0.0 0:00.24 flush-9:1
203 root 20 0 0 0 0 S 52.1 0.0 0:15.59 md1_raid1
Everything here seems to be as expected.
Read performance
hdparm -t /dev/md1
/dev/md1:
Timing buffered disk reads: 574 MB in 3.01 seconds = 190.95 MB/sec
The encrypted array
Write performance
dd if=/dev/zero of=/dev/mapper/galerkin_storage bs=100M count=100
100+0 records in
100+0 records out
10485760000 bytes (10 GB) copied, 209.058 s, 50.2 MB/s
Top output:
top - 10:12:20 up 2 days, 19:00, 2 users, load average: 5.65, 2.92, 1.60
Tasks: 149 total, 6 running, 143 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 21.4 sy, 0.0 ni, 74.9 id, 0.9 wa, 0.0 hi, 2.7 si, 0.0 st
KiB Mem: 4044256 total, 3749816 used, 294440 free, 3155712 buffers
KiB Swap: 7812496 total, 132464 used, 7680032 free, 40892 cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10940 root 20 0 0 0 0 R 99.0 0.0 1:49.99 kworker/2:1
11538 root 20 0 0 0 0 R 94.5 0.0 1:28.32 kworker/3:1
11486 root 20 0 0 0 0 R 63.0 0.0 2:13.37 kworker/1:2
11489 root 20 0 0 0 0 R 27.0 0.0 0:52.80 flush-253:0
10910 root 20 0 0 0 0 R 22.5 0.0 2:06.59 kworker/0:2
1305 root 20 0 0 0 0 S 18.0 0.0 338:40.46 md3_raid1
11490 root 20 0 0 0 0 S 13.5 0.0 1:31.37 kworker/0:1
11539 root 20 0 109m 100m 572 D 13.5 2.5 0:23.25 dd
Read performance
hdparm -t /dev/mapper/galerkin_storage
/dev/mapper/galerkin_storage:
Timing buffered disk reads: 84 MB in 3.03 seconds = 27.73 MB/sec
Using dd
dd if=/dev/mapper/galerkin_storage of=/dev/null bs=100M count=100
100+0 records in
100+0 records out
10485760000 bytes (10 GB) copied, 369.272 s, 28.4 MB/s
Top output:
top - 10:29:49 up 3 days, 19:18, 2 users, load average: 2.14, 2.69, 1.69
Tasks: 148 total, 2 running, 146 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 15.8 sy, 0.0 ni, 81.4 id, 0.8 wa, 0.0 hi, 2.0 si, 0.0 st
KiB Mem: 4044256 total, 1586852 used, 2457404 free, 1070080 buffers
KiB Swap: 7812496 total, 115916 used, 7696580 free, 67056 cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13963 root 20 0 0 0 0 R 84.9 0.0 3:55.93 kworker/2:0
13773 root 20 0 0 0 0 S 30.3 0.0 2:38.38 kworker/3:2
14158 root 20 0 109m 100m 572 D 18.2 2.5 0:08.50 dd
14170 robert 20 0 23168 1448 1076 R 6.1 0.0 0:00.02 top
1 root 20 0 10648 708 704 S 0.0 0.0 0:05.26 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.17 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 1:05.31 ksoftirqd/0
5 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kworker/u:0
6 root rt 0 0 0 0 S 0.0 0.0 0:00.14 migration/0
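A note on methodology: the dd figures above go through the page cache and the normal write-back path. GNU dd has conv=fdatasync (flush the data to the device before computing the rate) and iflag=direct (bypass the page cache on reads), which should give more conservative numbers; a sketch with the same devices as above:
# Write test: force the data out to the device before dd reports a rate
dd if=/dev/zero of=/dev/mapper/galerkin_storage bs=100M count=100 conv=fdatasync
# Read test: bypass the page cache entirely
dd if=/dev/mapper/galerkin_storage of=/dev/null bs=100M count=100 iflag=direct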
My conclusion
The write performance seems to be limited by CPU performance, since top reports the kworker threads using 60-98% CPU. I can accept that my dual-core Intel Atom is not built for performance. What surprises me is that the read performance is (1) lower than the write performance and (2) does not seem to be limited by CPU performance.
Is my notion correct that the read performance should be roughly equal to the write performance? Should I simply update to the latest version of Debian and stop doing archaeology? Is the version of cryptsetup I'm using (1.4.3) less multithreaded for reading than for writing? The write seems to keep 4 different kworker threads busy, while the read leaves the CPUs largely idle.
I have looked at the question Very poor performance on LUKS/LVM/RAID combination under Debian Squeeze, but I don't seem to have the same issue: my top output displays 4 kworker processes handling the encryption, which suggests my dm-crypt setup really is multithreading.
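One diagnostic that could separate a cipher bottleneck from an I/O-queueing one is the kernel read-ahead on the dm-crypt mapping, which is sometimes configured much lower than on the underlying md device. A sketch using blockdev from util-linux (the 8192-sector value is only an example to experiment with):
# Read-ahead, in 512-byte sectors, for the raw array and for the mapping
blockdev --getra /dev/md3
blockdev --getra /dev/mapper/galerkin_storage
# Tentatively raise the read-ahead on the mapping and re-run the read test;
# a large jump would point at queueing rather than raw cipher speed
blockdev --setra 8192 /dev/mapper/galerkin_storage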
Background info
The RAID1 arrays each include only 1 drive at the moment, because I wanted to compare them to each other.
luksDump of my encrypted medium:
cryptsetup luksDump /dev/md3
LUKS header information for /dev/md3
Version: 1
Cipher name: aes
Cipher mode: cbc-essiv:sha256
Hash spec: sha1
Payload offset: 4096
MK bits: 256
MK digest:
MK salt:
MK iterations: 12250
UUID: 022e94a0-9dce-45c1-806b-9fb54cfabf9b
Key Slot 0: ENABLED
Iterations: 49360
Salt:
Key material offset: 8
AF stripes: 4000
Key Slot 1: DISABLED
Key Slot 2: DISABLED
Key Slot 3: DISABLED
Key Slot 4: DISABLED
Key Slot 5: DISABLED
Key Slot 6: DISABLED
Key Slot 7: DISABLED
Kernel
uname -ra
Linux galerkin 3.2.0-4-amd64 #1 SMP Debian 3.2.73-2+deb7u2 x86_64 GNU/Linux
Debian version
cat /etc/debian_version
7.9
Cryptsetup version
cryptsetup --version
cryptsetup 1.4.3
The encrypted array was set up with
cryptsetup -v luksFormat /dev/md3 --key-file=/root/key-file
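For comparison, luksFormat also accepts an explicit cipher and key size (the -c/--cipher and -s/--key-size options); the cryptsetup benchmark output further down suggests aes-xts would be somewhat faster than the default aes-cbc on this CPU. Re-formatting destroys the volume contents, so this is only a sketch for a fresh test volume:
# Hypothetical re-format with the xts cipher mode; WIPES the volume
cryptsetup -v luksFormat --cipher aes-xts-plain64 --key-size 256 /dev/md3 --key-file=/root/key-file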
The raid array was set up with
mdadm --create /dev/md3 --level=1 --raid-devices=2 /dev/sda4 missing
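When the second drive arrives, it would be attached with something like the following, where /dev/sdb4 is a hypothetical partition name; md then resynchronizes the mirror in the background:
# Attach the second half of the mirror to the degraded array
mdadm --add /dev/md3 /dev/sdb4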
Cpuinfo
cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 28
model name : Intel(R) Atom(TM) CPU D525 @ 1.80GHz
stepping : 10
microcode : 0x107
cpu MHz : 1800.136
cache size : 512 KB
cat /proc/cpuinfo reports 4 logical CPUs like the one above; the D525 is a dual core with Hyper-Threading.
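The flags line of /proc/cpuinfo (not shown above) reveals whether the CPU has the AES-NI instructions; the Atom D525 does not, so all of the encryption runs in software. A quick check:
# Prints "aes" once if AES-NI is available; empty output on this D525
grep -m1 -wo aes /proc/cpuinfo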
Edit: Wrong version given in title. Correct is 7.9 (Wheezy).
Edit: Updated to cryptsetup 1.6.6
cryptsetup benchmark
# Tests are approximate using memory only (no storage IO).
PBKDF2-sha1 204800 iterations per second
PBKDF2-sha256 151703 iterations per second
PBKDF2-sha512 79824 iterations per second
PBKDF2-ripemd160 169562 iterations per second
PBKDF2-whirlpool 30913 iterations per second
# Algorithm | Key | Encryption | Decryption
aes-cbc 128b 39.5 MiB/s 43.5 MiB/s
serpent-cbc 128b 29.3 MiB/s 32.0 MiB/s
twofish-cbc 128b 34.0 MiB/s 46.4 MiB/s
aes-cbc 256b 30.6 MiB/s 32.8 MiB/s
serpent-cbc 256b 29.8 MiB/s 32.0 MiB/s
twofish-cbc 256b 34.4 MiB/s 46.5 MiB/s
aes-xts 256b 43.0 MiB/s 44.2 MiB/s
serpent-xts 256b 31.5 MiB/s 32.3 MiB/s
twofish-xts 256b 33.1 MiB/s 34.2 MiB/s
aes-xts 512b 32.7 MiB/s 33.2 MiB/s
serpent-xts 512b 31.8 MiB/s 32.3 MiB/s
twofish-xts 512b 33.4 MiB/s 34.1 MiB/s
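The table does not include the cbc-essiv variant that luksDump reports. The benchmark command takes --cipher and --key-size options in 1.6.x; assuming it accepts the full dm-crypt cipher spec (which I have not verified), the exact configuration could be measured with:
# Benchmark the cipher/IV/key-size combination shown by luksDump
cryptsetup benchmark --cipher aes-cbc-essiv:sha256 --key-size 256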
New performance measurements for the encrypted array with cryptsetup 1.6.6
Write performance
dd if=/dev/zero of=/dev/mapper/galerkin_storage bs=100M count=100
100+0 records in
100+0 records out
10485760000 bytes (10 GB) copied, 207.493 s, 50.5 MB/s
Top output during the write:
top - 21:42:48 up 22 min, 2 users, load average: 2.96, 1.07, 0.69
Tasks: 142 total, 7 running, 135 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.2 us, 12.6 sy, 0.0 ni, 82.6 id, 4.2 wa, 0.0 hi, 0.4 si, 0.0 st
KiB Mem: 4044256 total, 3252544 used, 791712 free, 2721776 buffers
KiB Swap: 7812496 total, 44 used, 7812452 free, 65520 cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4379 root 20 0 0 0 0 R 93.5 0.0 0:24.72 kworker/1:2
4377 root 20 0 0 0 0 R 82.5 0.0 0:03.55 kworker/2:0
4378 root 20 0 0 0 0 R 82.5 0.0 0:31.93 kworker/3:1
4336 root 20 0 0 0 0 R 55.0 0.0 0:33.53 kworker/0:0
189 root 20 0 0 0 0 S 44.0 0.0 0:13.94 md3_raid1
4380 root 20 0 105m 100m 540 R 11.0 2.5 0:09.26 dd
4396 robert 20 0 23348 1396 1032 R 11.0 0.0 0:00.03 top
1 root 20 0 15468 900 740 S 0.0 0.0 0:01.15 init
Read performance
dd if=/dev/mapper/galerkin_storage of=/dev/null bs=100M count=100
100+0 records in
100+0 records out
10485760000 bytes (10 GB) copied, 368.387 s, 28.5 MB/s
Top output during the read:
top - 21:25:17 up 4 min, 2 users, load average: 0.57, 0.20, 0.09
Tasks: 141 total, 2 running, 139 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.8 us, 3.7 sy, 0.0 ni, 91.9 id, 3.6 wa, 0.0 hi, 0.1 si, 0.0 st
KiB Mem: 4044256 total, 1055628 used, 2988628 free, 611612 buffers
KiB Swap: 7812496 total, 0 used, 7812496 free, 130004 cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11 root 20 0 0 0 0 R 54.5 0.0 0:07.55 kworker/0:1
30 root 20 0 0 0 0 S 30.3 0.0 0:10.40 kworker/2:1
9 root 20 0 0 0 0 S 24.2 0.0 0:02.59 kworker/1:0
4287 root 20 0 105m 100m 540 D 24.2 2.5 0:04.63 dd
4288 root 20 0 0 0 0 S 12.1 0.0 0:04.24 kworker/3:2
4306 robert 20 0 23348 1404 1032 R 6.1 0.0 0:00.02 top
1 root 20 0 15468 900 740 S 0.0 0.0 0:01.13 init
With hdparm
hdparm -t /dev/mapper/galerkin_storage
/dev/mapper/galerkin_storage:
Timing buffered disk reads: 84 MB in 3.06 seconds = 27.44 MB/sec
So the read performance is still considerably lower than the write performance. If I interpret the luksDump correctly, I am using 256-bit aes-cbc. The benchmark suggests the read performance should indeed be in the region of my dd result, but by the same token the write performance is unexpectedly high: the benchmark reports only about 30 MiB/s for aes-cbc encryption with a 256-bit key, while dd measures 50 MB/s. One thing just struck me: I have previously filled the encrypted partition from /dev/zero, so could it be that the writes don't actually need to be performed, since the data on the disk is already zero?
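One way to test that hypothesis would be to write pseudorandom data instead of zeros, pre-generated into a file so that /dev/urandom (slow on an Atom) doesn't skew the timed run. With dm-crypt the ciphertext of zeros is not zeros, so the drive should see random-looking data either way, but this would make the test conclusive. A sketch (the file path is arbitrary):
# Pre-generate 1 GB of pseudorandom data, outside the timed run
dd if=/dev/urandom of=/root/random.img bs=1M count=1000
# Timed write of the pseudorandom data to the encrypted device
dd if=/root/random.img of=/dev/mapper/galerkin_storage bs=100M conv=fdatasync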