I cannot reproduce your observations. Here below is what I measure, along with two other tests with different types of arrays. Note first that my setup script does the following:
(sb-ext:restrict-compiler-policy 'debug 3)
(sb-ext:restrict-compiler-policy 'safety 3)
This might have an impact on code compilation.
Your tests
(print :list)
(let ((x (list 0 1 2 3 4 5 6 7 8 9)))
(time (dotimes (i 1000000000) (setf (nth 9 x) 0))))
(print :simple-vector)
(let ((x (vector 0 1 2 3 4 5 6 7 8 9)))
(time (dotimes (i 1000000000) (setf (aref x 9) 0))))
The trace is as follows:
:LIST
Evaluation took:
6.649 seconds of real time
6.649545 seconds of total run time (6.649545 user, 0.000000 system)
100.02% CPU
21,224,869,441 processor cycles
0 bytes consed
:SIMPLE-VECTOR
Evaluation took:
0.717 seconds of real time
0.716708 seconds of total run time (0.716708 user, 0.000000 system)
100.00% CPU
2,287,130,610 processor cycles
0 bytes consed
Regarding your original question, there is an almost 10 speed factor (9.27) difference between using lists and vectors. I don't know why your tests show differently (edit: so apparently it was related to undefined behaviours, I saw you mutated constant data and fixed it in my code without thinking it could be related :/)
Specialized arrays
I also tried to describe the array as precisely as possible, by declaring in its type the kind of elements it holds, (mod 10)
, and the array dimensions, (10)
. The actual array element type in SBCL is (UNSIGNED-BYTE 4)
, which means the array is likely to packed: multiple array elements can be stored in a machine word (32 or 64bits). Packing is good for space but needs coding/decoding steps when storing/accessing elements.
(print :packed-array)
(let ((x (make-array 10
:element-type '(mod 10)
:initial-contents '(0 1 2 3 4 5 6 7 8 9))))
(declare (type (array (mod 10) (10)) x)
(optimize (speed 3)))
(time (dotimes (i 1000000000)
(setf (aref x 9) (the (mod 10) 0)))))
I also have a version where the actual array element type (unsigned-byte 32)
, which should avoid doing conversions:
(print :unpacked-array)
(let ((x (make-array 10
:element-type '(unsigned-byte 32)
:initial-contents '(0 1 2 3 4 5 6 7 8 9))))
(declare (type (array (unsigned-byte 32) (10)) x)
(optimize (speed 3)))
(time (dotimes (i 1000000000)
(setf (aref x 9) (the (mod 10) 0)))))
Additional tests:
:PACKED-ARRAY
Evaluation took:
1.168 seconds of real time
1.167929 seconds of total run time (1.167929 user, 0.000000 system)
100.00% CPU
3,727,528,968 processor cycles
0 bytes consed
Here you can see that using the most precise type actually slows the tests in that particular case. This can probably be explained by the overhead of decoding/encoding operations (?)
:UNPACKED-ARRAY
Evaluation took:
0.231 seconds of real time
0.231094 seconds of total run time (0.231094 user, 0.000000 system)
100.00% CPU
737,633,062 processor cycles
0 bytes consed
With arrays of 32bits values, the execution time is smaller.
If you call disassemble
for tests on both kinds of vectors (just remove the call to time
that adds a lot of noise, and maybe lower the debug levels), eventually the difference boils down to which type of register is used in the loop:
The simple vector uses 64 bits (RCX) because the vector must be able to store objects of type T:
; C0: L2: B900000000 MOV ECX, 0
; C5: 48894A49 MOV [RDX+73], RCX
; C9: 4883C002 ADD RAX, 2
; CD: L3: 483D00943577 CMP RAX, 2000000000
; D3: 7CEB JL L2
The test with 32 bits array uses the ECX (32 bits) register.
; 960: L2: 31C9 XOR ECX, ECX
; 962: 894A25 MOV [RDX+37], ECX
; 965: 4883C002 ADD RAX, 2
; 969: L3: 483D00943577 CMP RAX, 2000000000
; 96F: 7CEF JL L2
Also, the offsets are different, but consistent with the types:
- 37 = 4 (bytes) * 9 (index) + 1 (storage for array size?)
- 73 = 8 (bytes) * 9 (index) + 1 (storage for array size?)