I'm working through a problem in a book called 'Computer Systems'. Here is the problem I'm struggling with.
Question: The following code computes the 128-bit product of two 64-bit signed values x and y and stores the result in memory:
#include <stdint.h>

typedef __int128 int128_t;

void store_prod(int128_t *dest, int64_t x, int64_t y) {
    *dest = x * (int128_t) y;
}
GCC generates the following assembly code implementing the computation:
store_prod:
    movq    %rdx, %rax
    cqto
    movq    %rsi, %rcx
    sarq    $63, %rcx
    imulq   %rax, %rcx
    imulq   %rsi, %rdx
    addq    %rdx, %rcx
    mulq    %rsi
    addq    %rcx, %rdx
    movq    %rax, (%rdi)
    movq    %rdx, 8(%rdi)
    ret
This code uses three multiplications for the multiprecision arithmetic required to implement 128-bit arithmetic on a 64-bit machine. Describe the algorithm used to compute the product, and annotate the assembly code to show how it realizes your algorithm.
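One way to read the hint about three multiplications is to split each sign-extended operand into two 64-bit halves. The notation here is an assumption for illustration and is not from the book: $x_l, y_l$ are the low 64 bits read as unsigned values, and $x_h, y_h$ are the sign-extension words (either $0$ or $-1$).

```latex
\begin{aligned}
x &= 2^{64} x_h + x_l, \qquad y = 2^{64} y_h + y_l \\
x \cdot y &= 2^{128} x_h y_h + 2^{64}\,(x_h y_l + x_l y_h) + x_l y_l \\
          &\equiv 2^{64}\,(x_h y_l + x_l y_h) + x_l y_l \pmod{2^{128}}
\end{aligned}
```

The $2^{128}$ term falls outside a 128-bit result, which would leave exactly three products: $x_h y_l$, $x_l y_h$, and $x_l y_l$. Only the low 64 bits of the two cross products matter, since they are scaled by $2^{64}$.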
I tried to annotate each instruction, but I'm totally lost from the 4th instruction onward. I understand what each instruction does on its own, but I lose track when I try to combine them into one algorithm.
store_prod:
    movq    %rdx, %rax      // copy y to rax
    cqto                    // sign-extend rax into rdx:rax (rdx = sign bits of y)
    movq    %rsi, %rcx      // copy x to rcx
    sarq    $63, %rcx       // arithmetic right shift by 63 bits (why..?)
    imulq   %rax, %rcx      // multiply rcx by rax (why..?)
    imulq   %rsi, %rdx      // multiply rdx by x (why..?)
    addq    %rdx, %rcx      // add rdx to rcx (why..?)
    mulq    %rsi            // unsigned multiply of rax by x [rdx:rax = 128-bit result]
    addq    %rcx, %rdx      // add rcx to rdx, the upper half of the product (why..?)
    movq    %rax, (%rdi)    // store rax at dest
    movq    %rdx, 8(%rdi)   // store rdx at dest+8
    ret
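A minimal C sketch of one possible reading of the instructions above, assuming GCC on x86-64 (for `__int128`). It treats each operand as a low word plus a sign word and combines three 64-bit products. The names `store_prod_by_hand`, `matches_builtin`, `x_l`, `x_h`, `y_l`, `y_h`, and `cross` are hypothetical and do not come from the book.

```c
#include <stdint.h>

typedef __int128 int128_t;
typedef unsigned __int128 uint128_t;

/* Hypothetical helper: mirrors the assembly step by step. */
static void store_prod_by_hand(int128_t *dest, int64_t x, int64_t y) {
    uint64_t x_l = (uint64_t)x;              /* low word of x (rsi)            */
    uint64_t y_l = (uint64_t)y;              /* low word of y (rax)            */
    uint64_t x_h = x < 0 ? UINT64_MAX : 0;   /* what sarq $63 leaves in rcx    */
    uint64_t y_h = y < 0 ? UINT64_MAX : 0;   /* what cqto leaves in rdx        */

    /* Two imulq and one addq: cross terms, kept modulo 2^64. */
    uint64_t cross = x_h * y_l + x_l * y_h;

    /* mulq: full unsigned 64x64 -> 128-bit product of the low words. */
    uint128_t low = (uint128_t)x_l * y_l;
    uint64_t lo = (uint64_t)low;                  /* rax */
    uint64_t hi = (uint64_t)(low >> 64) + cross;  /* rdx after the final addq */

    /* Reassemble the halves; GCC wraps this unsigned-to-signed conversion. */
    *dest = (int128_t)(((uint128_t)hi << 64) | lo);
}

/* Hypothetical check: returns 1 if the by-hand result equals the compiler's. */
static int matches_builtin(int64_t x, int64_t y) {
    int128_t by_hand;
    store_prod_by_hand(&by_hand, x, y);
    return by_hand == x * (int128_t)y;
}
```

Calling `matches_builtin` with a few mixed-sign pairs is a quick way to test whether this reading agrees with the original `store_prod`.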
Sorry for my broken English, I hope you understood what I'm saying.