Maybe I'm missing something especially clever, but I cannot figure out why you are trying to use a left-shift operation to perform exponentiation. You would do a left-shift to achieve a binary multiplication (multiplication by a power of 2). For example, if you wanted to multiply n by 2, you would shift n left by 1. Shifting n left by 5 would be equivalent to n × 32.
For exponentiation, you want n × n × n × …. You can't get that with a left-shift operation. You need a good old multiplication.
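To make the difference concrete, here is a minimal C sketch. The function names `times_pow2` and `ipow` are made up for illustration and do not come from the question; the shift helper assumes a count below 64, since C leaves larger counts undefined.

```c
#include <stdint.h>

/* Left shift: multiplies n by 2^k. Assumes k < 64; overflow wraps. */
uint64_t times_pow2(uint64_t n, unsigned k)
{
    return n << k;
}

/* Exponentiation: the product of e copies of n, built from real multiplies. */
uint64_t ipow(uint64_t n, unsigned e)
{
    uint64_t result = 1;
    while (e-- > 0)
        result *= n;
    return result;
}
```

With n = 3, `times_pow2(3, 5)` gives 3 × 32, whereas `ipow(3, 5)` gives 3 × 3 × 3 × 3 × 3, which no shift count can produce.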
x * y^5
can be rewritten as:
x * (y * y * y * y * y)
or:
temp = (y * y)
x * temp * temp * y
The third form is, in fact, what an optimizing C compiler would typically transform the second formula into, since reusing the y * y product is a basic optimization that elides a multiplication (four instead of five).
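As a rough sketch, the two rewritten forms look like this in C. The function names are hypothetical, and unsigned 64-bit operands are assumed so that overflow wraps identically in both versions.

```c
#include <stdint.h>

/* Straightforward form: five multiplications. */
uint64_t x_times_y5_naive(uint64_t x, uint64_t y)
{
    return x * (y * y * y * y * y);
}

/* Shared-square form: y * y is computed once, four multiplications total. */
uint64_t x_times_y5_shared(uint64_t x, uint64_t y)
{
    uint64_t temp = y * y;
    return x * temp * temp * y;
}
```

Because unsigned arithmetic in C is modular, both functions return the same value for every input, including inputs that overflow 64 bits.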
Assuming that you have x in rsi, and y in rcx, in assembly, this would be:
movq %rcx, %rax ; make a copy of 'y'
imulq %rcx, %rax ; y * y
imulq %rcx, %rsi ; y * x
imulq %rax, %rsi ; y * x * (y * y)
imulq %rsi, %rax ; y * x * (y * y) * (y * y)
ret ; result is in RAX
Simple enough, and imulq is going to be nearly as efficient as shlq on modern, 64-bit processors. So this is not slow code, and more importantly, it is correct.
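If you want to check the register bookkeeping in the assembly above, one option is to mirror it step by step in C. This is only a sketch: the function name is invented, the variables are named after the registers, and unsigned arithmetic is used on the assumption that only the low 64 bits of each product matter (which is what two-operand imulq keeps).

```c
#include <stdint.h>

/* x arrives in rsi and y in rcx, matching the assumption in the text. */
uint64_t x_times_y5_trace(uint64_t rsi, uint64_t rcx)
{
    uint64_t rax = rcx;   /* movq  %rcx, %rax : rax = y       */
    rax *= rcx;           /* imulq %rcx, %rax : rax = y^2     */
    rsi *= rcx;           /* imulq %rcx, %rsi : rsi = x * y   */
    rsi *= rax;           /* imulq %rax, %rsi : rsi = x * y^3 */
    rax *= rsi;           /* imulq %rsi, %rax : rax = x * y^5 */
    return rax;           /* result is in RAX                 */
}
```

Calling `x_times_y5_trace(3, 2)` walks through the same five steps and ends with 3 × 2^5 in `rax`.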
As for how you shift by a variable count, Jester has already answered that in the comments, but allow me to flesh it out a bit more. There are four basic encodings for the shift instruction on x86 (ignoring operand size, and just looking at operand type):
- Shift where the destination is a register and the source is an immediate/constant.
- Shift where the destination is memory and the source is an immediate/constant.
- Shift where the destination is a register and the source is the cl register.
- Shift where the destination is memory and the source is the cl register.
You can see this by looking at the documentation for one of the shift instructions. (For historical reasons, there is also a special encoding for shifting by 1. Not important here.) Note that the "source" and "destination" are a bit of a formality here. The "destination" is the value that is being shifted, as well as the place where the result is going to end up. The "source" is not actually a source; it is just the shift count.
So, most of what you see is shifting an enregistered value by an immediate, which is option #1, but you can also shift an enregistered value by a variable; the catch is that the variable must be in the cl register. cl is the lowest 8 bits (byte) of the rcx register.
So if you wanted to do, say:
x * 2^y
which is equivalent to:
x << y
and x was in rsi and y was in rcx, you could write that as:
shlq %cl, %rsi
movq %rsi, %rax
ret ; result is in RAX
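For comparison, the same operation can be sketched in C. The name `shift_left` is hypothetical, and the caller is assumed to keep y in the range 0 to 63, because C leaves a shift by 64 or more undefined for a 64-bit type.

```c
#include <stdint.h>

/* Computes x * 2^y with a single variable-count shift. Requires y <= 63. */
uint64_t shift_left(uint64_t x, unsigned y)
{
    return x << y;
}
```

On x86-64, a compiler will typically turn this into a shift by a register count much like the hand-written version, though the exact instruction chosen depends on the compiler and target options.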
Of course, since the shift count must be in cl, and cl is an 8-bit register, the shift count can never be greater than 255. That is not actually a problem, though, because it is meaningless to shift a 64-bit quantity by any more than 63.
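One detail worth knowing here: for a 64-bit operand, the processor only looks at the low 6 bits of the count, so a count of 64 behaves like 0 rather than clearing the register. The following C sketch models that behavior; `shlq_emulated` is an invented name, and the function is an approximation of the instruction's result only (it does not model flags).

```c
#include <stdint.h>

/* Models the result of "shlq %cl, reg" for a 64-bit operand. */
uint64_t shlq_emulated(uint64_t value, uint64_t rcx)
{
    uint8_t cl = (uint8_t)rcx;   /* cl is the low byte of rcx            */
    unsigned count = cl & 63;    /* hardware keeps only the low 6 bits   */
    return value << count;       /* count is now 0..63, well defined in C */
}
```

So with rcx holding 65, the value is shifted left by 1, and with rcx holding 64 or 256 it is not shifted at all.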