As you say, the bug that was stopping it from working was lack of fixing the stack after the recursive call. But that wasn't the only bug
_multiply_fact
is not really a function. It's just part of _factorial
's code, which runs when you don't take the early-out n <= 1
return path. So you should not be making a stack frame in it.
The pop ebp
at the top of _multiply_fact
is totally bogus. It only works because the top of the stack when that runs already has the caller's ebp
. If you called it directly as a function, you'd clobber the caller's ebp
with the return address.
First of all, you don't need to make a stack frame at all, so it's just a total waste of stack space and instructions. (Although it does help debuggers do backtraces, since without it they need info that compilers normally add, but hand-written asm functions don't have unless you add it manually.)
Your _factorial
also clobbers the caller's ebx
, which violates the ABI. I don't think your function will actually get the right value, since it depends on ebx
surviving across the call to _factorial
. After all the recursive calls return, ebx=1
, because each nested call clobbers ebx with its arg.
Since you're writing in asm, though, you can have your recursive self-calls make assumptions about which registers aren't clobbered, or even pass args in regs if you want. You still need to save n
across the recursive call somehow, though. One option is to take advantage of the fact that you know _factorial
doesn't clobber its arg on the stack (even though the ABI allows it to).
You still need the publicly accessible wrapper function to follow the ABI, though.
Obviously recursion is a big waste of time for factorial in the first place, compare to a loop, but here's a version that sucks as little as possible.
global _factorial
_factorial:
; one arg: int n
push ebp
mov ebp, esp ; make a stack frame
mov edx, [ebp + 8] ; load first arg into a scratch reg
dec edx
jg .multiply_fact ; if ((n-1) > 0). Note that dec doesn't set CF, so just use jnz instead of ja if you want to convert this to unsigned
;; base case
mov eax, 1 ;if (n <= 1) ret 1
pop ebp
ret
.multiply_fact: ; not a separate function, still part of _factorial
; in NASM, .labels are local labels that don't show up as symbols in the object file. MASM uses something different.
; at this point: edx = n-1
push edx
call _factorial ; eax = fac(n-1)
pop edx ; get n-1 back off the stack. Taking advantage of extra knowledge of our own internals, since the ABI allows functions to clobber their args on the stack
; add esp, 4 not needed, because we popped instead
inc edx ; undo the earlier dec before we pushed
imul eax, edx ; n * fac(n-1). 2-arg imul runs faster
pop ebp
ret