Finding the factorial of a large number in assembly does not work when n >= 15

Question

I'm using MASM and DOSBox, and I'm essentially translating the following C version into assembly:

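#include <stdio.h>

int main(void)
{
    int a[20001];              /* decimal digits, least significant first */
    int temp, digit, n, i, j;
    if (scanf("%d", &n) != 1)
        return 1;
    a[0] = 1;
    digit = 1;                 /* number of digits currently in use */
    for (i = 2; i <= n; i++)
    {
        int num = 0;           /* carry into the next digit */
        for (j = 0; j < digit; j++)
        {
            temp = a[j] * i + num;
            a[j] = temp % 10;
            num = temp / 10;
        }
        while (num)            /* append the leftover carry digits */
        {
            a[digit] = num % 10;
            num = num / 10;
            digit++;
        }
    }
    for (i = digit - 1; i >= 0; i--)
        printf("%d", a[i]);
    printf("\n");
    return 0;
}
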
I don't think any register can overflow with this approach. The program works for 1! to 14!, but hangs when calculating 15!.

Here is the assembly code:

  .MODEL SMALL,STDCALL
  .386


  .DATA
  digit db 0
  n     db ?
  i     db 0
  j     db 0
  num   db 0
  array db 10000 dup(0)



  .CODE
  main proc
    mov ax,@data
    mov ds,ax

;-----------------load input to n
    mov  bx, 0
    Newchar:
      mov  ah, 1
      int  21h 
      sub  al, 30h 
      jl  endinput
      cmp  al, 9    
      jg  endinput
      cbw  
      xchg   ax, bx
      mul   cx
      xchg  ax, bx    
      add   bx, ax 
      jmp   newchar 
    endinput:
    mov n,bl
;-----------------


    mov al,1
    mov array[0],al

    mov al,1
    mov digit,al

    mov ch,0
    mov cl,n
    sub cl,2
    firstloop:
      ;i = n - cx = al
      mov ah,0
      mov al,n
      sub ax,cx
      mov i,al
      ;num = 0 = dl
      mov dh,0
      mov dl,0
      mov num,dl

      ; j = 0 = bx
      mov bx,0
      secondloop:
        ; temp=a[j]*i+num;
        mov al,i
        mov ah,array[bx]
        mul ah;->ax
        add ax,dx
        mov dh,10 ; borrow dh as the divisor, then reset it to 0
        div dh
        ; a[j]=temp%10;
        mov array[bx],ah
        ; num=temp/10;
        mov dl,al
        mov num,dl
        mov dh,0

        add bx,1
        mov j,bl
        cmp bl,digit
        jl secondloop

      whileloop:
        mov dl,num
        mov dh,0
        cmp dx,0
        je tonext
        mov ax,dx
        mov dh,10 ; borrow dh as the divisor, then reset it to 0
        div dh
        mov dh,0
        mov j,bl
        mov bh,0;bx was j, turn it to digit
        mov bl,digit
        mov array[bx],ah
        ; num=num/10;
        mov dl,al
        ; digit++;
        add bl,1
        mov digit,bl
        mov bl,j;turn bx back to j

        cmp dx,0
        jne whileloop

      tonext:
        mov al,i
        add al,1
        mov i,al
        sub cx,1
        cmp al,n
        jle firstloop




      ; reversely print the array
      mov bh,0
      mov bl,digit
      sub bl,1

    printloop:
      MOV  DL,array[bx]
      add dl,30h
      mov ah,2
      int 21h
      sub bx,1
      cmp bx,0
      jnl printloop

      jmp exit;

    exit:
      mov ah,4ch
      int 21h
    main endp
    end main

I think the algorithm in the C version avoids the usual problem of register overflow, so I don't know what to fix in my code. I have two guesses:

1. Something is still overflowing, but I can't find it.
2. DOSBox imposes some constraint I'm not aware of.

The code is quite long, so I would greatly appreciate any advice.

Use a debugger to single-step your code, or interrupt it after it's been running for a while: which loop does the code get stuck in? Or does it crash by raising an exception? Look at registers / memory while you single-step. — Peter Cordes, Apr 30 '20 at 08:04
In your `whileloop` you reload `dl` from `num` at the beginning of each iteration, but you don't seem to update `num` with the new value after the division. — Michael, Apr 30 '20 at 08:20
BTW, use registers for pointers / indices; that's what they're for. Keeping local inner loop variables like `j` in static storage (like `static uint8_t j;`) is silly and bloats the code, making it harder to debug and follow. You haven't used `SI` or `DI` registers at all. Also note that you can simply leave `n` in memory and `cmp n, reg` or `cmp reg, n`. — Peter Cordes, Apr 30 '20 at 11:54
Also, your C uses signed `int`, your asm is written with `uint8_t` `mul` and `div` with 8-bit operand-size. That's probably fine for base-10 extended precision, although that in itself is very slow compared to using larger chunks. (base 10^9 in a 32-bit register is viable, or base 10^4 in a 16-bit register could work. You print those with `%04d`) — Peter Cordes, Apr 30 '20 at 11:57
