
I was looking for an answer to this but couldn't find a clear one. Does MIPS32 split a number larger than 32 bits between multiple registers, or can it simply not cope with it? I tried testing this in MARS with the number 4294967296, which is 0x100000000, but the register held only 0x00000000, so the '1' bit was dropped. Is there a way to cope with such numbers?

Peter Cordes
PyFox
  • If you use only one 32-bit register, then yes, everything above bit 31 is thrown away and the value is truncated to 32 bits. To store more bits you must use more storage (multiple registers, multiple words in memory, etc.). Nothing in the CPU grows dynamically just because 32 bits are suddenly not enough: all sizes are fixed in asm code, and the code must be written to handle the extra bits, otherwise it will not. – Ped7g Nov 10 '18 at 16:55
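
To make the truncation concrete, here is a minimal C sketch of what happens to 4294967296 (0x100000000) when only 32 bits are kept, and how the value splits across two 32-bit words. The helper names `low_word` and `high_word` are hypothetical, chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical helpers: the two 32-bit words that a pair of
   MIPS32 registers would hold for one 64-bit value. */

/* Keeps bits 0..31; everything above is discarded. */
uint32_t low_word(uint64_t v) {
    return (uint32_t)v;
}

/* Keeps bits 32..63, shifted down into a 32-bit word. */
uint32_t high_word(uint64_t v) {
    return (uint32_t)(v >> 32);
}
```

For 0x100000000 the low word is 0, which matches what MARS showed in the single register, and the missing '1' bit ends up as the value 1 in the high word.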

1 Answer


Use 2 registers, an extra one for the high half. MIPS doesn't have flags, so there isn't a 2-instruction add/add-with-carry way to add int64_t like there is on many other ISAs, but you can look at compiler output for a C function that adds two 64-bit integers easily enough.

#include <stdint.h>

int64_t add64(int64_t a, int64_t b) { 
    return a+b;
}

compiled for MIPS on the Godbolt compiler explorer with gcc5.4 -O3 -fno-delayed-branch (see footnote 1):

add64:
    addu    $3,$5,$7
    sltu    $5,$3,$5     # check for carry-out with  sum_lo < a_lo  (unsigned)
    addu    $2,$4,$6     # add high halves
    addu    $2,$5,$2     # add the carry-out from the low half into the high half
    j       $31          # return
    nop                  # branch-delay slot

Footnote 1: -fno-delayed-branch makes GCC fill the branch-delay slot with a NOP instead of a real instruction, so the same code also works on a simplified MIPS without delay slots, like MARS simulates by default.
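
The addu / sltu / addu / addu sequence above can also be written out in C using only 32-bit operations. This is a sketch under stated assumptions, not what GCC does internally: the `u64_pair` type and `add64_pair` function are hypothetical names for a 64-bit value held as two 32-bit words.

```c
#include <stdint.h>

/* Hypothetical type: a 64-bit integer held as two 32-bit words,
   like a register pair on MIPS32. */
typedef struct {
    uint32_t hi;
    uint32_t lo;
} u64_pair;

u64_pair add64_pair(u64_pair a, u64_pair b) {
    u64_pair r;
    r.lo = a.lo + b.lo;              /* addu: low halves, wraps modulo 2^32 */
    uint32_t carry = (r.lo < a.lo);  /* sltu: 1 if the low add wrapped around */
    r.hi = a.hi + b.hi + carry;      /* two addu: high halves plus the carry */
    return r;
}
```

The unsigned comparison works because a wrapped sum is always smaller than either operand, while a sum that did not wrap is at least as large as both.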


In memory, MIPS in big-endian mode (the more common choice for MIPS) stores the entire 64-bit integer in big-endian order, thus the "high half" (most significant 32 bits) is at the lower address, so the highest byte of that word is at the lowest address, and all 8 bytes are in descending order of place value.

void add64_store(int64_t a, int64_t b, int64_t *res) { 
    *res = a+b;
}

  ## gcc5.4 -O3 for MIPS - big-endian, not MIPS (el)
    addu    $7,$5,$7        # low half
    lw      $2,16($sp)
    sltu    $5,$7,$5        # carry-out
    addu    $4,$4,$6        
    addu    $5,$5,$4        # high half
    sw      $5,0($2)        # store the high half to res[0..3] (byte offsets)
    sw      $7,4($2)        # store the low  half to res[4..7]
    j       $31
    nop                     # delay slot

As you can see from the register numbers used, the calling convention passes the high half in the lower-numbered register (the earlier arg slot), unlike on little-endian architectures where the high half goes in the later arg-passing slot. This makes things work as desired if you run out of registers and an int64_t is passed on the stack.
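
The big-endian memory layout described above (high half at the lower address, bytes in descending order of place value) can be sketched in portable C. `store64_be` is a hypothetical helper, written to behave the same regardless of the host's own endianness.

```c
#include <stdint.h>

/* Hypothetical helper: write a 64-bit value to memory in big-endian
   order, independent of the host's endianness. */
void store64_be(uint8_t out[8], uint64_t v) {
    for (int i = 0; i < 8; i++) {
        /* i == 0 takes bits 56..63, i == 7 takes bits 0..7 */
        out[i] = (uint8_t)(v >> (56 - 8 * i));
    }
}
```

Storing 0x100000000 this way puts the high word 0x00000001 in bytes 0..3 and the all-zero low word in bytes 4..7, matching the two sw instructions at offsets 0 and 4.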


On an architecture with flags and an add-with-carry instruction (like ARM32, for example), a flag-setting add instruction creates a 33-bit result in C:R0 (top bit in the carry flag, low 32 bits in a register), and add-with-carry consumes that flag when adding the high halves.

add64:
    adds    r0, r2, r0    @ ADD and set flags
    adc     r1, r3, r1    @ r1 = r1 + r3 + carry
    bx      lr
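
As a rough model of the adds / adc pair, the carry flag can be treated as an extra one-bit output of the low add. The names `add_result`, `adds32`, and `adc32` below are hypothetical, and this sketch tracks only the carry flag, not the other ARM condition flags.

```c
#include <stdint.h>

/* Hypothetical model of a flag-setting add: a 32-bit result plus
   a carry bit standing in for the C flag. */
typedef struct {
    uint32_t value;
    unsigned carry;   /* 0 or 1 */
} add_result;

/* Models `adds`: 32-bit add that records the carry-out. */
add_result adds32(uint32_t a, uint32_t b) {
    uint64_t wide = (uint64_t)a + b;   /* the 33-bit sum fits in 64 bits */
    add_result r = { (uint32_t)wide, (unsigned)(wide >> 32) };
    return r;
}

/* Models `adc`: add plus the incoming carry (carry-out ignored here). */
uint32_t adc32(uint32_t a, uint32_t b, unsigned carry_in) {
    return a + b + carry_in;
}
```

A 64-bit add is then `adds32` on the low halves followed by `adc32` on the high halves with the recorded carry, which is the work MIPS has to do with sltu and an extra addu.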

You tagged this MIPS32, so you don't have the 64-bit extensions to the ISA available. Those were introduced in MIPS III in 1991, but for embedded use MIPS32 is a modern MIPS with extensions other than 64-bit registers.

The same reasoning applies to 128-bit integers on 64-bit MIPS, using daddu in place of addu.
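
As a sketch of that 128-bit case, the same carry check works one size up, with 64-bit halves. The `u128_pair` type and `add128_pair` function are hypothetical names; the comments show which 64-bit MIPS instruction each step would roughly correspond to.

```c
#include <stdint.h>

/* Hypothetical type: a 128-bit integer held as two 64-bit words. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} u128_pair;

u128_pair add128_pair(u128_pair a, u128_pair b) {
    u128_pair r;
    r.lo = a.lo + b.lo;              /* daddu: low halves, wraps modulo 2^64 */
    uint64_t carry = (r.lo < a.lo);  /* sltu: 1 if the low add wrapped around */
    r.hi = a.hi + b.hi + carry;      /* two daddu: high halves plus the carry */
    return r;
}
```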

Peter Cordes
  • Great answer. Do you know why there is an instruction (in both samples) after the j $31? I thought this means jump (back to the caller), so how can there be important code afterwards? – lalala Sep 12 '20 at 08:23
  • GCC is targeting real MIPS, which has a [branch-delay slot](https://en.wikipedia.org/wiki/Delay_slot#Branch_delay_slots). The first instruction after a branch runs unconditionally even if the branch is taken. My comments on the first code block point this out. You could compile with `-fno-delayed-branch` to have GCC just fill it with a NOP so you could run the code in a simulator for a simplified MIPS without a delay slot (like MARS's default setting.) – Peter Cordes Sep 12 '20 at 09:51
  • @lalala: updated my answer to use `-fno-delayed-branch` since most people these days learning about MIPS are apparently not learning about real MIPS, instead simplified MIPS without a delay slot. – Peter Cordes Sep 12 '20 at 10:14