To test our MIPS implementation on an FPGA, I wrote a few assembly test programs. I'm assembling them with mips-linux-gnu-as.

The following code tests the divu implementation. The branch beq $t2, $t1, label1 is expected to be taken, so 3 should be loaded into $t1.

.set noreorder
.text
__start:
    li  $t1, 10         
    li  $t2, 2

    divu $t1, $t2
    mflo $t2
    li $t1, 5
    beq $t2, $t1, label1
    j label2
label1:
    li  $t1, 3
label2:
    nop

My problem is that gnu-as produces unexpected output for the above code, which leads to different behaviour at execution time: 3 is never loaded into $t1.

The disassembly of section .text:

00000000 <__start>:
0:   2409000a        addiu   t1,zero,10
4:   240a0002        addiu   t2,zero,2
8:   15400002        bnez    t2,14 <__start+0x14>
c:   012a001b        divu    zero,t1,t2
10:   0007000d        break   0x7
14:   00004812        mflo    t1
18:   00005012        mflo    t2
1c:   24090005        addiu   t1,zero,5
20:   11490001        beq     t2,t1,28 <label1>
24:   0800000b        j       2c <label2>

00000028 <label1>:
28:   24090003        addiu   t1,zero,3

0000002c <label2>:
2c:   00000000        sll     zero,zero,0x0

Any help is greatly appreciated.

1 Answer

Branches and jumps may not occur in delay slots. You have j in beq's delay slot. Also, I don't think you want addiu t1, zero, 3 in j's delay slot either. You should probably use .set reorder instead of .set noreorder. Alternatively, you need to learn how to deal with branch delay slots.
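
As a minimal sketch of the second option, the code from the question could keep .set noreorder and fill each delay slot by hand. The nop placement below is an illustration under that assumption, not the only valid way to schedule the slots:

```asm
.set noreorder
.text
__start:
    li  $t1, 10
    li  $t2, 2

    divu $t1, $t2
    mflo $t2
    li  $t1, 5
    beq $t2, $t1, label1
    nop                 # delay slot of beq: runs whether or not the branch is taken
    j   label2
    nop                 # delay slot of j: keeps "li $t1, 3" out of the slot
label1:
    li  $t1, 3
label2:
    nop
```

With the slots filled explicitly, j no longer sits in beq's delay slot, and li $t1, 3 is only reached through label1. Under .set reorder the assembler takes care of the delay slots itself, so the original source would not need the extra nops.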

Alexey Frunze
  • My problem is that I don't understand the code generated by gcc for the divu instruction. From my understanding of RISC pipelines, the divu will always be executed. But why does the generated code load the value of lo into t1 (mflo t1)? I mean, t1 should hold 10 as its value... – Ali Abdallah Sep 26 '16 at 15:34
  • @AliAbdallah I don't know either what the extra mflo is doing there. It is possible that the compiler is generating it to work around a CPU bug (e.g. to insert a delay to allow more time for division). In your context this extra instruction is useless and harmless, at least on modern MIPS CPUs. But this has nothing to do with the real problem, which is how you're handling branch delay slots. – Alexey Frunze Sep 27 '16 at 05:27