MIPS Branch Addressing Algorithm and Opcode isolation from instruction binary?

Question

I want to check that my understanding of these two concepts is correct. I have been trying to finish a project, and while everything works as I expect, it keeps narrowly failing the test cases and introducing a seemingly random value...

Basically, the objective of the project is to write out a branch instruction to console in this form:

BranchName $S, [$t, if applicable] 0xAbsoluteAddressOfBranchTargetInstruction

Edit: Clarification: I'm writing this in MIPS. The idea is I get a memory address in $a0 given to the program by my instructor's code (I write the function). The address is for the word containing a MIPS instruction. I'm to do the following:

1. Get the instruction.
2. Isolate the instruction opcode and output its name to register (e.g. opcode 5: output BNE); do nothing if it isn't a branch instruction.
3. Isolate $s and $t, and output them as applicable (e.g. no $t for bgez).
4. Use the offset in the branch instruction to calculate its absolute address (the address of the target instruction following the branch) and output it in hex. For the purposes of this calculation, the address of the branch instruction ($a0) is assumed to be $pc.

For example:

BEQ $6, $9, 0x00100008

Firstly, is my understanding of branch calculation correct?

1. PC -> PC + 4
2. Take the lower 16 bits of the instruction.
3. Shift these lower bits left by 2.
4. Add PC+4 and the left-shifted lower 16 bits (only the lower 16, though).

Secondly, could somebody tell me which bits I need to isolate to know what kind of branch I'm dealing with? I think I have them (first 6 for BEQ/BNE, first 16 with $s masked out for others) but I wanted to double check.

Oh, and finally... should I expect SPIM to behave differently on an Intel x86 Windows system and an Intel x86 Linux system? I'm getting a stupid glitch that I cannot reconcile with my hand-worked address calculations, but it only shows up when I run the test scripts my prof gave us on Linux (.sh); running directly in spim on either OS seems to work... provided my understanding of how to do the hand calculations (as listed above) is correct.

I don't think there will be a difference between Windows and Linux since SPIM uses the underlying endianness of the hardware. – Zack, Oct 04 '16 at 00:37
Try using MARS instead of SPIM and see if you get the same answer. (http://courses.missouristate.edu/KenVollmar/mars/) — Zack, Oct 04 '16 at 00:37
@Zack Do you think it has something to do with my purely algorithmic understanding of this? I read somewhere that Spim doesn't bother with PC+4, which might be leading to the mistake? – smitty_werbermanjensen, Oct 04 '16 at 04:16
`spim` obeys `PC+4` [as does `mars`]. But, mips has [can have] "branch delay slots". Normally, emulation for these is defaulted off for `spim/mars` but you may have enabled it somehow (e.g. registry entry, etc.). See my answer: http://stackoverflow.com/questions/36994841/jump-and-link-register-mips/36995613#36995613 If enabled but not compensated by you, you'd experience an "off-by-4" discrepancy — Craig Estey, Oct 04 '16 at 05:40
To sign extend in C, just do `word_offset = *(short *) ...`. To do it in mips, assuming `$t5` has the whole [branch] instruction value, do: `sll $t4,$t5,16`--this puts the [16 bit] sign bit into bit 31, followed by `sra $t4,$t4,16` [shift right _arithmetic_ which does the sign extension when shifting]. Now, `$t4` has the sign extended _word_ offset. To get the _byte_ offset, now do `sll $t4,$t4,2` — Craig Estey, Oct 04 '16 at 05:48
@CraigEstey That makes sense, with that said... I add the whole word at that point? — smitty_werbermanjensen, Oct 04 '16 at 16:02

score 1 · Answer 1 · answered Oct 04 '16 at 00:36

The 16-bit immediate value is sign-extended to 32 bits, then shifted left by 2 and added to PC+4. I don't know if that would affect your program, but that's the only potential "mistake" I noticed.
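
For anyone who wants to check the arithmetic away from the simulator, here is a minimal sketch in C of that calculation. The function name `branch_target` is made up for illustration, and it assumes (as the assignment does) that `pc` is the address of the branch instruction itself and that no branch delay slot adjustment is wanted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not from the assignment: absolute target of a MIPS
   I-type branch. pc is the address of the branch instruction, instr is the
   raw 32-bit instruction word. */
static uint32_t branch_target(uint32_t pc, uint32_t instr)
{
    assert((pc & 3u) == 0);  /* instructions are word aligned */

    /* Reinterpret the low 16 bits as signed: this is the sign extension.
       (Mainstream compilers wrap here, same idea as the *(short *) idiom.) */
    int32_t word_offset = (int16_t)(instr & 0xFFFFu);

    /* Words to bytes. Multiplying by 4 keeps negative offsets well defined. */
    int32_t byte_offset = word_offset * 4;

    /* Add the WHOLE sign-extended offset to PC+4, not just its low 16 bits. */
    return pc + 4u + (uint32_t)byte_offset;
}
```

With the question's example, `beq $6, $9` with an offset of 1 encodes as `0x10C90001`; at address `0x00100000` this gives `0x00100008`. A backward branch such as `0x1611FFFD` (offset -3) at `0x00400028` gives `0x00400020`, which only works out if the offset is sign-extended first.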

Zack


+1 for the probable issue, but, just as much for suggesting `mars` [written in java] above. Spim has two variants: `spim`, the command line version. And, `QtSpim` is the gui version. They're supposed to be 100% compatible, but I've sometimes noticed issues. It could be the default as to whether "branch delay slots" are enabled. – Craig Estey Oct 04 '16 at 03:44
@Zack So, with that said... how would I sign extend it manually prior to the shift? Also, do I add the whole word to PC then? – smitty_werbermanjensen Oct 04 '16 at 04:15
@smitty_werbermanjensen, I'm not sure I fully understand what you are trying to do. Are you writing a MIPS simulator in C? Did Craig's comment above help? – Zack Oct 04 '16 at 10:37
@zack: the objective of the assignment is to use mips assembly language to take a branch or other instruction word from a given memory address in $a0, deconstruct it for its parts, and then output. I think I've figured out how to isolate the opcode and I have the register output working, but when it gets to the absolute-address of jump (aka: after all the branch work, where does it go?) being printed out in hex, I always get an error. Say the absolute is 0x001008, I keep getting 0x0020008. – smitty_werbermanjensen Oct 04 '16 at 15:49
I see. That's an interesting assignment. Since the branch target is encoded in the lower 16 bits, I wonder if you can use `lh` (`load halfword`) to isolate the target. – Zack Oct 05 '16 at 00:09

Craig Estey · Accepted Answer · 2016-10-04T20:23:56.723

This is prefaced by my various comments.

Here is a sample program that does the address calculation correctly. It does not do the branch instruction type decode, so you'll have to combine parts of this and your version together.

Note that it uses the mars syscall 34 to print values in hex. This isn't available under spim, so you may need to output in decimal using syscall 1 or write your own hex value output function [if you haven't already].
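
If you do end up writing your own hex output, the usual approach is to peel off one 4-bit nibble at a time, most significant first, and look each one up in a digit table. Below is a rough sketch of that idea in C rather than MIPS; `to_hex` is a hypothetical name, and the `0x` prefix with 8 lowercase digits is an assumption about the format you want.

```c
#include <stdint.h>

/* Hypothetical helper: writes "0x", 8 hex digits and a terminating NUL
   into out, which must have room for 11 bytes. */
static void to_hex(uint32_t value, char out[11])
{
    static const char digits[] = "0123456789abcdef";

    out[0] = '0';
    out[1] = 'x';
    for (int i = 0; i < 8; i++) {
        /* Shift by 28, 24, ..., 0 so the top nibble comes out first. */
        uint32_t nibble = (value >> (28 - 4 * i)) & 0xFu;
        out[2 + i] = digits[nibble];
    }
    out[10] = '\0';
}
```

In MIPS the same loop becomes a variable shift (`srlv`), an `andi` with `0xF`, and a byte load from a 16-character table, followed by the print-string syscall.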

    .data
msg_best:   .asciiz     "correct target address: "
msg_tgt:    .asciiz     "current target address: "
msg_nl:     .asciiz     "\n"

    .text
    .globl  main
main:
    la      $s0,inst                # pointer to branch instruction
    la      $s1,einst               # get end of instructions
    subu    $s1,$s1,$s0             # get number of bytes
    srl     $s1,$s1,2               # get number of instruction words
    la      $s2,loop                # the correct target address

    la      $a0,msg_best
    move    $a1,$s2
    jal     printaddr

loop:
    move    $a0,$s0
    jal     showme                  # decode and print instruction
    addiu   $s0,$s0,4
    sub     $s1,$s1,1
    bnez    $s1,loop                # more to do? yes, loop

    li      $v0,10
    syscall

    # branch instructions to decode
inst:
    bne     $s0,$s1,loop
    beq     $s0,$s1,loop
    beqz    $s1,loop
    bnez    $s1,loop
    bgtz    $s1,loop
    bgez    $s1,loop
    bltz    $s1,loop
    blez    $s1,loop
einst:

# showme -- decode and print data about instruction
#
# NOTE: this does _not_ decode the instruction type
#
# arguments:
#   a0 -- instruction address
#
# registers:
#   t5 -- raw instruction word
#   t4 -- branch offset
#   t3 -- absolute address of branch target
showme:
    subu    $sp,$sp,4
    sw      $ra,0($sp)

    lw      $t5,0($a0)              # get inst word
    addiu   $t3,$a0,4               # get PC + 4

    sll     $t4,$t5,16              # shift offset left
    sra     $t4,$t4,16              # shift offset right (sign extend)
    sll     $t4,$t4,2               # get byte offset

    addu    $t3,$t3,$t4             # add in offset

    # NOTE: as a diagnostic, we could compare t3 against s2 -- it should
    # always match

    la      $a0,msg_tgt
    move    $a1,$t3
    jal     printaddr

    lw      $ra,0($sp)
    addu    $sp,$sp,4
    jr      $ra

# printaddr -- print address
#
# arguments:
#   a0 -- message
#   a1 -- address value
printaddr:
    li      $v0,4
    syscall

    # NOTE: only mars supports this syscall
    # to use spim, use a syscall number of 1, which outputs in decimal and
    # then hand convert
    # or write your own hex output function
    move    $a0,$a1
    li      $v0,34                  # output number in hex (mars _only_)
    syscall

    la      $a0,msg_nl
    li      $v0,4
    syscall

    jr      $ra
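
Since the program above leaves out the instruction type decode, here is a hedged sketch in C of which bits identify the branches used in the test list. The field positions and opcode values are the standard MIPS32 I-type ones; the helper names (`branch_name`, `field_rs`, `field_rt`) are made up for illustration, and only the six conditional branches shown are covered.

```c
#include <stddef.h>
#include <stdint.h>

/* Register fields of an I-type instruction. */
static uint32_t field_rs(uint32_t instr) { return (instr >> 21) & 0x1Fu; } /* bits 25..21 */
static uint32_t field_rt(uint32_t instr) { return (instr >> 16) & 0x1Fu; } /* bits 20..16 */

/* Hypothetical helper: mnemonic of a conditional branch, or NULL if the
   word is not one of the branches handled here. */
static const char *branch_name(uint32_t instr)
{
    uint32_t opcode = instr >> 26;       /* bits 31..26 */
    uint32_t rt = field_rt(instr);

    switch (opcode) {
    case 4: return "BEQ";
    case 5: return "BNE";
    case 6: return rt == 0 ? "BLEZ" : NULL;  /* rt must be 0 */
    case 7: return rt == 0 ? "BGTZ" : NULL;  /* rt must be 0 */
    case 1:                                  /* REGIMM: rt selects the branch */
        if (rt == 0) return "BLTZ";
        if (rt == 1) return "BGEZ";
        return NULL;
    default:
        return NULL;
    }
}
```

So BEQ/BNE/BLEZ/BGTZ are decided by the top 6 bits, while BLTZ/BGEZ share opcode 1 and are told apart by the `rt` field. Note that `beqz` and `bnez` are pseudo-instructions: they assemble to BEQ and BNE with `$zero` as the second register, so they will decode as those.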

I passed a modified script to get the integer values for the test cases (addr , instr) and compared those. As it stands, my whole issue was that key misunderstanding about sign extending the bits and adding the WHOLE word. I didn't read the rest of your code as it's an assignment, but you have my gratitude for helping me with the shift. I couldn't find that part in my slide notes, hence why I was doing it wrong. THANK YOU. — smitty_werbermanjensen, Oct 05 '16 at 01:55
