0

I am learning RISC-V, and my lecturer defined a new instruction called Load Upper Immediate (lui) like this:

lui rt, imm

loads the lower halfword of the immediate imm into the upper halfword of register rt. The lower bits of the register are set to 0.

with the following image:

[image: instruction encoding diagram with the fields 0xf, 0, rt, imm]

But there are a few things I can't understand:

  1. "loads the lower halfword of the immediate": what does this mean? I think that if we have 32 bits in imm then it loads the first 16, am I right?

  2. Is the image correct at all? Shouldn't the first half be all zeroes, as the definition mentions? Why do we have those 0xf, 0, rt fields, and where did rt come from?

  3. Given the following command: lui S0, 0x1234
    What will it do? I don't know the value of location 1234 in memory...

Peter Cordes
  • `immediate` = numeric constant that is part of the encoded instruction, here the number 0x1234. It does not stand for an address; it's just a number. –  Dec 13 '20 at 18:35
  • And the image just shows you the instruction encoding (if you write this instruction in assembler, these bytes will appear in the resulting binary in this order...), there's nothing to be right or wrong about there. Unless you found some error in it. –  Dec 13 '20 at 18:38
  • @dratenik so 3 will save 1234/2 in S0? plus isn't 0x1234 a number in hex? I'm totally confused –  Dec 13 '20 at 18:44
  • To quote Erik's answer "s0 becomes 0x12340000". Yes, that's a hex number. –  Dec 13 '20 at 18:46
  • why it becomes like that? the answer is 32 bits not 8.... plus why didn't you take the half? –  Dec 13 '20 at 18:48
  • The immediate is 0x1234. If we view it as a word (32 bits) it is 0x00001234 and the lower half-word of that is 0x1234. Which we load into the upper half of s0 and fill the rest with zeroes, so s0 becomes 0x12340000. –  Dec 13 '20 at 18:59
  • "If we view it as a word (32 bits) it is 0x00001234" why? hex means base of 2 so 2 hex in one full 32 bit –  Dec 13 '20 at 19:18
  • "the lower half-word of that is 0x1234" you meant the opposite the lower of this is 0000 –  Dec 13 '20 at 19:32
  • The answers you were given are correct and as simple as people could make them. Now it's your turn to turn to your notes, review the basics and try to understand what you were told. I still feel you don't understand the meaning of the image with the instruction encoding so make extra sure you understand what you're looking at there. –  Dec 13 '20 at 19:32
  • man, immediate commands in risc just mean "do it right now!" "lui s0, 0x1234" means load 0x1234 into register called s0, and upper means not just 0x1234, but 0x12340000. 0x[some digits] means its hexadecimal. Upper(higher) bits are left bits, and right bits is lower bits. Its very useful command if you need load small constant quickly, but there is restriction only 20 bits for it. – Alexy Khilaev Dec 14 '20 at 05:44

3 Answers

2

Is the image correct at all? Shouldn't the first half be all zeroes, as the definition mentions? Why do we have those 0xf, 0, rt fields, and where did rt come from?

Yes, it is correct; however, that image shows us it's a MIPS instruction, not RISC-V (see Footnote 1).

The 0xf is the MIPS opcode for lui.  There is an unused field of 5 bits (the zeros) and a register field along with the 16-bit immediate.

lui is not a memory reference instruction — it merely loads a constant stored in the instruction into the register.

Given the following command: lui S0, 0x1234 What will it do?

lui $s0, 0x1234     # $s0 becomes 0x12340000

You can look it up in the MIPS green sheet.

Suffice it to say that using 2 instructions we can form a 32-bit constant value (data value or memory address).  The first instruction, lui forms the upper 16 bits of the 32-bit value and the 2nd instruction supplies the lower 16 bits using an ordinary ori or addi.

Load Upper Imm.        lui rt,imm      I-Type        R[rt] = {imm, 16’b0}

This one instruction is equivalent to the 2 instruction sequence:

ori rt, r0, imm         # load 16-bit constant
sll rt, rt, 16          # shift left by 16 bits
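
The equivalence above, and the two-instruction trick for building a full 32-bit constant, can be checked with a small model. This is only a minimal Python sketch, not an emulator: the function names (`mips_lui`, `mips_ori`, `mips_sll`) are made up for illustration, and a register is modelled as a plain 32-bit integer.

```python
MASK32 = 0xFFFFFFFF  # registers are modelled as 32-bit unsigned values


def mips_lui(imm):
    """Model of MIPS lui: the 16-bit immediate lands in bits 31:16, bits 15:0 are zero."""
    return ((imm & 0xFFFF) << 16) & MASK32


def mips_ori(rs, imm):
    """Model of MIPS ori: OR the register with the zero-extended 16-bit immediate."""
    return (rs | (imm & 0xFFFF)) & MASK32


def mips_sll(rt, shamt):
    """Model of MIPS sll: logical shift left, truncated to 32 bits."""
    return (rt << shamt) & MASK32


# lui on its own
s0 = mips_lui(0x1234)
print(hex(s0))  # 0x12340000

# ori from the zero register followed by sll by 16 gives the same value
assert mips_sll(mips_ori(0, 0x1234), 16) == s0

# lui followed by ori builds a full 32-bit constant
full = mips_ori(mips_lui(0x1234), 0x5678)
print(hex(full))  # 0x12345678
```

Note that nothing here reads memory: the immediate is just a number carried inside the instruction.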

Footnote 1: Here's the RISC V version of LUI:

 31    12 11   7 6       0
+-------------------------+
|  imm   |  rd  |  opcode |
+-------------------------+
    20       5        7

Quite different as you can see: the immediate is 20 bits long (not MIPS' 16-bits), and, of course, the opcode is on the right (and 7 bits not MIPS' 6 bits).  (There are also no wasted zeros.)

Peter Cordes
Erik Eidt
  • Hi, sorry, but this didn't answer any of my questions. Plus, this is how my professor defined it. –  Dec 13 '20 at 18:43
  • I can't change reality. – Erik Eidt Dec 13 '20 at 18:44
  • @ErikEidt: Just to be clear, the asm syntax doesn't tell us it's MIPS; it's the machine code layout and immediate width that do. – Peter Cordes Dec 13 '20 at 18:53
  • @PeterCordes, yes, that's why I am showing the RISC V counterpart. – Erik Eidt Dec 13 '20 at 18:56
  • I had in mind making that early sentence of the answer say that. I made an edit to show what I meant. When I first read your answer, it seemed like that sentence could be read as saying LUI was exclusively a MIPS instruction, not just that definition. (Especially since I skipped over the quoted part of the question). Just a minor point :P – Peter Cordes Dec 13 '20 at 19:02
  • @PeterCordes, yes, thanks, that edit is an improvement. – Erik Eidt Dec 13 '20 at 19:02
  • How did you know that s0 becomes 0x12340000? what if 1234 in hex goes beyond 16 bits in binary? you should only take one half... –  Dec 13 '20 at 19:10
  • @albert: 4 hex digits *always* fit in 16 bits. Each hex digit represents 4 bits. There's no way to encode a wider value in MIPS machine code; if you tried to do `lui $t0, 0x12345`, your assembler would tell you it's not encodable. – Peter Cordes Dec 13 '20 at 19:13
  • "4 hex digits always fit in 16 bits" how did you know that, it the first time I hear of such a claim... I am so confused –  Dec 13 '20 at 19:22
  • @ErikEidt you made a mistake, the definitions says the lower bits are zeroes not the uppers which means 00001234 and not 12340000 –  Dec 13 '20 at 19:24
  • Which bits do you think are the lower bits? 0x12340000 has the lower 16 bits as zero. – Erik Eidt Dec 13 '20 at 19:48
  • @ErikEidt why is the right lower? Who said so? If you look at the image, it says imm goes to the upper part, which is the right one –  Dec 13 '20 at 20:01
  • "lower bits" are less significant, sorry if that's not clear, but that is how it is. In decimal the number 123 has 3 as the lowest digit. Try to appreciate the difference between a conceptual immediate and the instruction encoding, which are somewhat unrelated concepts. – Erik Eidt Dec 13 '20 at 20:03
1

Before even attempting to read or write assembly language, you need to get the documentation for the instruction set you are using. Is this a MIPS question (the image you posted) or a RISC-V question, as the title and text say?

Assuming RISC-V, go to riscv.org and follow the links to the documentation; they have made it extremely easy to find.

LUI in risc-v is defined as such

33222222222211111111 11
10987654321098765432 10987 6543210
     imm[31:12]        rd   opcode

bits 31:12 of the instruction are the immediate
bits 11:7  of the instruction are the destination register
bits 6:0   of the instruction are the opcode
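
To make the field boundaries concrete, here is a minimal Python sketch that splits a 32-bit instruction word along exactly those bit ranges. The helper name `decode_u_type` is hypothetical, and the sample word 0x12345137 is the one this answer assembles further down.

```python
def decode_u_type(word):
    """Split a 32-bit RISC-V U-type instruction word into (imm20, rd, opcode)."""
    imm20 = (word >> 12) & 0xFFFFF  # bits 31:12, the 20-bit immediate
    rd = (word >> 7) & 0x1F         # bits 11:7, the destination register number
    opcode = word & 0x7F            # bits 6:0, the opcode
    return imm20, rd, opcode


imm20, rd, opcode = decode_u_type(0x12345137)
print(hex(imm20), rd, bin(opcode))  # 0x12345 2 0b110111
```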

Obviously every instruction needs some bits for the processor to decode so it knows which instruction it is. Only one instruction could have the all-zero pattern, so most opcodes contain non-zero bits. Likewise, the destination register, where the value gets stored, has to be encoded in the instruction as well.

LUI (load upper immediate) is used to build 32-bit constants and uses the U-type format. LUI places the U-immediate value in the top 20 bits of the destination register rd, filling in the lowest 12 bits with zeros.

Painfully obvious how this instruction works.

I will admit the RISC-V documentation could have been done better; after some digging, the opcode is 0b0110111, or 0x37.

Not sure what the confusion is about how humans read numbers

0b0110111 = 0x37 = 067 (octal) = 55 (decimal)

these all describe the same value, which all describe the same bit pattern in the instruction.
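
If that equivalence is not obvious, it can be checked directly. The following short Python sketch writes the same integer with four different literals; the variable name `values` is just for illustration.

```python
# The same integer written in binary, hexadecimal, octal and decimal.
values = [0b0110111, 0x37, 0o67, 55]

# A set keeps only distinct values, so one element means all four are equal.
assert len(set(values)) == 1

# Each hex digit covers exactly 4 bits: 0x3 -> 0011, 0x7 -> 0111.
print(bin(55), hex(55), oct(55))  # 0b110111 0x37 0o67
```

The base only changes how the number is written down for humans; the bit pattern stored in the instruction is the same.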

so that means they could have just put that in the instruction definition like everyone else.

33222222222211111111 11
10987654321098765432 10987 6543210
     imm[31:12]        rd  0110111

So knowing that we can for example construct

.word 0x12345137

assemble then disassemble

Disassembly of section .text:

00000000 <.text>:
   0:   12345137            lui x2,0x12345

Okay, so let's try that forward:

.word 0x12345137
lui x2,0x12345

assemble and disassemble

Disassembly of section .text:

00000000 <.text>:
   0:   12345137            lui x2,0x12345
   4:   12345137            lui x2,0x12345

So there we go, instruction encoding solved.
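
The same word can also be put together by hand from the three fields. This is a minimal Python sketch of that arithmetic, assuming the U-type layout shown above; `encode_lui` and `LUI_OPCODE` are names invented for this example, not part of any toolchain.

```python
LUI_OPCODE = 0b0110111  # 0x37, bits 6:0 of the instruction


def encode_lui(rd, imm20):
    """Pack a RISC-V lui instruction: imm20 in bits 31:12, rd in bits 11:7."""
    if not 0 <= rd < 32:
        raise ValueError("rd must be a register number from 0 to 31")
    if not 0 <= imm20 < (1 << 20):
        raise ValueError("the immediate must fit in 20 bits")
    return (imm20 << 12) | (rd << 7) | LUI_OPCODE


print(hex(encode_lui(2, 0x12345)))  # 0x12345137
```

An immediate wider than 20 bits is rejected here, which mirrors what an assembler does when a value cannot be encoded.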

LUI (load upper immediate) is used to build 32-bit constants and uses the U-type format. LUI places the U-immediate value in the top 20 bits of the destination register rd, filling in the lowest 12 bits with zeros.

So this is quite clear: the 32-bit constant, in this case

0x12345000

gets stored in the destination register, x2 here.
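
As a rough check of that result, the operation can be modelled in a couple of lines of Python. This sketch assumes a 32-bit register (RV32); the function name `riscv_lui` is hypothetical.

```python
def riscv_lui(imm20):
    """Model of RISC-V lui on a 32-bit register: imm20 fills bits 31:12, bits 11:0 are zero."""
    return ((imm20 & 0xFFFFF) << 12) & 0xFFFFFFFF


x2 = riscv_lui(0x12345)
print(hex(x2))  # 0x12345000
```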

Both the encoding and the operation of every instruction are defined in the documentation, and most are straightforward to understand.

Now if this was a MIPS question and not a RISC-V question, it is equally easy to understand. The 16-bit immediate goes into bits 31:16 of the constant being constructed, bits 15:0 are all zeros, and that constant is stored in the register encoded in the instruction, along with an opcode so the processor knows which instruction this is.

halfer
old_timer
0

First of all, this is MIPS's version of lui, not RISC-V. They're similar ISAs, but definitely have different machine code, and different sizes of immediates.

See @Erik's answer for parts of your question and some necessary background, but yes there is a problem with the wording in the image.

"loads the lower halfword of the immediate" what does this mean? I think if we have 32 bits in imm then it loads the first 16 am I right?

This part of the image explains it badly; arguably even wrong. The diagram in the image shows the entire 32-bit instruction word, with its various fields. The immediate is the low half-word of the instruction word. The "lower halfword of the immediate" is the whole immediate, and there is no upper half-word. Phrasing it that way makes zero sense and is highly misleading.

The value placed in the destination register is the immediate left-shifted by 16, so yes the immediate is placed in the upper half-word of the 32-bit register.


Given the following command: lui S0, 0x1234

Single-step it in MARS and see what happens. Look at the register value in the debugger. It's 0x1234 << 16.
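
If a simulator is not at hand, the distinction between the instruction word and the register value can be sketched in Python. This is an illustration under the field layout described in this thread (opcode 0xf, a 5-bit zero field, rt, then the 16-bit immediate); `encode_mips_lui` is a made-up helper, and 16 is the register number of $s0 in the usual MIPS convention.

```python
def encode_mips_lui(rt, imm16):
    """Pack a MIPS lui: opcode 0xf in bits 31:26, zero in 25:21, rt in 20:16, imm in 15:0."""
    return (0xF << 26) | (0 << 21) | (rt << 16) | (imm16 & 0xFFFF)


S0 = 16  # register number of $s0

word = encode_mips_lui(S0, 0x1234)
print(hex(word))  # 0x3c101234, the 32-bit instruction word

imm = word & 0xFFFF  # the immediate is the low halfword of the instruction word
print(hex(imm))  # 0x1234

s0_value = (imm << 16) & 0xFFFFFFFF  # what lui writes into the register
print(hex(s0_value))  # 0x12340000
```

The 32-bit quantity in the diagram is the instruction, not the immediate; the immediate is only its low 16 bits.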

Peter Cordes
  • How could be the value of register in hex? a value is always in binary... –  Dec 13 '20 at 19:25
  • @albert: Right, the register doesn't hold a string of ASCII hex digits, and I never said it would. I used hex in my answer to represent the numeric values, exactly like you're doing in the asm source with `lui $s0, 0x1234`. Hex is a convenient way to write numbers in a text format, for other humans to read, in a base that's congruent with base 2 so every 4-bit chunk maps to one hex digit. (vs. with decimal, the value for every digit depends on all the high bits). I could have written `0b0000'0000'0000'0000'0001'0010'0011'0100` (using C++-style separators) but that would be harder to read. – Peter Cordes Dec 13 '20 at 19:44
  • @albert: I could also have written `4660 << 16` or `305397760` or `0x12340000` because those expressions all represent the same number. Obviously I didn't include a raw 32-bit integer into the text of my answer; it's text so I can only use printable characters. The standard way to represent numbers in our writing system is with a place-value system using some base, with most-significant digit on the left. – Peter Cordes Dec 13 '20 at 19:45
  • got that but isn't lui expecting 32 bit input? when we write 0x1234 it's 16 bit so... –  Dec 13 '20 at 20:03
  • @albert: The whole instruction is 32 bits; the immediate is the low half of that. That's the main point of my answer. Go read it again, especially the "*The immediate is the low half-word of the instruction word*" part. And look at the diagram which labels the low 16 bits as "imm". The diagram is correct, it's only the text in the image that's misleading. `lui` is an I-type instruction that has a 16-bit immediate. – Peter Cordes Dec 13 '20 at 20:06