I am trying to understand what the content of the AX register will be in the following question. I don't understand how I can know what [5000h] or [DI] refers to in the examples.

The state of the registers and memory is defined as:

CS=3000 [53000]=BBBB [33000]=6666 [13000]=1111
DS=1000 [54000]=CCCC [34000]=7777 [14000]=2222
SS=5000 [55000]=DDDD [35000]=8888 [15000]=3333
DI=7000 [56000]=EEEE [36000]=9999 [16000]=4444
BP=4000 [57000]=FFFF [37000]=AAAA [17000]=5555

What is the value in AX after each of these instructions?

  • MOV AX, [DI]
  • MOV AX, [5000h]
  • MOV AX, [BP+2000h]
  • LEA AX, [BP+1000h]
Michael Petch
Amitay Tsinis

2 Answers

This is an academic question, but it touches on a number of concepts of real mode 20-bit segment:offset addressing. Every memory address in real mode is made up of two parts: a segment and an offset. The two parts are combined to generate a physical address with the formula:

Physical Address = segment * 16 + offset

or

Physical Address = (segment << 4) + offset

Both yield the same result, because shifting a value left by 4 bits is the same as multiplying it by 16 decimal (or 10h hexadecimal).
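As a small illustration of the formula, here is a sketch in Python. The function name `physical_address` is made up for this example and is not part of any assembler or library.

```python
def physical_address(segment, offset):
    """Combine a real-mode segment and offset into a physical address."""
    # Parentheses matter here: in Python, + binds tighter than <<.
    return (segment << 4) + offset

# Shifting left by 4 bits and multiplying by 16 give the same address.
assert physical_address(0x1000, 0x7000) == 0x1000 * 16 + 0x7000

print(hex(physical_address(0x1000, 0x7000)))  # 0x17000
```

Writing the segment value in hexadecimal makes the shift easy to do by hand: append one zero digit to the segment (1000h becomes 10000h) and then add the offset.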

You will find that instructions may specify a segment explicitly and when it isn't specified there is always an implicit one. A general rule is that if a memory address uses BP then the memory operand is relative to the SS segment, otherwise it is relative to the DS segment.
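The general rule can be written down as a short Python sketch. The helper `default_segment` is a hypothetical name used only for illustration, and it models only the rule stated above: explicit segment overrides are not handled.

```python
def default_segment(registers=()):
    """Return the implied segment register for a 16-bit memory operand.

    `registers` lists the registers used inside the brackets,
    e.g. ("BP",) for [BP+2000h] or () for a direct address like [5000h].
    """
    return "SS" if "BP" in registers else "DS"

print(default_segment(("DI",)))       # DS
print(default_segment(()))            # DS
print(default_segment(("BP",)))       # SS
print(default_segment(("BP", "SI")))  # SS
```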

An LEA instruction doesn't actually access physical memory, it simply computes the effective address of the memory operand and loads the address in a register. With LEA the segment doesn't come into play. A MOV instruction with a memory operand will move the contents of a memory operand to/from a register.


All the values given in your question are in hexadecimal. To answer your questions:

  • MOV AX, [DI] is the same as MOV AX, [DS:DI] since the implied segment is DS. In the question DS=1000h and DI=7000h. The offset is DI. Using the formula (segment<<4) + offset we get physical address (1000h<<4)+7000h = 10000h+7000h = 17000h. The question states memory address [17000]=5555, so the value moved to AX is 5555h.

  • MOV AX, [5000h] is the same as MOV AX, [DS:5000h] since the implied segment is DS. In the question DS=1000h. The offset is 5000h. Using the formula (segment<<4) + offset we get physical address (1000h<<4)+5000h = 10000h+5000h = 15000h. The question states memory address [15000]=3333, so the value moved to AX is 3333h.

  • MOV AX, [BP+2000h] is the same as MOV AX, [SS:BP+2000h] since the implied segment is SS. In the question SS=5000h and BP=4000h. The offset is BP+2000h. Using the formula (segment<<4) + offset we get physical address (5000h<<4)+(4000h+2000h) = 50000h+6000h = 56000h. The question states memory address [56000]=EEEE, so the value moved to AX is EEEEh.

  • LEA AX, [BP+1000h]: The segment doesn't come into play since it is an LEA instruction. In the question BP=4000h. The offset is BP+1000h = 4000h+1000h = 5000h. Since LEA only computes the address and stores it in a register, the value in AX will be 5000h.
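The four results can be double-checked with a small Python model of the question's state. This is only a sketch: the names `REGS`, `MEMORY`, `effective_address`, `mov_ax` and `lea_ax` are invented for this example, each listed address is assumed to hold one 16-bit word, and each instruction is evaluated independently against the same starting state.

```python
# Register and memory state copied from the question (all values hexadecimal).
REGS = {"CS": 0x3000, "DS": 0x1000, "SS": 0x5000, "DI": 0x7000, "BP": 0x4000}
MEMORY = {
    0x53000: 0xBBBB, 0x33000: 0x6666, 0x13000: 0x1111,
    0x54000: 0xCCCC, 0x34000: 0x7777, 0x14000: 0x2222,
    0x55000: 0xDDDD, 0x35000: 0x8888, 0x15000: 0x3333,
    0x56000: 0xEEEE, 0x36000: 0x9999, 0x16000: 0x4444,
    0x57000: 0xFFFF, 0x37000: 0xAAAA, 0x17000: 0x5555,
}

def effective_address(base=None, displacement=0):
    """Offset part of the operand: optional base register plus displacement."""
    value = (REGS[base] if base else 0) + displacement
    return value & 0xFFFF  # offsets are 16 bits wide

def mov_ax(base=None, displacement=0):
    """Model MOV AX, [base+displacement]: read the word at segment:offset."""
    segment = REGS["SS"] if base == "BP" else REGS["DS"]  # implied segment
    return MEMORY[(segment << 4) + effective_address(base, displacement)]

def lea_ax(base=None, displacement=0):
    """Model LEA AX, [base+displacement]: the offset itself, no memory read."""
    return effective_address(base, displacement)

results = {
    "MOV AX, [DI]": mov_ax("DI"),
    "MOV AX, [5000h]": mov_ax(None, 0x5000),
    "MOV AX, [BP+2000h]": mov_ax("BP", 0x2000),
    "LEA AX, [BP+1000h]": lea_ax("BP", 0x1000),
}

for instruction, value in results.items():
    print(f"{instruction:20} AX = {value:04X}h")
```

Running it prints 5555h, 3333h, EEEEh and 5000h, matching the values worked out by hand above.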

Michael Petch
  • +1 This answer is right where mine is wrong. You can *accept* it by checking the green mark below the upvote. The reason this answer is right is that, unlike me, @MichaelPetch remembers the details of 8086 as it was used 35 years ago. My answer is in the modern 64-bit era, which is wrong for your purpose. – thb Jan 28 '19 at 19:58
  • @thb I honestly don't see where your answer is wrong. Yes, Michael elaborates more on the instructions that ultimately have no effect on the outcome, but other than that, you correctly stated that `LEA` just does an address calculation using the offset and the result is `5000h`. – Jester Jan 28 '19 at 21:08
  • @Jester: thank you; I had not known that. Unfortunately, if my answer was right, it was accidentally right. I really don't know the old 16-bit 8086 mode. I was thinking 64 bits. – thb Jan 28 '19 at 21:14
  • @jester: in the assignment all those instructions are independent. It would have been better had they been labelled 1, 2, 3, 4. It is just how the person worded the question that makes them look like a sequence of instructions. – Michael Petch Jan 28 '19 at 21:18
  • @MichaelPetch I see. Still, thb's answer is not wrong, just partial then. He can keep my upvote :) Especially since the question title says "the program" which strongly implies this is a single program with the given sequence of instructions. I interpreted it that way too. – Jester Jan 28 '19 at 21:19
  • @jester: I didn't say his answer was wrong. On the contrary, he only happened to give a partial answer. I think what thb is saying is that if he had applied his knowledge to all of the instructions, he may not have arrived at the proper result for each one. – Michael Petch Jan 28 '19 at 21:22

[My answer is left here for reference but I withdraw it. From the information you have given, I gather that your x86 processor is operating in privileged 8086 compatibility mode, as during boot loading. I have no experience at writing bootloaders, unfortunately.]

Old data in a register is overwritten when new data arrives. Therefore, only the LEA instruction affects this result.

Moreover, the LEA instruction is special: it does not dereference the address it computes. In your example, because BP contains 4000h, the address the LEA computes is 4000h + 1000h == 5000h. The last address is not used, but is merely stored for future use in the AX register.

Therefore, at the end of this code's execution, the register AX will hold the value 5000h.

To clarify, I did not say that the register AX will hold a copy of the datum stored in memory at address 5000h. Rather, I said something simpler: the register AX will hold the value 5000h.

thb
  • Yes, I get it now. They still use this old 8086 mode in Linux bootloaders. I have not encountered this mode elsewhere in a long time (and, anyway, back then, I was into Z80, 6502 and 68000, not 8086, admittedly), but who knows? The answer of @MichaelPetch looks good, at any rate. – thb Jan 28 '19 at 20:01
  • Apparently some schools teach assembly language using emu8086 and DOS "system calls". For some reason they think that older means simpler, and thus teach students real-mode segmentation because of a historical accident. (That the CPU which caught on for our desktops happened to have its first generation during the transition from 16-bit address spaces to larger ones.) So x86-16 real-mode is *vastly* over-represented in SO questions. There are also people who want to "write their own OS" or be more "low level", but think that BIOS `int 0x10` calls are lower level than Linux system calls. – Peter Cordes Jan 29 '19 at 06:36
  • I learned x86 with 32-bit and 64-bit asm in user-space on Linux, with a flat memory model, and only later learned about the complexities of segmentation. Much easier to learn that later once you understand the basics, and understand the problem that segmentation was designed to solve. – Peter Cordes Jan 29 '19 at 06:39