2

I don't have much experience with 8086 assembly, and I would like to know what happens in a program if you don't write the starting label (start:) and the matching end of that label (end start), i.e. the labels that surround the executing code.

So my question is: are these labels necessary for execution? Does the code access addresses that it is not supposed to when these labels are excluded? And are these labels that surround the executing code the same as the start ('{') and end ('}') of main() in a Java class?

*Additional information and results

I was writing a program that prints the numbers 1-5, which are stored in an array. I tried it with and without the labels, and here are the results:

;assembly for printing an array of the integers 1-5

;data segment
data segment 
    NIZA db 1,2,3,4,5
    ends

;code segment
code segment  
    start: ;the "start:" label

    ;setting ds and es         
    mov ax,data
    mov ds,ax
    mov es,ax


    mov bx,OFFSET NIZA
    mov cx,5
    pecatenje_na_niza:
    mov dl,[bx] 
    add dx,48d
    mov ah,2
    int 21h
    inc bx
    loop pecatenje_na_niza
    mov ah,1
    int 21h 
    mov ah,4ch
    int 21h  

    end start ;the "end start" label

ends 

1) start: and end start included:

  • The program runs as it is supposed to, and the output is all the elements of the array printed.

2) start: and end start not included (the same code, but with the labels excluded):

  • When the program starts, a few extra lines execute that do not appear when I include start: and end start.
    (I can't find a way to copy from the emulator, so I'm pasting a screenshot.) [Screenshot: extra lines added when start: and end start are not included]
    Here are the values of the array NIZA in the emulator before and after executing these lines of code:

  • Before: [Screenshot: array values before executing the starting code]

  • After: [Screenshot: array values after executing the starting code]

And in the end the output is all zeros.

  • Output: [Screenshot: result after the execution of the whole program]

The output looks like this because of the line add dx,48d: 48 is the ASCII code for '0', so array bytes that are now zero print as 00000. By the way, DL is overwritten every time mov dl,[bx] executes.
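The arithmetic behind that output can be sketched in a few lines of Python. This is only a model of the print loop, not the emulator's behaviour: the helper name `printed_chars` is made up for illustration, and it assumes DH stays 0 so that only DL matters.

```python
def printed_chars(values):
    """Model the loop: mov dl,[bx] / add dx,48 / int 21h with ah=2."""
    # int 21h with ah=2 prints the character whose code is in DL,
    # so each byte is shifted by 48 (the ASCII code of '0').
    return "".join(chr((v + 48) & 0xFF) for v in values)


print(printed_chars([1, 2, 3, 4, 5]))  # array intact
print(printed_chars([0, 0, 0, 0, 0]))  # array bytes all zero
```

With the array intact the loop yields the digits 1 through 5; with every byte at zero it yields five '0' characters.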

That's all I could understand and find for now.

borceste

2 Answers

4

If you don't specify the entry point with end start, emu8086 will apparently default to starting at the beginning of the program. Since you put your data there, the first instructions executed are just your NIZA array bytes interpreted as code.

 1 00000000 0102                    add [bp+si], ax
 2 00000002 0304                    add ax, [si]
 3 00000004 050000                  add ax, strict word 0
 4 00000007 0000                    add [bx+si], al

You can see your bytes 1-5 followed by some zero padding. The CPU doesn't care that you intended these to be data; it will try to decode them as instructions if they are in the execution path.
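To make the decoding concrete, here is a minimal Python sketch that reproduces the listing above from the raw bytes. It is not a real disassembler: it covers only the 8086 ADD opcodes 00h-05h and only ModRM bytes with mod=00 (memory operand, no displacement), and the function name `decode_add` is made up for illustration. The four trailing zero bytes are taken from the listing.

```python
# Minimal sketch: decode only the 8086 ADD opcodes 00h-05h, and only
# ModRM bytes with mod=00 (memory operand, no displacement).
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]
RM_MEM = ["[bx+si]", "[bx+di]", "[bp+si]", "[bp+di]", "[si]", "[di]", None, "[bx]"]


def decode_add(code):
    """Return a list of (offset, text) pairs for a run of ADD opcodes."""
    out = []
    i = 0
    while i < len(code):
        op = code[i]
        if op == 0x04:  # add al, imm8
            out.append((i, f"add al, {code[i + 1]}"))
            i += 2
        elif op == 0x05:  # add ax, imm16 (little-endian immediate)
            imm = code[i + 1] | (code[i + 2] << 8)
            out.append((i, f"add ax, {imm}"))
            i += 3
        elif op <= 0x03:  # ADD forms that take a ModRM byte
            modrm = code[i + 1]
            mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
            if mod != 0 or rm == 6:
                raise ValueError("addressing form not covered by this sketch")
            # Bit 0 of the opcode selects 16-bit (1) or 8-bit (0) operands.
            r = (REG16 if op & 1 else REG8)[reg]
            mem = RM_MEM[rm]
            # Bit 1 is the direction: when set, the register is the destination.
            text = f"add {r}, {mem}" if op & 2 else f"add {mem}, {r}"
            out.append((i, text))
            i += 2
        else:
            raise ValueError(f"opcode {op:#04x} not covered by this sketch")
    return out


# NIZA's five bytes followed by four zero bytes, as in the listing
image = bytes([1, 2, 3, 4, 5, 0, 0, 0, 0])
for offset, text in decode_add(image):
    print(f"{offset:08X}  {text}")
```

The same nine bytes that were meant as an array plus padding come out as four ADD instructions at offsets 0, 2, 4 and 7.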

Jester
2

I am more familiar with emu8086 as an emulator than as an assembler, but in general the assembler needs to be told where the starting point, or entry point, of the program is. For example, in C, this would be the

int main(int argc, char *argv[]) {

line.

All executable files need to record where their entry point is, so that after the operating system has loaded the file into memory, control is transferred to the correct place.

In the early DOS days (.COM files), this entry point was at an offset of 100h within the code segment. If you did not specify a start address, this 100h offset was assumed. With DOS .EXE files, as with your code, an offset of 00h is assumed when no entry point is given. Hence the outcome Jester describes above.
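As a rough illustration of where that entry point lives, the Python sketch below reads the initial CS:IP from a DOS MZ (.EXE) header, where the initial IP is the 16-bit word at offset 14h and the initial CS is at 16h, and falls back to offset 100h for anything without an MZ signature (a .COM image). The helper names and the header values are hypothetical; they are not taken from what emu8086 actually writes.

```python
import struct


def entry_point(image):
    """Return (cs, ip), relative to the load segment, where execution begins."""
    if image[:2] == b"MZ":
        # MZ header: initial IP at offset 14h, initial CS at offset 16h
        ip, cs = struct.unpack_from("<HH", image, 0x14)
        return cs, ip
    # No MZ signature: treat it as a .COM image, which starts at offset 100h
    return 0, 0x100


def make_header(cs, ip):
    """Build a mostly empty MZ header with only CS:IP filled in (hypothetical)."""
    header = bytearray(0x20)
    header[0:2] = b"MZ"
    struct.pack_into("<HH", header, 0x14, ip, cs)
    return bytes(header)


# Hypothetical values: no entry point given, so CS:IP stays 0000:0000 and
# execution begins at the very first byte of the image (the data).
print(entry_point(make_header(cs=0, ip=0)))
# Hypothetical values: entry point set to a code segment one paragraph in.
print(entry_point(make_header(cs=1, ip=0)))
# A .COM-style image with no header at all.
print(entry_point(b"\xb4\x4c\xcd\x21"))
```

The point of the sketch is only that the entry point is a piece of metadata: if nothing sets it, the default decides where the CPU starts fetching.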

Assemblers should allow you to indicate the starting point using different techniques. It looks like the assembler you are using uses

end start

Others may use very similar techniques.

However, be careful with the 'end' keyword. Some assemblers see this keyword and ignore anything after it within this file. Therefore, if you place anything after

end start ;the "end start" label

in your source code shown above, the assembler may ignore it.

fysnet
  • One thing I forgot to mention: I also tried keeping the `start:` label while removing `end start`, and the same thing happened as when I excluded both. It seems they always go hand in hand. – borceste Nov 17 '19 at 19:43
  • 2
    The start: label is merely a place marker in the code. It simply gives the code a way to refer to that offset within the current segment. The start: label can be used without 'end start' without error, but not the other way around. For example, an 'end start' without the start: label should produce an error. – fysnet Nov 17 '19 at 20:14
  • @fysnet: You're correct: in MASM / TASM syntax, `end` does normally make the assembler stop reading input. That's why it's called `end` instead of e.g. FASM's `entry _start` which just sets the entry-point without also having other weird effects. In other toolchains, it's the *linker* that needs to know the entry point because assembling + linking are separate, and object-file metadata doesn't store an entry-point. e.g. in GNU binutils `ld -e _start` is the default, otherwise the top of the `.text` section with a warning, pretty much like what Jester found emu8086 does with no `end`. – Peter Cordes Nov 17 '19 at 21:44
  • @fysnet is there some interesting article you can suggest/link about all of this? – borceste Nov 18 '19 at 19:14
  • The assembler that you are using should have documentation on this. Check your assembler's documentation. For example, with a quick Google search, MASM's is at https://learn.microsoft.com/en-us/cpp/assembler/masm/end-masm?view=vs-2019. TASM should have similar documentation. NASM surely does. – fysnet Nov 18 '19 at 23:19