The first row of my output prints junk.

I tried changing the offset, the index, and so on, but the first line is always wrong regardless of what the string is.

    mov AX, 0b800h
    mov ES, AX

    nums db '  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19   '
    numsend label byte

;first row

    MOV SI,OFFSET nums

    MOV DI,160*4 +2 ;1st row, 1st column, 2 bytes per cell
    MOV AH, 07h
    MOV CX,5*3 ;3 chars per number, 5 numbers
row1:
    MOV AL,[SI]
    MOV ES:[DI],AX
    INC SI
    ADD DI,2
    LOOP row1

;second row

    MOV SI,OFFSET nums+15 ;point to the beginning of '  6  7  8  9 10' in the nums array

    MOV DI,160*5 +2 ;2nd row, 1st column, 2 bytes per cell
    MOV AH, 07h
    MOV CX,5*3 ;3 chars per number, 5 numbers
row2:
    MOV AL,[SI]
    MOV ES:[DI],AX
    INC SI
    ADD DI,2
    LOOP row2

;third row

    MOV SI,OFFSET nums+30 ;point to the beginning of ' 11 12 13 14 15' in the nums array

    MOV DI,160*6 + 2*1 ;3rd row, 1st column, 2 bytes per cell
    MOV AH, 07h
    MOV CX,5*3 ;3 chars per number, 5 numbers
row3:
    MOV AL,[SI]
    MOV ES:[DI],AX
    INC SI
    ADD DI,2
    LOOP row3

;fourth row

    MOV SI,OFFSET nums+45 ;point to the beginning of ' 16 17 18 19   ' in the nums array

    MOV DI,160*7 + 2*1 ;4th row, 1st column, 2 bytes per cell
    MOV AH, 07h
    MOV CX,5*3 ;3 chars per number, 4 numbers plus 3 trailing spaces
row4:
    MOV AL,[SI]
    MOV ES:[DI],AX
    INC SI
    ADD DI,2
    LOOP row4

I expected:

  1  2  3  4  5
  6  7  8  9 10
 11 12 13 14 15
 16 17 18 19

But I always get random ASCII values for the first line:

  6  7  8  9 10
 11 12 13 14 15
 16 17 18 19
fuz
nomad
Thank you for clearly showing what the desired and actual output is. – fuz Apr 11 '19 at 06:47

You have `nums db` between `mov AX, 0b800h` / `mov ES, AX` and the rest of the code! Single-step your code with a debugger, you should see those ASCII bytes being decoded as instructions. Presumably that corrupts the first space or something. Is that really a [mcve]? I don't see any ORG directive, or anything about the context this runs in. MBR bootloader, DOS .exe, DOS .com? – Peter Cordes Apr 11 '19 at 07:12

1 Answer

The problem is that you are executing data as code.

In the computer, your program is stored in RAM, and this is true for both code and data. Each byte of RAM only holds a number in the range 0 to 255, so the CPU cannot distinguish between code and data.

The data `db '  1  2  3  4 ...'` is stored as `0x20 0x20 ...`, and the instruction `and [bx+si],ah` is also stored as `0x20 0x20`.
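To make the overlap concrete, here is a small fragment in the same MASM/TASM-style syntax as the question. Both lines should assemble to the identical two bytes, which matches the `2020` entry in the ndisasm listing; the fragment is only an illustration and is not meant to be run.

```asm
    db 20h, 20h         ; two ASCII spaces, written as data
    and [bx+si], ah     ; an instruction whose encoding is also 20h 20h
```

Nothing in those two bytes records whether the author meant them as text or as an instruction. Only their position relative to the instruction pointer decides how the CPU treats them.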

Because there is no `jmp` instruction after `mov ES, AX`, the CPU assumes that the bytes following `mov es,ax` are instructions to be executed (`0x20 0x20` = `and [bx+si],ah`), not data.

Executing data like this will often crash the program. In your case that does not seem to happen.

However, in your case the last byte of the data is `0x20`. On its own this is not a complete x86 instruction, and it is followed by the instruction `mov si,offset nums`, which is stored as `0xbe xx xx`. The CPU interprets the combination `0x20 0xbe xx xx` as `and [bp+nums],bh`.

Therefore the `si` register is never set, and the first-row loop reads from wherever `si` happens to point. The later rows load `si` again with correctly decoded instructions, which is why only the first line is wrong.
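One way to avoid this is to keep the data out of the execution path, for example by placing it after the code that exits the program. The sketch below assumes a DOS .COM program built with MASM or TASM; the question does not say what environment it runs in, so the `.model tiny`, `org 100h`, `start` label, and `int 21h` exit are assumptions rather than part of the original code. Only the first row is shown.

```asm
.model tiny
.code
org 100h                    ; assumption: DOS .COM program, loaded at offset 100h

start:
    mov ax, 0b800h
    mov es, ax              ; ES = text-mode video memory
    mov si, offset nums     ; decoded correctly now: no data precedes it
    mov di, 160*4 + 2       ; same screen position as the question's first row
    mov ah, 07h             ; attribute byte, as in the question
    mov cx, 5*3             ; 5 numbers, 3 characters each
row1:
    mov al, [si]
    mov es:[di], ax         ; character in the low byte, attribute in the high byte
    inc si
    add di, 2
    loop row1
    ; rows 2 to 4 from the question would follow here unchanged

    mov ax, 4c00h           ; assumption: exit through DOS so execution stops here
    int 21h

; data lives after the exit call, so the CPU never tries to execute it
nums db '  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19   '
numsend label byte

end start
```

If the data has to stay where it is, an alternative is a `jmp` placed right after `mov ES, AX` that skips over `nums` to a label in front of `MOV SI,OFFSET nums`.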


Assembling the first part of the file with NASM (after porting the syntax) to a flat binary and disassembling with ndisasm, we get:

address    hexdump           disassembly

00000000  B800B8            mov ax,0xb800
00000003  8EC0              mov es,ax
00000005  2020              and [bx+si],ah    ; two ASCII spaces = 0x2020
00000007  3120              xor [bx+si],sp
00000009  2032              and [bp+si],dh
0000000B  2020              and [bx+si],ah
0000000D  3320              xor sp,[bx+si]
0000000F  2034              and [si],dh
00000011  2020              and [bx+si],ah
00000013  352020            xor ax,0x2020
00000016  362020            and [ss:bx+si],ah
00000019  37                aaa
0000001A  2020              and [bx+si],ah
0000001C  3820              cmp [bx+si],ah
0000001E  2039              and [bx+di],bh
00000020  2031              and [bx+di],dh
00000022  3020              xor [bx+si],ah
00000024  3131              xor [bx+di],si
00000026  2031              and [bx+di],dh
00000028  3220              xor ah,[bx+si]
0000002A  3133              xor [bp+di],si
0000002C  2031              and [bx+di],dh
0000002E  3420              xor al,0x20
00000030  3135              xor [di],si
00000032  2031              and [bx+di],dh
00000034  362031            and [ss:bx+di],dh
00000037  37                aaa
00000038  2031              and [bx+di],dh
0000003A  3820              cmp [bx+si],ah
0000003C  3139              xor [bx+di],di
0000003E  2020              and [bx+si],ah
00000040  20BE0500          and [bp+0x5],bh     ; 0x20 is the last space,
                                                ; BE imm16 is the mov-to-SI
00000044  BF8202            mov di,0x282        ; decoding happens to line up with this instruction
00000047  B407              mov ah,0x7
00000049  B90F00            mov cx,0xf

So that's a lot of instructions with memory destinations, but apparently BX, SI, DI, BP, and their various combinations weren't pointing at anything that was a problem to overwrite.

x86 machine code uses most of the available coding space, so it's normal for data accidentally decoded as instructions not to hit any illegal instruction (especially in 16-bit and 32-bit modes).

Sep Roland
Martin Rosenau