I have to write a program that copies one array into another, using x86 assembly.

The original code is written in MS-DOS TASM for the 8086 processor, but I want to port it to NASM on Linux for the i386 processor.

The code in TASM is this:

.MODEL SMALL

.DATA

    TABLE_A DB 10, 5, 1
    TABLE_B DB 0, 0, 0

.CODE

    MOV AX, SEG TABLE_B
    MOV DS, AX

    MOV SI, 0

    LOOP:
        MOV AL, TABLE_A[SI]
        MOV TABLE_B[SI], AL

        INC SI
        CMP SI, 2
    JBE LOOP


    MOV AH, 4Ch
    INT 21h

END

I'm trying to rewrite this in NASM, but I can't work out how to address the correct array position, the way the TABLE_A[SI] operand does in TASM.

How can I do it?

Peter Cordes
AlmuHS
  • Technically, just ignore the array business. It's just a chunk of memory. All you need to know is where the array starts and how long it is; then copy all the bytes between those two locations into the new one. – Marc B Oct 26 '16 at 21:16

4 Answers

1

The final code in NASM is this:

section .text
global _start
cpu 386

_start:

MOV ESI, TABLE_A
MOV EDI, TABLE_B
MOV CX, 3

COPY_LOOP:      
    MOV AL, [ESI]
    MOV [EDI], AL

    INC SI
    INC DI
LOOP COPY_LOOP

MOV AX,1
INT 80h

section .data
TABLE_A DB 10, 5, 1
TABLE_B DB 0, 0, 0
AlmuHS
  • If you make sure that `ds` and `es` point to the .data segment (and that DF, the direction flag, is zero), you can replace the whole COPY_LOOP with a single rep-prefixed instruction, `rep movsb`. – Ped7g Oct 27 '16 at 10:46
  • How could I do it? – AlmuHS Oct 28 '16 at 23:00
  • `mov ax,1` is potentially broken; use `mov eax,1` to set the call number without leaving any garbage in the high 2 bytes. Same for CX: your loop uses ECX but you only set the low 2 bytes, so you could loop some multiple of 2^16 more times than you meant. – Peter Cordes Apr 24 '21 at 08:20
  • re: `rep movsb`: DS and ES are already set correctly under Linux, you can literally just run `rep movsb` – Peter Cordes Apr 24 '21 at 08:21
1

How could I do it?

(question from comments on self-answer)

Well, first you read the instruction reference guide to understand what the instruction does; then you can use it, if it fits your purpose. This is the important step: keep re-reading the instruction details every so often, to verify that the instruction modifies registers and flags the way you expect, especially when the debugger shows a change of CPU state that you didn't expect.

Since you are on Linux, the ds/es segment registers are already set to reasonable values (covering the .data section), so after setting eSi to the Source address, eDi to the Destination address, and eCx to the Count, you can replace the whole COPY_LOOP with just rep movsb, and then exit through int 80h (eax=1). (Notice the capitalized letters in the register names; Intel picked those names intentionally to make them easy to recall.)

BTW, I just noticed a few bugs in your code:

  1. inc si/di should be inc esi/edi, because you use esi/edi for addressing. inc si updates only the low 16 bits, so if the array crossed a 64 KiB memory boundary, the pointer would wrap around instead of advancing.

  2. Set ecx to 3, not cx: in 32-bit mode the loop instruction uses the whole 32-bit ecx, not only its 16-bit part cx. If code ahead of the copy left a large number in ecx, with some of the upper 16 bits set, your loop would copy many more bytes than just 3.

  3. Before calling int 80h you must likewise set the whole eax to the function number; otherwise garbage left in the upper 16 bits of eax by earlier code would request an invalid function.

After applying these fixes, your code may look like this:

section .text
global _start
cpu 386

_start:
    MOV ESI, TABLE_A
    MOV EDI, TABLE_B
    MOV ECX, 3
    REP MOVSB  ; copy ECX bytes from DS:ESI to ES:EDI

    MOV EAX,1  ; call sys_exit, again FIXED to EAX!
    INT 80h

section .data

TABLE_A DB 10, 5, 1
TABLE_B DB 0, 0, 0
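
A possible variation on the listing above, offered only as a sketch: it clears the direction flag explicitly with cld (rep movsb counts upward only when DF=0) and lets the assembler compute the byte count instead of hard-coding 3. The name TABLE_A_LEN and the file name in the build comment are hypothetical, not taken from the original code; a 32-bit Linux target using the int 80h interface is assumed.

```nasm
; Sketch only. Assumed build steps (file name copy.asm is hypothetical):
;   nasm -f elf32 copy.asm -o copy.o
;   ld -m elf_i386 copy.o -o copy
section .text
global _start
cpu 386

_start:
    cld                         ; DF = 0, so MOVSB increments ESI and EDI
    mov esi, TABLE_A            ; source address
    mov edi, TABLE_B            ; destination address
    mov ecx, TABLE_A_LEN        ; byte count, computed by the assembler
    rep movsb                   ; copy ECX bytes from DS:ESI to ES:EDI

    mov eax, 1                  ; sys_exit
    xor ebx, ebx                ; exit status 0
    int 80h

section .data
TABLE_A:    db 10, 5, 1
TABLE_A_LEN equ $ - TABLE_A     ; hypothetical name; evaluates to 3 here
TABLE_B:    times TABLE_A_LEN db 0  ; destination, same size as TABLE_A
```

With the length derived from the data, adding elements to TABLE_A does not require touching the copy code.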

If you have read the docs about registers, you should already understand the difference between eax and ax. On Linux you are in 32-bit mode (when you link the binary as a 32-bit ELF; nowadays 64-bit may be the default on a 64-bit system, and that mode differs a bit from 32-bit mode), so use the 32-bit register variants by default. Use a 16-bit or 8-bit variant only when you really want it for a particular reason, and make sure that no later code works with the full 32-bit register while you have set only part of it (as loop, rep movsb and int 80h do).

It also usually makes the code smaller and often faster, as using 16-bit ax in 32-bit mode requires an additional operand-size prefix byte (66h) ahead of the instruction; for example, mov eax,ebx encodes as the 2 bytes 89 D8, while mov ax,bx encodes as the 3 bytes 66 89 D8.
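
To make the eax/ax difference concrete, here is a small sketch (assuming 32-bit Linux and the int 80h interface; the constant 12345678h and the file name are arbitrary illustrations). Single-stepping it in a debugger should show the register values noted in the comments, and the encodings are the ones quoted above.

```nasm
; Sketch only. Assumed build steps (file name regs.asm is hypothetical):
;   nasm -f elf32 regs.asm -o regs.o
;   ld -m elf_i386 regs.o -o regs
section .text
global _start

_start:
    mov eax, 0x12345678     ; EAX = 12345678h
    mov ax, 1               ; only the low 16 bits change: EAX = 12340001h
    mov eax, 1              ; the whole register is written: EAX = 00000001h

    mov eax, ebx            ; encodes as 89 D8     (2 bytes)
    mov ax, bx              ; encodes as 66 89 D8  (3 bytes, 66h prefix)

    mov eax, 1              ; sys_exit, full 32-bit write
    xor ebx, ebx            ; exit status 0
    int 80h
```

The second mov is the situation from the buggy code: the upper half of eax keeps whatever earlier code left there.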

Ped7g
-1

In response to Marc B:

I tried this form, without success:

MOV SI, 0
MOV AX, 0

LOOP:       
    MOV AX, [TABLE_A + SI]
    MOV [TABLE_B + SI], AX

    INC SI
    CMP SI, 2
JBE LOOP
AlmuHS
  • If the result is not successful, what is the result? – Jose Manuel Abarca Rodríguez Oct 26 '16 at 21:25
  • At first, ld shows this error: relocation truncated to fit: R_386_16 against `.data'. When I change SI to ESI, it assembles, but with a strange result in the debugger: in the first iteration the EAX value is 0000050a, in the second iteration it is 00000105, and in the third iteration it is 00000a01. – AlmuHS Oct 26 '16 at 21:35
  • You can't use a 16-bit register in a 32-bit addressing mode. If your array is constant-size, why not just copy 4 bytes with a dword load/store, instead of looping a byte at a time? If you're going to use this for anything, copying single bytes is slow. – Peter Cordes Oct 26 '16 at 22:28
  • I have just solved my problem, with the @JoseManuelAbarcaRodríguez solution – AlmuHS Oct 26 '16 at 22:31
  • So why did you change `al` into `ax`? You are loading two bytes at once, while the original TASM code loads only a single byte. Then you do `inc si`, moving the offset by only 1 byte, although you have already read/stored two bytes (with `ax`). Learn how `rax/eax/ax/ah:al` differ and what their bit sizes are (and how they partially share bits, as they are the same register in the CPU). See http://stackoverflow.com/documentation/x86/2122/register-fundamentals#t=201610271040083101313 – Ped7g Oct 27 '16 at 10:39
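
Putting the comments on this attempt together, a corrected version might look like the sketch below (assuming 32-bit Linux with the int 80h interface; the label copy_loop is a hypothetical name). It keeps the index-based form of the TASM original: NASM writes TABLE_A[SI] as [TABLE_A + esi], the index is a full 32-bit register, and each iteration moves exactly one byte through al.

```nasm
; Sketch only: indexed byte copy, the NASM counterpart of TABLE_A[SI].
section .text
global _start

_start:
    xor esi, esi                ; index = 0, using the full 32-bit register

copy_loop:
    mov al, [TABLE_A + esi]     ; load ONE byte from TABLE_A
    mov [TABLE_B + esi], al     ; store that byte into TABLE_B
    inc esi                     ; advance the index by one byte
    cmp esi, 2
    jbe copy_loop               ; repeat while index <= 2 (3 bytes total)

    mov eax, 1                  ; sys_exit
    xor ebx, ebx                ; exit status 0
    int 80h

section .data
TABLE_A db 10, 5, 1
TABLE_B db 0, 0, 0
```

Loading through al instead of ax also avoids the overlapping reads that produced the 0000050a, 00000105, 00000a01 sequence reported above.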
-1

Use pointers (ESI, EDI) to the arrays and ECX as the counter:

MOV ESI, TABLE_A    ;POINTER TO TABLE_A.
MOV EDI, TABLE_B    ;POINTER TO TABLE_B.
MOV ECX, 3          ;ARRAY LENGTH.
REPEAT:       
    MOV AL, [ESI]   ;LOAD ONE BYTE FROM TABLE_A.
    MOV [EDI], AL   ;STORE IT INTO TABLE_B.
    INC ESI
    INC EDI
    LOOP REPEAT     ;ECX-1. IF ECX<>0 JUMP TO REPEAT.