
I just started programming in assembly so I am a beginner.

To practice, I am trying to rewrite a basic libc in assembly (NASM Intel syntax).

But I'm stuck on the strcmp function:

;; Compare two C-style NUL-terminated strings
;; Inputs   :  ESI = address of s1, EDI = address of s2
;; Outputs  :  EAX = return an integer less than, equal to, or greater than zero if s1 is found, respectively, to be less than, to match, or be greater than s2
strcmp:
    call strlen
    mov ecx, eax ; ecx = length of the string in esi

    repe cmpsb
    sub esi, edi ; result = *esi - *edi
    
    mov eax, esi
    
    ret

For me, it should work like this:

s1 db 'Hello World', 0
s2 db 'Hello Stack', 0

After the repe cmpsb instruction, ESI should be equal to [s1 + 7] and EDI to [s2 + 7].

So I just have to do EAX = 'W' - 'S' = 87 - 83 = 4

The problem is, it doesn't work. I think the problem is that when I execute this instruction:

sub esi, edi ; result = *esi - *edi

I don't think that it means: subtract the characters pointed to by EDI and ESI.

Does anyone have an idea on how I can do this?

Sep Roland
  • Have you ensured that `strlen` preserves the contents of `edi` and `esi`? Also, note that `sub esi, edi` computes the difference of `edi` and `esi`, not the difference of the characters these two point to. – fuz Aug 01 '20 at 11:38
  • Also, what calling convention do you follow? This looks a lot like the amd64 SysV ABI, but you seem to be writing 32 bit code for which this ABI is usually not used. – fuz Aug 01 '20 at 11:44
  • Thanks for answering me, I'm not sure. I am compiling for 32 bit Linux systems. So it's probably System V ABI i386. Do you have any idea on how I can compute the difference of the characters rather than the registers? (and yes strlen preserves the content of EDI and ESI) –  Aug 01 '20 at 11:50
  • If you are writing 32 bit code, the standard ABI passes arguments through the stack. However, this is only relevant if you intend to interface with C code. If all code is written by yourself in assembly, you can use whatever ABI you want. Let me write an answer to your question then. – fuz Aug 01 '20 at 11:54

1 Answer


Your code is almost correct. There are three issues left:

  • you should not assume that strlen preserves the contents of esi and edi unless you have explicitly specified that it does so. It is very easy to change strlen later and forget about this requirement, leading to all sorts of annoying problems.
  • instead of returning the difference between the bytes *esi and *edi, you return the difference between the addresses held in esi and edi. Also, as cmpsb advances esi and edi by one after each comparison, the last characters compared are found at esi[-1] and edi[-1].
  • you have an off-by-one error: strlen returns the number of characters that precede the NUL byte, but you need to compare the NUL byte as well. Otherwise, the two strings compare as equal whenever the first is a prefix of the second, since you never check that the second string actually ends where the first one does.

To fix the first issue, I recommend saving and restoring esi and edi around the call to strlen. The easiest way to do so is to push them on the stack:

    push esi             ; save ESI and EDI
    push edi
    call strlen          ; compute the string length
    pop  edi             ; restore ESI and EDI
    pop  esi

The second issue is fixed by loading the characters to compare from memory, computing the difference, and then storing the result to eax:

    movzx eax, byte [esi-1] ; load byte from ESI[-1] and zero extend into EAX
    movzx ecx, byte [edi-1] ; load byte from EDI[-1] and zero extend into ECX
    sub   eax, ecx          ; compute the difference

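The offsets of -1 account for cmpsb having advanced both pointers past the last pair of bytes compared. The third issue is fixed separately by adding one to the length returned by strlen (for example, inc ecx after mov ecx, eax), so that the NUL byte is compared too. Note that movzx is needed here instead of the slightly simpler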

    mov   al, [esi-1]       ; load byte from ESI[-1] into AL
    sub   al, [edi-1]       ; subtract EDI[-1] from AL

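since the 8-bit version leaves the upper 24 bits of eax untouched, and the difference of two bytes (anywhere from -255 to 255) does not fit in 8 bits. Zero-extending both bytes first gives a correctly signed 32-bit result in eax.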
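Putting the three fixes together, a complete routine could look roughly like the sketch below. Treat it as an illustration under a few assumptions rather than a definitive implementation: the strlen shown is a hypothetical stand-in (assumed to take its argument in ESI and return the length in EAX, as the comment in the question suggests), the cld is a defensive assumption in case the direction flag is not already clear, and _start is only a small 32-bit Linux test harness that hands the result back as the exit status.

```nasm
; Sketch for 32-bit Linux. Build (file name is arbitrary):
;   nasm -f elf32 strcmp.asm -o strcmp.o
;   ld -m elf_i386 strcmp.o -o strcmp

section .data
s1 db 'Hello World', 0
s2 db 'Hello Stack', 0

section .text
global _start

;; Test harness (assumption, not from the question): exit status = strcmp(s1, s2)
_start:
    mov   esi, s1
    mov   edi, s2
    call  strcmp
    mov   ebx, eax              ; exit status (only the low 8 bits are reported)
    mov   eax, 1                ; sys_exit on 32-bit Linux
    int   0x80

;; Hypothetical strlen: ESI = address of string, returns length in EAX
;; Preserves ESI and EDI, but strcmp below does not rely on that
strlen:
    mov   eax, esi              ; EAX walks along the string
.next:
    cmp   byte [eax], 0         ; reached the NUL terminator?
    je    .done
    inc   eax
    jmp   .next
.done:
    sub   eax, esi              ; length = end address - start address
    ret

;; Compare two C-style NUL-terminated strings
;; Inputs   :  ESI = address of s1, EDI = address of s2
;; Outputs  :  EAX < 0, = 0 or > 0 if s1 is less than, equal to or greater than s2
;; Clobbers :  ECX, ESI, EDI, flags
strcmp:
    push  esi                   ; save ESI and EDI
    push  edi
    call  strlen                ; EAX = length of s1
    pop   edi                   ; restore ESI and EDI
    pop   esi

    lea   ecx, [eax+1]          ; count = length + 1, so the NUL byte is compared too
    cld                         ; make cmpsb move forward through memory
    repe  cmpsb                 ; stop after the first mismatch or after ECX bytes

    movzx eax, byte [esi-1]     ; last byte compared from s1, zero-extended
    movzx ecx, byte [edi-1]     ; last byte compared from s2, zero-extended
    sub   eax, ecx              ; signed 32-bit difference
    ret
```

With the two strings from the question, the program should exit with status 4 ('W' - 'S' = 87 - 83), which can be checked with echo $? after running it. Because the count is always at least 1, repe cmpsb performs at least one comparison, so reading esi[-1] and edi[-1] afterwards is safe even for empty strings.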

fuz
  • I really thank you for taking the time to help me, it works perfectly now, again a big thank you ! –  Aug 01 '20 at 12:14