converting C code into assembly

Question

I'm practicing what I have learned in assembly by converting simple C code into assembly code.

This is the code in C:

    int square_me (int val)
    {
        return (val * val);
    }

Here's my code converted into assembly (I declared val and initialized it to 4):

    val dw 4    ; declaration and initialization of the val variable (in main)

    push val    ; push val onto the stack so that you still have a copy of the original value of val incase i'll be needing it in some methods or functions (in main)

    call square_me  ; calling the function square_me, (in main)

    push EIP   ; pushing the value of EIP onto the stack so the code knows where to go back after the function

    push EBP    ; creating the stack frame for the function

    mov EBP, ESP    ; same with the one above

    push val     ; save the value of val so that ill have a copy of the original value of val in case I made some changes to it

    mul val, val   ; multiply to val to itself, and save it to val

    mov eax, val  ; move the value of val to eax

    pop val    ; pop the original value of val from the stack

    mov ESP, EBP   ; to restore the stack frame

    pop EBP    ; same with the one above

    leave     

    ret     ; return to the caller

But when I looked at the answer written in the document, it was very different from mine. Here's how the author converted it into assembly:

Push EBP

mov EBP, ESP

mov EAX, DWORD PTR [EBP + 8]

XOR EDX, EDX

mov EBX, EAX

MUL EBX

MOV ESP, EBP

POP EBP

Ret

Question 1: Did I correctly convert the C code seen above to Assembly?

Question 2: What is this for

mov EAX, DWORD PTR [EBP + 8]

Question 3: Why does he need to do this? EDX wasn't used after that statement, so what's the point?

XOR EDX, EDX

Any idea?

Thanks!

score 3 · Accepted Answer · answered May 20 '13 at 12:03

There are two parts to your code which have just been jammed together in your example. Hopefully that's just a formatting mistake. Otherwise the code makes no sense.

First, the part that calls the function. This would presumably be a snippet from some larger program that does something with the result and eventually exits.

val dw 4
push val
call square_me
push EIP

The problems here are:

push EIP isn't a valid instruction. You can't push the EIP register, and even if you could, why would you want to?
For this kind of calling convention, it's the caller's responsibility to clean up the stack, so you need to follow that call with something like add ESP,4 to remove the val that you pushed onto the stack.
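To see why the add ESP,4 matters, here is a minimal C sketch that tracks only the stack pointer through the call sequence. The function name and the caller_cleans_up flag are made up for illustration, and the balanced push EBP / pop EBP inside the function is left out because it cancels itself.

```c
#include <stdint.h>

/* Hypothetical model: follows ESP through
       push val / call square_me / ret / (optional) add esp, 4
   and returns the final ESP value. */
uint32_t esp_after_call_sequence(uint32_t esp, int caller_cleans_up) {
    esp -= 4;                 /* push val: the argument goes on the stack   */
    esp -= 4;                 /* call square_me: return address is pushed   */
    esp += 4;                 /* ret: return address is popped              */
    if (caller_cleans_up) {
        esp += 4;             /* add esp, 4: the caller removes the argument */
    }
    return esp;
}
```

With the cleanup, ESP ends where it started. Without it, every call leaves 4 more bytes on the stack.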

Then for the function itself:

push EBP
mov EBP, ESP

That bit is good - you're setting up the stack frame. But what is the purpose of this next line?

push val

First you're accessing the global variable val rather than the parameter passed to the function. If you want to access parameters by name, I believe you can set that up as part of the PROC directive in MASM, otherwise you need to use [ebp+8] to get to the first parameter on the stack.

Regardless of how you access it, though, there is no need to save the value on the stack here. It's already in memory - you're not going to lose it.

As for the multiply, this isn't a valid instruction:

mul val, val

You can't multiply two memory references together. mul takes a single explicit operand and always multiplies it by the EAX register, so you need to load your value into EAX first. Something like this:

mov EAX,[EBP+8]
mul EAX

At this point the result will be in EAX (and EDX - but we don't care about the high word), which is what you want for the function to return the answer. But then you are doing this:

mov eax, val

That's just going to overwrite the multiplication result that you just calculated.

pop val

This pop presumably goes with the initial push val, and can likewise be removed.

mov ESP, EBP
pop EBP

Those are fine - that just cleans up the stack frame. But then you don't need this:

leave

The leave instruction is essentially identical to mov esp,ebp / pop ebp. You don't need both of them.

In short, your function should look something more like this:

push EBP
mov EBP, ESP
mov EAX, [EBP + 8]
mul EAX
mov ESP, EBP
pop EBP
ret

As for the answer given in the document, that wasn't great either. There is no reason for setting EDX to zero. As ady has already said, the mul instruction is going to overwrite EDX anyway.

I could understand maybe setting it to zero after the mul instruction, because the result is technically incorrect given that you are using an unsigned multiplication on signed integers. It just doesn't matter because you are going to be discarding the high word (EDX) anyway. In my opinion, it would have been clearer to use imul instead.
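A small C sketch can check that claim: the low 32 bits come out the same whether the multiply is treated as unsigned or signed, and only the high half differs. The helper names are hypothetical, and the 64-bit casts stand in for the EDX:EAX pair that the one-operand mul and imul produce.

```c
#include <stdint.h>

/* Low 32 bits of an unsigned 32x32 multiply (what MUL leaves in EAX). */
uint32_t low_dword_unsigned(uint32_t a, uint32_t b) {
    return (uint32_t)((uint64_t)a * (uint64_t)b);
}

/* Low 32 bits of a signed 32x32 multiply (what IMUL leaves in EAX). */
uint32_t low_dword_signed(int32_t a, int32_t b) {
    return (uint32_t)((int64_t)a * (int64_t)b);
}

/* High 32 bits of the unsigned product (what MUL leaves in EDX). */
uint32_t high_dword_unsigned(uint32_t a, uint32_t b) {
    return (uint32_t)(((uint64_t)a * (uint64_t)b) >> 32);
}

/* High 32 bits of the signed product (what one-operand IMUL leaves in EDX). */
uint32_t high_dword_signed(int32_t a, int32_t b) {
    return (uint32_t)((uint64_t)((int64_t)a * (int64_t)b) >> 32);
}
```

For val = -3 both low dwords are 9, while the high dword is 0xFFFFFFFA for the unsigned multiply and 0 for the signed one. That is why it doesn't matter here as long as EDX is discarded.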

I also don't see why they felt the need to move the parameter into both EBX and EAX when they could have just multiplied EAX with itself. Perhaps there is some performance reason for doing that, but I doubt it.

Regarding the use of [EBP+8] to access the function parameter, you need to understand what your stack looks like inside the function.

When calling the function, you first push val, then the call pushes its return address onto the stack, and then inside the function you push EBP. So at that point, your stack would look like this:

           val             // push val
           return address  // call square_me 
STACK TOP: EBP             // push ebp

Now when you do mov EBP,ESP you will have set EBP to point to the top of the stack. So at [EBP+0] you have the saved copy of EBP, at [EBP+4] you have the return address of the caller, and at [EBP+8] you have the val parameter that was passed in the call.
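The layout above can be replayed with a toy stack in C. This is only a sketch: the ToyStack type, the toy_* helpers, and the sample return address and caller EBP values are made up for illustration and are not anything a real assembler gives you.

```c
#include <stdint.h>

/* Toy model of a 32-bit stack: each slot is 4 bytes, and the stack grows
   toward lower addresses, so a push decrements the stack pointer first. */
enum { STACK_SLOTS = 16 };

typedef struct {
    uint32_t mem[STACK_SLOTS]; /* stack memory, one slot per dword      */
    uint32_t esp;              /* byte offset of the current stack top  */
    uint32_t ebp;              /* byte offset held in the frame pointer */
} ToyStack;

void toy_init(ToyStack *s) {
    for (int i = 0; i < STACK_SLOTS; i++) {
        s->mem[i] = 0;
    }
    s->esp = STACK_SLOTS * 4;  /* empty stack: ESP just past the last slot */
    s->ebp = 0;
}

void toy_push(ToyStack *s, uint32_t value) {
    s->esp -= 4;
    s->mem[s->esp / 4] = value;
}

/* Read the dword at [EBP + offset]. */
uint32_t toy_load(const ToyStack *s, uint32_t offset) {
    return s->mem[(s->ebp + offset) / 4];
}

/* Replays: push val / call square_me / push ebp / mov ebp, esp. */
void toy_enter_square_me(ToyStack *s, uint32_t val,
                         uint32_t return_address, uint32_t caller_ebp) {
    toy_push(s, val);            /* push val                          */
    toy_push(s, return_address); /* call square_me pushes the return  */
    toy_push(s, caller_ebp);     /* push ebp                          */
    s->ebp = s->esp;             /* mov ebp, esp                      */
}
```

After toy_enter_square_me(&s, 4, ret, old_ebp), toy_load(&s, 0) gives old_ebp, toy_load(&s, 4) gives ret, and toy_load(&s, 8) gives 4, matching [EBP+0], [EBP+4] and [EBP+8].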

Wow! This is really helpful, seriously!! Thank you!! yes you're right this is part of the main {val dw 4 (\n) push val (\n) call square_me (\n) push EIP} I shouldn't have placed it there, anyway, I included push EIP so that the ret knows where to go back after executing the function. I read it somewhere. Do you have better things in mind how to do it? — srh snl, May 22 '13 at 02:28
I do have a few questions though, 1) does it mean that for all mathematical operations, we MUST use EAX ALONE? and Let's say I want to convert this mathematical op to assembly, SUM = a + b + c + d; this would involve a lot of pushing value onto the stack, and a lot of [EBP + something] to get what we have pushed? {push a (\n) push b (\n) push c (\n) pop eax (\n) add eax, [ebp + 8] (\n) add eax, [eax+ 16]} — srh snl, May 22 '13 at 02:52
2) I just would like to clarify that EDX automatically receives the same value received by EAX for all mathematical operations even though it is not seen in the code? let's say for example, [Mul EAX, EAX] and EAX = 3, if I use the EDX reg after that line [ADD EAX, EDX], it means, the resulting value is 18? If that's the case, then now I know why the answer in the document uses XOR EDX, EDX — srh snl, May 22 '13 at 03:17
Whenever you execute a `call` instruction, it automatically pushes EIP onto the stack so the `ret` knows where to return to - you don't need to do that yourself. Mathematical operations don't have to use EAX, but you generally can't have both operands being in memory - so you can add a register to a memory operand, or a register to another register, but you can't add two memory operands together. To know what operands can be used with each instruction, you really need a good opcode reference manual. — James Holderness, May 22 '13 at 09:04
The `mul` instruction is even more limited than most others, in that one operand *must* be EAX and the result is always stored in EDX:EAX (strangely, the `imul` instruction doesn't have these limitations). And the result isn't duplicated in EDX - it's a 64 bit result which is split across EDX and EAX. If you only care about the lower 32 bits you can ignore EDX. — James Holderness, May 22 '13 at 09:07
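To make the EDX:EAX split concrete, here is a minimal C model of the one-operand `mul`. The Regs struct and toy_mul function are hypothetical names used only for this sketch.

```c
#include <stdint.h>

/* Hypothetical register pair, just enough to model MUL. */
typedef struct {
    uint32_t eax;
    uint32_t edx;
} Regs;

/* Model of `mul src`: EDX:EAX = EAX * src, treated as unsigned. */
void toy_mul(Regs *r, uint32_t src) {
    uint64_t product = (uint64_t)r->eax * (uint64_t)src;
    r->eax = (uint32_t)product;         /* low 32 bits of the product            */
    r->edx = (uint32_t)(product >> 32); /* high 32 bits, old EDX is overwritten  */
}
```

Starting with EAX = 3, `mul EAX` leaves EAX = 9 and EDX = 0, so a following `add EAX, EDX` gives 9, not 18. EDX only becomes nonzero when the product no longer fits in 32 bits, for example 0x10000 * 0x10000 gives EAX = 0 and EDX = 1.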

score 0 · Answer 2 · edited May 23 '17 at 11:57

XOR EDX, EDX

Why does he need to do this? EDX wasn't used after that statement so what's the point?

It's being cleared to zero.

In a larger program edx will probably be nonzero and may hold valuable data.

The mul will overwrite it anyway (with the high half of the product, which is zero when the numbers are small), but by putting "xor edx,edx" the programmer has sent you a smoke signal that edx is sometimes toast in this routine, so you will need to push or store edx first.

=============

mov EAX, DWORD PTR [EBP + 8]

What's the meaning of MOV EAX, DWORD PTR SS:[EBP+8h] and how can I translate it into AT&T format?

What does EBP+8 in this case in OllyDbg and Assembler mean?

edited May 23 '17 at 11:57 by Community · answered May 20 '13 at 10:05 by ady

Ok. So in C, it's like initializing a string to null to avoid garbage? Could you tell me if I converted that code in Assembly correctly? Thanks – srh snl May 20 '13 at 10:52
I would have a read of this: http://stackoverflow.com/questions/10861478/what-does-ebp8-in-this-case-in-ollydbg-and-assembler-mean. You are using C calling conventions, which I don't work with. – ady May 20 '13 at 10:56
