1

I want to swap two int pointers.

I can do it in C like this

void swap(int* a, int* b)
    int temp = *a;
    *a = *b;
    *b=temp;

Now I'm trying to do it in assembly but, and bear with me, I don't understand why this doesn't work

push %ebp
mov %esp,%ebp   

mov 8(%ebp), %ecx       #a 
mov 12(%ebp), %ebx      #b

mov (%ecx), %eax        #move o a para temp
mov (%ebx), %ecx        #move o b para o a
mov (%eax), %ebx        #move o temp para o b

mov %ebp, %esp
pop %ebp
ret

Can someone explain what I'm doing wrong?

Michael Petch
  • 46,082
  • 8
  • 107
  • 198
codetective
  • 86
  • 2
  • 9
  • 1
    Because you'd need to write `mov (%ebx), (%ecx)` to do `*b = *a;`, and x86 doesn't allow both operands to be indirect (except in `movs`). – EOF Apr 22 '17 at 18:32
  • 2
    The C code does not "swap two int pointers". It is swapping the data they point to. – Weather Vane Apr 22 '17 at 18:35
  • 1
    I'm surprised nobody's mentioned `xchg` yet. – Daniel Kamil Kozar Apr 22 '17 at 19:14
  • 2
    @DanielKamilKozar `xchg` doesn't help much for this, and it's much slower than a load and a store. – EOF Apr 22 '17 at 19:21
  • 2
    @DanielKamilKozar : You wouldn't want to use `xchg` with memory operands here since that would apply a _LOCK_ to the instruction. To avoid that you'd have to move the memory operands into registers and _xchg_ the registers but by the time you do that you may as well have used `mov`. – Michael Petch Apr 22 '17 at 19:43
  • @MichaelPetch : Right. I forgot about the implicit LOCK that `xchg` has. Thanks. – Daniel Kamil Kozar Apr 22 '17 at 19:44
  • Remember that if you can write the C code, then you can always run it through a C compiler and disassemble it to see what the assembly code should look like. [In this case, for example.](https://godbolt.org/g/7WQTfr) – Cody Gray - on strike Apr 23 '17 at 10:05

2 Answers2

2

Your C code is attempting to swap the values pointed to by pointers but your assembly code seems to be treating values as pointers leading to segmentation faults. One way to handle this is to use extra register(s) to hold the pointers and the values they point to.

The 32-bit CDECL allows us to use EAX, ECX, and EDX without worrying about saving their values. We'll need a 4th register and we'll have to preserve it ourselves.

I will also assume you want the frame pointer present (the routine could be written without it):

push   %ebp
mov    %esp,%ebp
push   %ebx              # Preserve non-volatile EBX register
mov    8(%ebp),%eax      # EAX = Pointer to a (arg1)
mov    12(%ebp),%edx     # EDX = Pointer to b (arg2)
mov    (%eax),%ecx       # Temporarily save value at memory address a (*a) to ECX
mov    (%edx),%ebx       # Temporarily save value at memory address b (*b) to EBX
mov    %ebx,(%eax)       # Move value of b (EBX) to memory address of a (*a)
mov    %ecx,(%edx)       # Move value of a (ECX) to memory address of b (*b)
pop    %ebx              # Restore non-volatile EBX register
pop    %ebp
ret

In theory you could remove the stack frame altogether (may make stack traces in the debugger harder to follow). The code could have used ESP to access the arguments rather than EBP:

push   %ebx              # Preserve non-volatile EBX register
mov    8(%esp),%eax      # EAX = Pointer to a
mov    12(%esp),%edx     # EDX = Pointer to b
mov    (%eax),%ecx       # Temporarily save value at memory address a (*a) to ECX
mov    (%edx),%ebx       # Temporarily save value at memory address b (*b) to EBX
mov    %ebx,(%eax)       # Move value of b (EBX) to memory address of a (*a)
mov    %ecx,(%edx)       # Move value of a (ECX) to memory address of b (*b)
pop    %ebx              # Restore non-volatile EBX register
ret

Note: Failing to preserve (per 32-bit CDECL calling convention) non-volatile registers like EBX, EBP, ESI, and EDI is a bug. In simple test code it may appear to work, but in more complex code with optimizations on you may experience undefined behaviour if the calling convention isn't strictly adhered to.

Michael Petch
  • 46,082
  • 8
  • 107
  • 198
  • How is pushing/popping a register functionally different from using a stack temporary? – EOF Apr 22 '17 at 19:02
  • @EOF: Actually that comment applied to the first incarnation of my code where I didn't preserve _EBX_. Thanks for noticing. That comment doesn't apply to the code I posted so I have removed it. – Michael Petch Apr 22 '17 at 19:04
  • \*Sigh\*, here I was hoping to digress to the effects of load-store forwarding latency and OoO-execution and not needing the popped value right away, possibly to better ABIs with redzones... But omitting the framepointer is probably better anyway. – EOF Apr 22 '17 at 19:10
  • Redzone doesn't apply to 32-bit Linux code (do I didn't bother for this answer), but it would apply to 64-bit code. And yes in that case you could simply use the memory address below _RSP_ without adjusting _RSP_. 64-bit Linux code is easier because the pointers would already be in registers (_RDI_and _RSI_) so you'd only need a couple of the extra volatile registers to hold the values. Wouldn't need temporary stack space at all. – Michael Petch Apr 22 '17 at 19:23
  • Thanks! Great explanation – codetective Apr 22 '17 at 21:32
1

As Weather Vane said, the C code you have shown is not swapping two int pointers. It is swapping two ints pointed by two int pointers.

But in your assembly you appear to be trying to swap two ints pointed by two int pointers, so your code is not an entirely lost cause.

Study this to understand what it does, and then give it a try:

mov (%ecx), %eax
mov (%ebx), %edx
mov %edx, (%ecx)
mov %eax, (%ebx)
Mike Nakis
  • 56,297
  • 11
  • 110
  • 142
  • 1
    You might want to have a second look at your answer. – EOF Apr 22 '17 at 18:42
  • @EOF Whoops! C-:= – Mike Nakis Apr 22 '17 at 18:43
  • @EOF damn this nonsensical syntax. – Mike Nakis Apr 22 '17 at 18:43
  • Don't tell me you prefer the shouty `MOV DWORD PTR ...`? – EOF Apr 22 '17 at 18:44
  • @EOF no, I prefer the `mov eax, [ecx]` with the assembler figuring out that I mean `dword ptr` since I am assigning to `eax`. – Mike Nakis Apr 22 '17 at 18:45
  • Eh, to each their own I guess. I like AT&T syntax better. – EOF Apr 22 '17 at 18:46
  • @EOF sure, to each their own. But all these syntaxes suffer from issues. I would have preferred a verb other than 'mov' because it is really copying, not moving. – Mike Nakis Apr 22 '17 at 18:47
  • @MikeNakis Well, I prefer the assembly languages that are explicit about the type of memory access they do. `mov` with memory operands *should* be `load/store` or `read/write` or something like that. Not coincidentally, that would have prevented this entire comment chain... – EOF Apr 22 '17 at 18:49
  • Thank you, I understand now. I'll change the C method to actually swapping the pointers then! – codetective Apr 22 '17 at 18:59
  • @codetective : You want to swap the values pointed to by the pointers, not the pointers themselves. – Michael Petch Apr 22 '17 at 20:52
  • It truly boggles my mind how, when faced with [these two options for syntax](https://godbolt.org/g/4GIaqG), someone could prefer the top format. That just looks like an inscrutable mess of symbols to me, even though I know what it means. The x86 is a pseudo-RISC architecture internally, but the syntax programmers write in is never going to be. I don't believe that having separate load/store and read/write instructions would actually decrease confusion. People just wouldn't know about them, the same way as they don't know how to notate the difference in the existing syntax. – Cody Gray - on strike Apr 23 '17 at 10:09