Why does GCC generate a "mov 0x8,%edx" instruction that causes a crash?

Question

I have a function which is declared like this:

void
tesysLog(W16 uid, char *file, int line, int level,
         W16 state, W16 event, int error, char *format, ...)

Another function calls tesysLog, for example:

tesysLog(253, __FILE__, __LINE__, 3, 0, 0, result,
        "error(code = %d) is %d instead of %d\n",
        avp->header.code, decoded, size);

The relevant assembly code for the call above is:

0x00000000009027e5 <+117>: xor %r9d,%r9d      <---- clears r9d, i.e. arg 6 (event) = 0
0x00000000009027e8 <+120>: mov 0x8,%edx       <---- absolute address, but 0x8 is in a reserved segment; crash here

0x00000000009027ef <+127>: xor %r8d,%r8d
0x00000000009027f2 <+130>: mov $0x3,%ecx
0x00000000009027f7 <+135>: mov $0x995600,%esi
0x00000000009027fc <+140>: mov $0xfd,%edi   
0x0000000000902801 <+145>: mov %ebp,0x20(%rsp)
0x0000000000902805 <+149>: mov %eax,0x18(%rsp)
0x0000000000902809 <+153>: xor %eax,%eax
0x000000000090280b <+155>: movq $0x995770,0x8(%rsp)
0x0000000000902814 <+164>: mov %edx,0x10(%rsp)
0x0000000000902818 <+168>: mov $0x8a,%edx
0x000000000090281d <+173>: movl $0xc8e4,(%rsp)
0x0000000000902824 <+180>: callq 0x9136e0 <tesysLog>

I get signal 11 (segmentation fault) at the second line of the assembly, mov 0x8,%edx. It looks like this line prepares arg 3 (int line) for the tesysLog call. But because an absolute address is used, and 0x8 lies in a reserved segment of the process's address space, a segmentation fault is raised.

This code runs on SLES and is compiled with gcc.

I am wondering why an absolute address is being used. Is it a gcc bug, or are there compiler options that affect this?

Ah yes, I see the `callq` instruction calling the `tesysLog` function later in your assembly output. And you're right about moving from the absolute address. Can you please tell us how you build your code, what flags you have given to GCC (both for compilation and linking) and best of all try to create a [Minimal, Complete, and Verifiable Example](http://stackoverflow.com/help/mcve) to show us. — Some programmer dude, Jun 12 '17 at 10:15
I will get that information from our customer. In the meantime, are there compiler options that affect this behaviour, like -fPIC? — Jeff.Lu, Jun 12 '17 at 10:20
`mov 0x8,%edx` will load the constant 8 into edx; this cannot crash at all. Can you run objdump to show the instruction used, just to make sure? — Tommylee2k, Jun 12 '17 at 12:17
@tommy That's what I first thought, but that's not right, as Jeff pointed out. It's not an immediate/constant 8 because it's missing the leading `$`. As confirmation, look at the instruction offsets, which will give you their sizes. What he has is a 7-byte instruction (+120 to +127), which is consistent with `mov edx, [0x8]` in Intel syntax. `mov edx, 8` would be shorter. — Cody Gray - on strike, Jun 12 '17 at 12:28
@CodyGray when I assemble "mov $08,%edx" with as or gcc, the opcode bytes in the object file are `0xba,0x08,0x00,0x00,0x00`, which is "0xb8+r" (with r=2 for edx) for "mov register, imm32" ... there's no memory address 8 involved. Please do an objdump -D on that code; if it's 0xba, it's a load immediate, not a load from memory. Even GDB will set register edx to 8 when executing it ;) — Tommylee2k, Jun 12 '17 at 12:35
That's because you are assembling it with the leading `$`. That is missing from the instruction in Jeff's code. Yes, `mov $0x8, %edx` is a 5-byte instruction that loads an immediate value into `edx` (`ba 08 00 00 00`). Jeff has the 7-byte memory load: `mov 0x8, %edx` (`8b 14 25 08 00 00 00`), more familiar to us as `mov edx, ds:[0x8]`. — Cody Gray - on strike, Jun 12 '17 at 12:39
@CodyGray loading memory 0x08's content into edx would be `0x8b 0x15 0x08 0x00 0x00 0x00` — Tommylee2k, Jun 12 '17 at 12:39
@Tommylee2k Cody Gray uses the absolute addressing mode, you use the ip-relative addressing mode. The former instruction is `mov 8, %edx` while the latter is `mov 8(%rip), %edx` in AT&T syntax and, as we can see from the disassembly, it's clearly the former. — fuz, Jun 12 '17 at 12:50
Can you please show us your entire program or a minimal example derived from it? I can't see how the snippet you posted could cause this error, there has to be something I am missing. — fuz, Jun 12 '17 at 12:51
No, this is the standard System V AMD64 calling convention, popular on Unix-based systems. The first 6 integer parameters are passed in RDI, RSI, RDX, RCX, R8, and R9. — Cody Gray - on strike, Jun 12 '17 at 13:09
It's not "__LINE__" that gcc thinks is at 0x08; it's the third-to-last parameter pushed on the stack, so most probably "header.code" (shortly before the call, edx is stored to 0x10(%rsp) and then loaded with 0x8a, i.e. line 138). If you showed us the code, that would most probably reveal that you're accessing this one wrong. (Thanks @CodyGray for clearing up the question about the calling convention.) — Tommylee2k, Jun 12 '17 at 13:15
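
The encoding argument in the comments above can be checked mechanically. The following is a minimal sketch, not a disassembler: `classify_mov` and the `mov_kind` enum are hypothetical names introduced here, and only the three byte sequences quoted in the comments are recognised.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only: classify the source operand of
 * a 32-bit `mov` into a register from the first bytes of its x86-64
 * encoding. Only the three forms discussed in the comments are handled. */
enum mov_kind { MOV_IMM32, MOV_ABS32, MOV_RIP_REL, MOV_OTHER };

enum mov_kind classify_mov(const uint8_t *code, size_t len)
{
    /* b8+r imm32: load an immediate, e.g. ba = edx (AT&T: mov $0x8,%edx) */
    if (len >= 5 && (code[0] & 0xf8) == 0xb8)
        return MOV_IMM32;

    /* 8b /r: load r32 from r/m32 */
    if (len >= 2 && code[0] == 0x8b) {
        uint8_t mod = code[1] >> 6;
        uint8_t rm  = code[1] & 7;

        /* mod=00, rm=101: disp32(%rip) in 64-bit mode */
        if (mod == 0 && rm == 5)
            return MOV_RIP_REL;

        /* mod=00, rm=100 with SIB 0x25: no base, no index, plain disp32 */
        if (mod == 0 && rm == 4 && len >= 3 && code[2] == 0x25)
            return MOV_ABS32;
    }
    return MOV_OTHER;
}
```

Fed the bytes `8b 14 25 08 00 00 00`, this reports an absolute 32-bit address, which matches the 7-byte gap between `<+120>` and `<+127>` in the question's disassembly.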

Tommylee2k · Accepted Answer · 2017-06-12T14:29:07.297

void
tesysLog(W16 uid, char *file, int line, int level,
         W16 state, W16 event, int error, char *format, ...)

tesysLog(253, __FILE__, __LINE__, 3,
        0, 0, result,     "error(code = %d) is %d instead of %d\n",
        header.code, decoded, size);

The parameters are passed as follows:

rdi : W16 uid
rsi : char *file
rdx : int line
rcx : int level
r8 : W16 state
r9 : W16 event
stack: int error, char *format, ...
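
As a cross-check against the System V AMD64 order quoted in the comments (RDI, RSI, RDX, RCX, R8, R9), here is a small sketch that maps an argument position to where it is passed. `arg_location` is a hypothetical helper introduced for illustration, and it only covers integer and pointer arguments, which is all tesysLog's fixed parameters use.

```c
#include <stddef.h>

/* System V AMD64: the first six integer or pointer arguments are passed in
 * these registers, in this order. Later ones go on the stack. */
static const char *const int_arg_regs[] = {
    "rdi", "rsi", "rdx", "rcx", "r8", "r9"
};

/* Hypothetical helper: where is the index-th (0-based) integer or pointer
 * argument passed? Floating-point and aggregate arguments are not modelled. */
const char *arg_location(size_t index)
{
    size_t nregs = sizeof int_arg_regs / sizeof int_arg_regs[0];
    return index < nregs ? int_arg_regs[index] : "stack";
}
```

For tesysLog this gives `rdx` for `line` (index 2), `rcx` for `level` (index 3), and `stack` for `error`, `format` and the variadic arguments (index 6 and up). That agrees with the disassembly, where `ecx` receives `$0x3` and `edx` receives `$0x8a` just before the call.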

You're right, edx is the 3rd parameter, but:

you have to check what edx holds right before the call. It's not the value loaded from 0x8; it's $0x8a (line 138). So "__LINE__" is not the one causing trouble; it's the value that is stored to 0x10(%rsp), which is "header.code".

Edit: the relevant instructions, annotated:

0x00000000009027e8 <+120>: mov 0x8,%edx         ; here EDX is just a tmp variable
...
0x0000000000902814 <+164>: mov %edx,0x10(%rsp)  ; for THIS value!
0x0000000000902818 <+168>: mov $0x8a,%edx       ; <-- THIS is edx on call

If you showed us the code, we could find the problem ... I'm 99.9% sure that you're using "header.code" wrong ;-)

edited Jun 12 '17 at 14:29

answered Jun 12 '17 at 14:14


Thank you very much. I will post the code when I am in the office tomorrow. BTW, by the "line 138" mentioned in your answer, do you actually mean line 168? There isn't a line 138 in the snippet. – Jeff.Lu Jun 12 '17 at 14:21
And one more thing I don't understand: since edx will be overwritten by the immediate value $0x8a, which is the real line number in my code, why does gcc want to access the memory at location 0x8? – Jeff.Lu Jun 12 '17 at 14:29
I think I get your point now. And yes, you are right: "mov 0x8, %edx" is loading header.code. Our code actually refers to it as "avp->header.code" (I have updated this in my question). Before tesysLog is called, the pointer avp is set to NULL in another function, and the member header has an offset of 8 bytes within avp. That's why gcc generates "mov 0x8, %edx". Thanks for your good answer, which helped me understand what the problem is. Big thanks again. – Jeff.Lu Jun 13 '17 at 02:21
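
The arithmetic behind that conclusion can be sketched without crashing anything. The real AVP type is not shown in the question, so `struct avp`, `struct avp_header` and `code_address` below are hypothetical; the only facts taken from the thread are that `header` sits 8 bytes into the structure and that the faulting address is 0x8, which suggests `code` is the first member of `header`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; the real definition is not shown in the question. */
struct avp_header {
    uint32_t code;             /* assumed to be the first member of header */
};

struct avp {
    uint64_t placeholder;      /* assumed 8 bytes in front of header */
    struct avp_header header;  /* lands at offset 8, as stated in the thread */
};

/* The address computed for p->header.code is p plus the member offsets.
 * Integer arithmetic is used here so that nothing is dereferenced. */
uintptr_t code_address(uintptr_t base)
{
    return base + offsetof(struct avp, header)
                + offsetof(struct avp_header, code);
}
```

With a NULL base, `code_address(0)` is 8. If the compiler can prove that `avp` is NULL at the call site (presumably after inlining the function that clears it, though the thread does not confirm this), it can fold `avp->header.code` into a load from the constant address 8, which is the `mov 0x8,%edx` in the disassembly.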
