
My knowledge of the Intel instruction set is a bit rusty. Can you tell me why I might be getting a segmentation fault in the optimized version of my function? (Bonus points if you can tell me why I don't get it in the -O0 build of the code.)

It's C code compiled by GCC 4.1.2.

Here is the result of GDB's "disas" command at the crash:

   0x00000000004263e5 <+0>:     sub    $0x8,%rsp
   0x00000000004263e9 <+4>:     movsd  %xmm2,(%rsp)
   0x00000000004263ee <+9>:     divsd  %xmm1,%xmm0
   0x00000000004263f2 <+13>:    callq  0x60f098 <log@plt>
=> 0x00000000004263f7 <+18>:    andpd  0x169529(%rip),%xmm0        
   0x00000000004263ff <+26>:    movsd  (%rsp),%xmm1
   0x0000000000426404 <+31>:    ucomisd %xmm0,%xmm1
   0x0000000000426408 <+35>:    seta   %al
   0x000000000042640b <+38>:    movzbl %al,%eax
   0x000000000042640e <+41>:    add    $0x8,%rsp
   0x0000000000426412 <+45>:    retq   

And here's the original source of the function:

char is_within_range(double a, double b, double range) {
  double ratio = a / b;
  double logRatio = fabs(log(ratio));
  return logRatio < range;
}

For reference here's the non-optimized version of the code:

   0x00000000004263e5 <+0>: push   %rbp
   0x00000000004263e6 <+1>: mov    %rsp,%rbp
   0x00000000004263e9 <+4>: sub    $0x30,%rsp
   0x00000000004263ed <+8>: movsd  %xmm0,-0x18(%rbp)
   0x00000000004263f2 <+13>:    movsd  %xmm1,-0x20(%rbp)
   0x00000000004263f7 <+18>:    movsd  %xmm2,-0x28(%rbp)
   0x00000000004263fc <+23>:    movsd  -0x18(%rbp),%xmm0
   0x0000000000426401 <+28>:    divsd  -0x20(%rbp),%xmm0
   0x0000000000426406 <+33>:    movsd  %xmm0,-0x10(%rbp)
   0x000000000042640b <+38>:    mov    -0x10(%rbp),%rax
   0x000000000042640f <+42>:    mov    %rax,-0x30(%rbp)
   0x0000000000426413 <+46>:    movsd  -0x30(%rbp),%xmm0
   0x0000000000426418 <+51>:    callq  0x610608 <log@plt>
   0x000000000042641d <+56>:    movapd %xmm0,%xmm1
   0x0000000000426421 <+60>:    movsd  0x16b6b7(%rip),%xmm0
   0x0000000000426429 <+68>:    andpd  %xmm1,%xmm0
   0x000000000042642d <+72>:    movsd  %xmm0,-0x8(%rbp)
   0x0000000000426432 <+77>:    movsd  -0x8(%rbp),%xmm1
   0x0000000000426437 <+82>:    movsd  -0x28(%rbp),%xmm0
   0x000000000042643c <+87>:    ucomisd %xmm1,%xmm0
   0x0000000000426440 <+91>:    seta   %al
   0x0000000000426443 <+94>:    movzbl %al,%eax
   0x0000000000426446 <+97>:    leaveq 
   0x0000000000426447 <+98>:    retq   
laslowh

3 Answers

=> 0x00000000004263f7 <+18>:    andpd  0x169529(%rip),%xmm0        
   0x00000000004263ff <+26>:    movsd  (%rsp),%xmm1

When the andpd instruction takes a memory operand, it's required to be aligned to a 16-byte boundary.

For %rip-relative addressing, the offset is applied to the address of the following instruction. So, here, the memory operand is at 0x4263ff + 0x169529 = 0x58f928, which is not 16-byte aligned. Hence the segfault.

The compiler is directly generating code for fabs(), using an AND with an appropriate bit mask to clear the sign bit; the bit mask constant value should have been placed at an appropriate offset in a sufficiently aligned data section, but hasn't been. This could be a bug in that (old) version of GCC, or could conceivably be a linker-related issue somewhere else.

Matthew Slattery
  • A follow-up: your answer was spot on. It turned out to be a linker bug in a non-standard linker that we're using. Thanks for the answer. – laslowh Feb 24 '12 at 17:27

It seems to crash after the call to the log function:

callq  0x60f098 <log@plt>

So the problem may lie in the fabs implementation that the optimized build uses, which differs from the -O0 code.

Have you tried:

double logRatio = log(ratio);
logRatio = fabs(logRatio);

This may generate different assembly output, and you may get additional information about the crash.

As an alternative, you may replace the fabs call with something like:

double logRatio = log(ratio);
logRatio = (logRatio < 0) ? -logRatio : logRatio;

You may have precision issues with that, but that's not the point here...

Macmade

I'm also using gcc (GCC) 4.1.2 20070115 (SUSE Linux), here's the generated assembly:

Dump of assembler code for function is_within_range:
0x0000000000400580 <is_within_range+0>: divsd  %xmm1,%xmm0
0x0000000000400584 <is_within_range+4>: sub    $0x8,%rsp
0x0000000000400588 <is_within_range+8>: movsd  %xmm2,(%rsp)
0x000000000040058d <is_within_range+13>:        callq  0x400498 <log@plt>
0x0000000000400592 <is_within_range+18>:        andpd  358(%rip),%xmm0        # 0x400700
0x000000000040059a <is_within_range+26>:        xor    %eax,%eax
0x000000000040059c <is_within_range+28>:        movsd  (%rsp),%xmm1
0x00000000004005a1 <is_within_range+33>:        ucomisd %xmm0,%xmm1
0x00000000004005a5 <is_within_range+37>:        seta   %al
0x00000000004005a8 <is_within_range+40>:        add    $0x8,%rsp
0x00000000004005ac <is_within_range+44>:        retq

It appears to be almost the same, but I do not get a crash. I think you'll need to provide us with your compiler flags, and details of your processor and GLIBC version, and the values of a, b, and range that crash for you, as the issue is almost definitely with the log call.

Matt Joiner