
Going through chapter 3 of the book Computer Systems: A Programmer's Perspective, I read that an implementation like

testl %eax, %eax
cmovne (%eax), %edx

is invalid because if the prediction fails, then we'll have NULL dereferencing. It is also stated that we should use branching code.

Still, wouldn't using conditional jumps lead to the same result? For example:

.L1:
jmp *%eax

testl %eax, %eax
jne .L1

Is it possible to trick gcc into emitting something like that for x86-32? Suppose I have an array of function pointers, some of which are valid and some of which are NULL, and I call each one that is not NULL.

Alex C
  • That sounds unlikely. The CPU shouldn't do anything that would make it appear it's not executing exactly what you told it to (assuming a single-threaded environment). This would be a major breaking change to the developer-CPU contract. Have you tried executing that case? I'd expect it to ignore the exceptional case if the branch prediction turns out to be incorrect. – Luaan Jul 07 '14 at 13:20
  • 1
    I would not close it. This sounds like a valid question to me. Not sure where it is too broad. – TomTom Jul 07 '14 at 13:20
  • 1
    I'd say the CPU architecture needs to handle that situation or there wouldn't be any working programs left. Still an interesting question. – Mark Ransom Jul 07 '14 at 13:20
  • 4
    The manual says: _exceptions and interrupts are not signaled until actual "in-order" execution of the instructions_. See also [this question](http://stackoverflow.com/questions/20978884/can-branch-prediction-cause-illegal-instruction). – Jester Jul 07 '14 at 13:26
  • See also: http://stackoverflow.com/questions/17310851/why-the-conditional-move-doesnt-work-properly – Pascal Cuoq Jul 07 '14 at 13:34
  • @Luaan I tried but didn't manage to make it crash using conditional jumps. I've also corrected the code since jne is stated only to work with labels – Alex C Jul 07 '14 at 13:54
  • I don't quite understand the quote from the accepted answer [link](http://stackoverflow.com/questions/20978884/can-branch-prediction-cause-illegal-instruction), that is, "decoding and speculatively attempting to execute an invalid opcode does not generate this exception." Does decoding and speculatively executing include "grabbing 4 bytes from virtual address 0"? – Alex C Jul 07 '14 at 13:58
  • 2
    @AlexC: I think the linked question answers your exact concern - you will not get a #PF exception from speculative execution, because you will not get 4 bytes from virtual address zero, because `eax` is not zero. No, branch prediction (speculative execution) will not crash your program(s). – Michael Foukarakis Jul 07 '14 at 14:18
  • 1
    I seem to recall that there was a case (which the Linux kernel encountered) where gcc would optimize out null pointer checks if the pointer was previously dereferenced. I think this was gcc exploiting C undefined behavior for null pointer dereferencing, resulting in something less bad than nasal demons but still unpleasant. –  Jul 07 '14 at 23:15
  • [This blog post](http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html) described the compiler optimization issue. –  Jul 07 '14 at 23:21

1 Answer


No. You should not be able to detect out-of-order operand fetch of a jmp instruction if it is part of a speculative execution that proves invalid due to a test and jump.

The cmovcc instructions (cmovne in your example) are documented to cause a fault if the memory source operand would cause a fault, even if the condition is not met. In other words, this is not speculative execution. It's part of the instruction semantics. It's the move to the destination that's conditional, not the fetch.

The jmp instruction is not so-documented.
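To make the cmov point concrete at the C level, here is a minimal sketch. The function names are hypothetical and only for illustration. In the first function the load `*p` may only happen when `p` is non-NULL, so a compiler cannot simply translate it into a cmov with `(%eax)` as the source operand. In the second, both arms are plain values, so a register-to-register cmov would be a legal translation. Whether gcc actually emits a cmov depends on the version, target and optimization flags.

```c
#include <stddef.h>

/* Hypothetical helper: returns *p when p is non-NULL, otherwise fallback.
 * The load *p must not be performed when p is NULL, so it has to be
 * guarded (typically by a test and a conditional jump). */
int read_or_default(const int *p, int fallback)
{
    return p ? *p : fallback;
}

/* Hypothetical helper: both arms are already-computed values, so no
 * memory access depends on the condition. A conditional move between
 * registers is a valid way to compile this. */
int select_value(int cond, int a, int b)
{
    return cond ? a : b;
}
```

Calling `read_or_default(NULL, 3)` must return 3 without touching address 0, which is exactly the guarantee a `cmovne (%eax), %edx` translation would break.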

I'm not getting the point of your example code, because there is no condition on the memory operation *%eax. If %eax contains zero, the fetch in the unconditional execution of jmp *%eax will certainly cause a fault. This is correct behavior. Compare that with code that tests %eax and jumps around the bad reference:

testl %eax, %eax
je .L1
jmp *%eax
.L1:

There can't be a problem. Speculative execution of jmp *%eax cannot cause a fault unless the speculation turns out to be correct, i.e. the jump is on the true control path. This is similar to the behavior for bad opcodes, division by zero and the like: normal program semantics are not affected by speculative execution.
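For the case described in the question (an array of function pointers where only the non-NULL entries are called), a plain C loop is enough. This is a sketch under assumptions: `handler_fn`, `call_valid`, `bump` and `hits` are hypothetical names, not anything from the book or from gcc. A compiler would typically turn the `if` into a test and a conditional jump around an indirect call, which is the safe pattern shown above.

```c
#include <stddef.h>

typedef void (*handler_fn)(void);

/* Hypothetical demo state and handler, used only to observe the calls. */
int hits = 0;
void bump(void) { hits++; }

/* Calls every non-NULL entry of table and returns how many were called.
 * The NULL check guards the indirect call, so a NULL entry is never
 * called architecturally, whatever the branch predictor guesses. */
size_t call_valid(const handler_fn *table, size_t n)
{
    size_t called = 0;
    for (size_t i = 0; i < n; i++) {
        if (table[i] != NULL) {
            table[i]();
            called++;
        }
    }
    return called;
}
```

With a table such as `{bump, NULL, bump}`, `call_valid` skips the middle entry and invokes `bump` twice.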

Where out-of-order fetches and stores really do cause all kinds of interesting problems is in multi-processing. This article and also its first part in the preceding issue are great discussions of this topic.

Gene
  • 1
    One does not need out-of-order execution to get memory consistency issues (for an architecture not guaranteeing sequential consistency). The delay of communication among caches can be enough to allow reads and writes to have inconsistent order. In fact, out-of-order execution was proposed as a means to provide good performance under sequential consistency ("Is SC + ILP = RC?", Chris Gniady et al., 1999, [PDF](http://www.cs.arizona.edu/~gniady/papers/isca99_scrc.pdf)). –  Jul 07 '14 at 22:59
  • @PaulA.Clayton Of course you are right. I was thinking from the other end - that it adds an element of complexity, e.g. in getting correctness in wait-free algorithms. – Gene Jul 07 '14 at 23:41
  • @Gene: I think you've got the point, I've added the unconditional one because conditional jumps can only be used with labels. I was thinking of a case where %eax was being computed and could be either 0 or an address. Maybe we can improve the question to be more clear? – Alex C Jul 08 '14 at 07:40