This works just fine:

#include <stdio.h>

int main(){
    volatile int abort_counter = 0;
    volatile int i = 0;
    while (i < 100000000) {
        __asm__ ("xbegin ABORT");
        i++;
        __asm__ ("xend");
        __asm__ ("ABORT:");
        ++abort_counter;
    }

    printf("%d\n", i);
    printf("nof abort-retries: %d\n",abort_counter-i);
    return 0;
}

However, what I originally wrote was

#include <stdio.h>

int main(){
    volatile int abort_counter = 0;
    volatile int i = 0;
    while (i < 100000000) {
        __asm__ ("xbegin ABORT");
        i++;
        __asm__ ("xend");
        continue;
        __asm__ ("ABORT:");
        ++abort_counter;
    }

    printf("%d\n", i);
    printf("nof abort-retries: %d\n",abort_counter);
    return 0;
}

but this led to

/tmp/cchmn6a6.o: In function `main':
rtm_simple.c:(.text+0x1a): undefined reference to `ABORT'
collect2: error: ld returned 1 exit status

Why?

(Compiled with gcc rtm_simple.c -o rtm_simple.)

librik
User1291
  • 4
    It's likely that the compiler optimizes away everything after `continue`, since it cannot be reached (and it also doesn't analyze assembler code, so it doesn't know what you're doing). – vgru Nov 07 '17 at 09:23
  • 3
    This is just a guess, but I would think the compiler sees the lines after continue as dead code (never going to be executed), so it removes them at compile time. The C compiler doesn't know what "xbegin ABORT" means. – Brian Walker Nov 07 '17 at 09:24
  • 1
    Does using `__asm__ __volatile__` fix it? –  Nov 07 '17 at 09:32
  • 1
    did you check the output of `gcc -S rtm_simple.c` ? – alinsoar Nov 07 '17 at 09:34
  • @FelixPalmen I'm afraid not. – User1291 Nov 07 '17 at 09:42
  • @Groo ,BrianWalker, alinsoar this seems to be the case. Thank you. Can I suppress dead code elimination, somehow? – User1291 Nov 07 '17 at 09:43
  • 2
    You might be able to trick it: `continue; reachable: __asm__("ABORT:"); ++abort_counter; } ... if (abort_counter < 0) goto reachable;` – melpomene Nov 07 '17 at 09:54
  • @melpomene that worked, thank you. Want to make an answer out of it? – User1291 Nov 07 '17 at 09:58
  • On an unrelated note, are you aware that the compiler could generate the code in this order: `__asm__ ("xbegin ABORT");` `__asm__ ("xend");` `i++;`? Basic __asm__ statements, although implicitly volatile, can be reordered relative to other C statements. – Michael Petch Nov 07 '17 at 14:05
  • @FelixPalmen : basic asm statements are implicitly **volatile** since there are no output operands. – Michael Petch Nov 07 '17 at 14:12
  • @MichaelPetch I was not aware of that, thank you. What's the proper way to prevent this? I'm tempted to say ``mfence``, but I guess that would be another asm clause that the compiler could reorder. – User1291 Nov 08 '17 at 07:40
  • @User1291 : My answer covers some of that briefly. The answer is in the GCC documentation - create _artificial dependencies_ to maintain order. In order to do that you need to understand extended assembly and constraints. In your case you'd pass an artificial dependency into each asm block that makes the compiler think that they rely on the value of `i`. The last code example in my answer suggests one way to do it. – Michael Petch Nov 08 '17 at 07:54
  • Extended assembly templates are powerful but difficult to master and get correct. Very easy to get them wrong. – Michael Petch Nov 08 '17 at 07:58
  • @MichaelPetch must have missed that part of your answer. Thank you. – User1291 Nov 08 '17 at 08:02

2 Answers

7

You might be able to trick it:

        continue;
        reachable:
        __asm__ ("ABORT:");
        ++abort_counter;
    }

    printf("%d\n", i);
    printf("nof abort-retries: %d\n",abort_counter);
    if (abort_counter < 0) goto reachable;
    return 0;
}

The goto with the label tells gcc that the code is reachable, and abort_counter being volatile should prevent gcc from being able to optimize the goto away.

melpomene
  • 1
    This is not guaranteed to be safe, because the compiler doesn't know that it's reachable *from `__asm__ ("xbegin ABORT");`*. It's probably sort of safe because `abort_counter` is volatile, but it might have other stuff in registers that it was expecting to store after the `asm` statement. So it might work for this toy example, but could easily break with real surrounding code. An `asm` statement that jumps without using `asm goto` is *not* safe in general. – Peter Cordes Nov 24 '17 at 08:45
7

The reason you get the error in this code:

    __asm__ ("xbegin ABORT");
    i++;
    __asm__ ("xend");
    continue;
    __asm__ ("ABORT:");
    ++abort_counter;

is that the compiler treats everything after the continue statement, up to the end of the block (the body of the while loop), as dead code. GCC does not analyze what an asm block does, so it cannot know that __asm__ ("xbegin ABORT"); refers to the label ABORT. When the dead code was eliminated, the jump target went with it, and the linker could no longer resolve the symbol (hence the undefined reference).


As an alternative to the other answer: as of GCC 4.5 you can use extended assembly with the asm goto statement (Clang did not support it at the time of writing; support was added in Clang 9):

Goto Labels

asm goto allows assembly code to jump to one or more C labels. The GotoLabels section in an asm goto statement contains a comma-separated list of all C labels to which the assembler code may jump. GCC assumes that asm execution falls through to the next statement (if this is not the case, consider using the __builtin_unreachable intrinsic after the asm statement). Optimization of asm goto may be improved by using the hot and cold label attributes (see Label Attributes).

The code could have been written like this:

while (i < 100000000) {
    __asm__ goto("xbegin %l0"
                 : /* no outputs  */
                 : /* no inputs   */
                 : "eax"   /* EAX clobbered with status if an abort were to occur */
                 : ABORT); /* List of C Goto labels used */
    i++;
    __asm__ ("xend");
    continue;
ABORT:
    ++abort_counter;
}

Since the compiler now knows that the inline assembly may use the label ABORT as a jump target, it can't just optimize it away. This method also means the ABORT label no longer has to be placed inside an assembly block: it can be an ordinary C label, which is desirable.
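The asm goto mechanism can also be tried on hardware without TSX. The following is a minimal sketch, not part of the original answer, assuming GCC (or Clang 9 or later) on x86 with the default AT&T syntax. The function name is_zero and the label zero are hypothetical, and the non-x86 branch is only a plain C fallback so the sketch still compiles elsewhere.

```c
/* Sketch: asm goto jumping to an ordinary C label.
   is_zero and the label "zero" are illustrative names only. */
int is_zero(int x)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ goto ("testl %0, %0\n\t"   /* sets ZF when x == 0 */
                  "jz %l[zero]"        /* jump to the C label below */
                  : /* no outputs */
                  : "r"(x)
                  : "cc"               /* condition flags are modified */
                  : zero);             /* C labels the asm may jump to */
    return 0;                          /* fall-through path: x != 0 */
zero:
    return 1;                          /* reached only via the jz */
#else
    return x == 0;                     /* fallback for non-x86 targets */
#endif
}
```

Because zero appears in the GotoLabels list, the compiler keeps the code after the label even though no C statement jumps to it. Calling is_zero(0) takes the jump and is_zero(7) falls through.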

One caveat about the xbegin/xend loop above: although __asm__ ("xend"); is implicitly volatile because it is a basic asm statement, the compiler may still reorder it relative to i++ and place it before the increment, which is not what you want. You can add a dummy input constraint that makes the compiler think the asm block depends on the value of i, with something like:

__asm__ ("xend" :: "rm"(i));

This will ensure that i++; will be placed before this assembly block since the compiler will now think that our asm block relies on the value in i. The GCC documentation has this to say:

Note that even a volatile asm instruction can be moved relative to other code, including across jump instructions. [snip] To make it work you need to add an artificial dependency to the asm referencing a variable in the code you don't want moved
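The artificial-dependency idea can be tried without TSX hardware. The following is a minimal sketch, assuming GCC or Clang. The helper order_point and the function increment_between_markers are hypothetical names, and the empty asm templates only stand in for the xbegin and xend blocks.

```c
/* Sketch: an empty asm statement that claims to read and modify v.
   It emits no instructions, but the compiler must have the current
   value of v in a register here and must assume the value may have
   changed afterwards. order_point is an illustrative name only. */
static inline int order_point(int v)
{
    __asm__ __volatile__ ("" : "+r"(v));
    return v;
}

int increment_between_markers(int i)
{
    i = order_point(i);   /* stands in for the "xbegin" block */
    i++;                  /* consumes the first asm's output ...  */
    i = order_point(i);   /* ... and feeds the second asm's input */
    return i;
}
```

Each asm statement is tied to the value of i through its "+r" operand, so the increment sits on a data dependency chain between the two statements and cannot be moved outside them. With 41 as the argument the function returns 42.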


There is another alternative that should work with GCC, ICC, and Clang: rework the logic. You could increment abort_counter inside the assembly template if the transaction aborts, passing it as a read-write ("+") operand so that it is both an input and an output. You can also use the GNU assembler's local labels to define unique labels:

Local Labels

Local labels are different from local symbols. Local labels help compilers and programmers use names temporarily. They create symbols which are guaranteed to be unique over the entire scope of the input source code and which can be referred to by a simple notation. To define a local label, write a label of the form ‘N:’ (where N represents any non-negative integer). To refer to the most recent previous definition of that label write ‘Nb’, using the same number as when you defined the label. To refer to the next definition of a local label, write ‘Nf’. The ‘b’ stands for “backwards” and the ‘f’ stands for “forwards”.

The code for the loop could have looked like this:

while (i < 100000000) {
    __asm__ __volatile__ ("xbegin 1f"
                          : "+rm"(i)
                          : /* no inputs */
                          : "eax");
                          /* EAX is a clobber since aborted transaction will set status */
                          /* 1f is the local label 1: in the second asm block below */
                          /* The "+rm"(i) constraint is a false dependency to ensure 
                             this asm block will always appear before the i++ statement */
    i++;
    __asm__ __volatile__ ("xend\n\t"
             "jmp 2f\n"   /* jump to end of asm block, didn't abort */
             "1:\n\t"     /* This is the abort label that xbegin points at */
             "incl %0\n"  /* Increment the abort counter */
             "2:"         /* Label for the bottom of the asm block */
             : "+rm"(abort_counter)
             : "rm"(i));   /* The "rm"(i) constraint is a false dependency to ensure 
                              this asm block will always appear after the i++ statement */
}

If your compiler supports it (GCC 4.8.x+), use GCC's transactional intrinsics. They eliminate the inline assembly altogether, which removes one more thing that can go wrong in your code.
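As a rough illustration of the intrinsics route, here is a minimal sketch, not the answer's own code, using _xbegin, _xend and _XBEGIN_STARTED from immintrin.h. The names count_to, rtm_available, try_transactional_increment and MAX_RETRIES are hypothetical. The sketch assumes GCC or Clang, checks CPUID for RTM before issuing any transactional instruction, and adds a non-transactional fallback, since a transaction is never guaranteed to commit. The target("rtm") attribute is used so that the file builds without -mrtm.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#include <immintrin.h>

/* Nonzero if CPUID leaf 7, subleaf 0 reports RTM (EBX bit 11). */
static int rtm_available(void)
{
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid_max(0, 0) < 7)
        return 0;
    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    return (ebx & (1u << 11)) != 0;
}

/* One transactional attempt: returns 1 on commit, 0 on abort. */
__attribute__((target("rtm")))
static int try_transactional_increment(volatile long *i)
{
    if (_xbegin() == _XBEGIN_STARTED) {
        (*i)++;
        _xend();
        return 1;
    }
    return 0;   /* an abort resumes here with a status code */
}
#else
/* Stubs so the sketch still compiles on non-x86 targets. */
static int rtm_available(void) { return 0; }
static int try_transactional_increment(volatile long *i) { (void)i; return 0; }
#endif

enum { MAX_RETRIES = 8 };   /* hypothetical retry budget per increment */

/* Counts to n, recording the number of aborted attempts in *aborts. */
long count_to(long n, long *aborts)
{
    volatile long i = 0;
    const int use_rtm = rtm_available();

    *aborts = 0;
    while (i < n) {
        int committed = 0;
        for (int attempt = 0;
             use_rtm && !committed && attempt < MAX_RETRIES; attempt++) {
            committed = try_transactional_increment(&i);
            if (!committed)
                ++*aborts;
        }
        if (!committed)
            i++;            /* fallback outside any transaction */
    }
    return i;
}
```

A caller would declare a long for the abort count and call count_to(100000000, &aborts), mirroring the loop in the question. On a machine without RTM the function takes the fallback path every time and reports zero aborts.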

Michael Petch