I am working on a project that requires a double-width compare-and-swap operation (cmpxchg16b). I found the following code by luke h, but when I compile it with "g++-4.7 -g -DDEBUG=1 -std=c++0x dwcas2.c -o dwcas2.o" I get the following error:

Error:

g++-4.7 -g -DDEBUG=1 -m64 -std=c++0x dwcas2.c -o dwcas2.o
dwcas2.c: Assembler messages:
dwcas2.c:29: Error: junk `ptr ' after expression

Any ideas why? I feel like it is something small and easy to fix, but I just cannot see it.

Computer Specs: 64-core ThinkMate RAX QS5-4410 server running Ubuntu 12.04 LTS. It is a NUMA system with four AMD Opteron 6272 CPUs (16 cores per chip @2.1 GHz) and 314 GB of shared memory.

Code:

#include <stdint.h>

namespace types
{
    struct uint128_t
    {
        uint64_t lo;
        uint64_t hi;
    }
    __attribute__ (( __aligned__( 16 ) ));
}

template< class T > inline bool cas( volatile T * src, T cmp, T with );

template<> inline bool cas( volatile types::uint128_t * src, types::uint128_t cmp, types::uint128_t with )
{
    bool result;
    __asm__ __volatile__
    (
        "lock cmpxchg16b oword ptr %1\n\t"
        "setz %0"
        : "=q" ( result )
        , "+m" ( *src )
        , "+d" ( cmp.hi )
        , "+a" ( cmp.lo )
        : "c" ( with.hi )
        , "b" ( with.lo )
        : "cc"
    );
    return result;
}

int main()
{
    using namespace types;
    uint128_t test = { 0xdecafbad, 0xfeedbeef };
    uint128_t cmp = test;
    uint128_t with = { 0x55555555, 0xaaaaaaaa };
    return ! cas( & test, cmp, with );
}
Steven Feldman
  • It should just be `lock cmpxchg16b %1`. The size isn't needed in this case since it is implied by the instruction `cmpxchg16b`. It appears that wherever you got this code, it was written by someone who thought GCC's inline assembler was equivalent to MASM. – Michael Petch Jun 09 '16 at 19:04
  • Can I make a pitch for NOT using inline asm? How about using [__sync_bool_compare_and_swap](https://gcc.gnu.org/onlinedocs/gcc-4.4.3/gcc/Atomic-Builtins.html)? Or maybe even std::atomic? – David Wohlferd Jun 09 '16 at 22:03
  • @DavidWohlferd Oh come now, intrinsics are too easy, but then again you probably want your wiki entry on GCC to count for something ;-) – Michael Petch Jun 09 '16 at 22:15
  • I stumbled on this other question/answer that seems to be the origin for this code. http://stackoverflow.com/questions/4825400/cmpxchg16b-correct – Michael Petch Jun 09 '16 at 22:19
  • @MichaelPetch - Hey, I *sweated* over that wiki entry. Of course you and I are probably the only people who have ever read it, but still... Re 4825400: It doesn't have a memory clobber in the accepted answer either. While I see you added a comment, you didn't mention that. Is it worth it quibbling on this 5+ year old question? – David Wohlferd Jun 10 '16 at 02:20
  • @DavidWohlferd I can name one other: PeterCordes - I know he has read it. – Michael Petch Jun 10 '16 at 02:21
  • I only stumbled on these questions after it became active due to an edit today. – Michael Petch Jun 10 '16 at 02:23

1 Answer

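On x86, GCC defaults to AT&T assembly syntax, but your source uses Intel syntax: the assembler chokes on the `oword ptr` size override. Drop it, since `cmpxchg16b` already implies the operand size. You probably also need "memory" in the clobber list.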
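Since an example was requested, here is a minimal sketch of the question's `cas` with both changes applied, assuming an x86-64 target and a GCC-compatible compiler. The only differences from the original are the removed `oword ptr`, the added "memory" clobber, and an alignment `assert` that is my own addition (`cmpxchg16b` faults on an operand that is not 16-byte aligned).

```cpp
#include <cassert>
#include <stdint.h>

namespace types
{
    struct uint128_t
    {
        uint64_t lo;
        uint64_t hi;
    }
    __attribute__ (( __aligned__( 16 ) ));
}

template< class T > inline bool cas( volatile T * src, T cmp, T with );

// Returns true if *src was equal to cmp and has been replaced by with.
template<> inline bool cas( volatile types::uint128_t * src, types::uint128_t cmp, types::uint128_t with )
{
    // Added check (not in the original): the memory operand must be 16-byte aligned.
    assert( ( reinterpret_cast< uintptr_t >( src ) & 15 ) == 0 );

    bool result;
    __asm__ __volatile__
    (
        "lock cmpxchg16b %1\n\t"   // AT&T syntax: no "oword ptr" size override
        "setz %0"                  // ZF is set when the exchange happened
        : "=q" ( result )
        , "+m" ( *src )
        , "+d" ( cmp.hi )
        , "+a" ( cmp.lo )
        : "c" ( with.hi )
        , "b" ( with.lo )
        : "cc", "memory"           // "memory" keeps the compiler from reordering accesses across the CAS
    );
    return result;
}
```

With this version, the `main` from the question should assemble and return 0, because the first compare-and-swap succeeds.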

Timothy Baldwin
  • I have very limited experience with assembly, can you provide an example? – Steven Feldman Sep 21 '13 at 18:25
  • "memory" in the clobber list wouldn't be necessary in this case since the memory that is changed is explicitly listed as an output operand `"+m" ( *src )` – Michael Petch Jun 09 '16 at 18:48
  • @MichaelPetch I'm not sure I agree. While it may not be necessary for the correct function of the cmpxchg16b, usually this type of statement is used for syncing threads. In that case, you probably DO want to flush everything to memory before executing the asm. – David Wohlferd Jun 09 '16 at 22:05
  • @DavidWohlferd Actually that is a very good point lol. I wasn't thinking about the requirement around the instruction itself in the situation you'd likely be using it. I do actually concur – Michael Petch Jun 09 '16 at 22:07
  • I'll keep the comment despite the fact it may not be accurate. This line of comments may be useful to future readers who gloss over it quickly ;) although maybe an answer with more beef in it might be appropriate. – Michael Petch Jun 09 '16 at 22:09
  • Now that gcc 6 has been released, it might also be worth mentioning [Flag Output Operands](https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#FlagOutputOperands). Using this you can cut the number of asm instructions here in half! Well, ok, not a huge win, but if you refuse to use intrinsics, it helps. – David Wohlferd Jun 10 '16 at 02:26
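
The flag output operand idea from the last comment could look roughly like the sketch below. It assumes GCC 6 or later (or another compiler that accepts the `=@ccz` constraint) on x86-64. The name `cas_flag` is hypothetical, and the alignment `assert` is an addition for illustration. The zero flag is read directly as an output, so the `setz` goes away, and the "cc" clobber is left out because the flags are now an output.

```cpp
#include <cassert>
#include <stdint.h>

namespace types
{
    struct uint128_t
    {
        uint64_t lo;
        uint64_t hi;
    }
    __attribute__ (( __aligned__( 16 ) ));
}

// Hypothetical variant of the question's cas(): returns true if *src was
// equal to cmp and has been replaced by with.
inline bool cas_flag( volatile types::uint128_t * src, types::uint128_t cmp, types::uint128_t with )
{
    // cmpxchg16b requires a 16-byte aligned memory operand.
    assert( ( reinterpret_cast< uintptr_t >( src ) & 15 ) == 0 );

    bool result;
    __asm__ __volatile__
    (
        "lock cmpxchg16b %1"
        : "=@ccz" ( result )   // result = ZF after the instruction
        , "+m" ( *src )
        , "+d" ( cmp.hi )
        , "+a" ( cmp.lo )
        : "c" ( with.hi )
        , "b" ( with.lo )
        : "memory"
    );
    return result;
}
```

Because the compiler sees the flag itself, it can often branch on it directly when the call is used in an `if` or a retry loop.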