
I'm catching an assembler error when using inline assembly and a local label. The compiler is GCC, and the machine is PowerPC running AIX. The code reads the timestamp (it is roughly equivalent to rdtsc):

static unsigned long long cpucycles( void )
{
  unsigned long long int result=0;
  unsigned long int upper, lower,tmp;
  __asm__ __volatile__ (
                "0:             \n\t"
                "mftbu   %0     \n\t"
                "mftb    %1     \n\t"
                "mftbu   %2     \n\t"
                "cmpw    %2,%0  \n\t"
                "bne-    0b     \n\t"
                : "=r"(upper),"=r"(lower),"=r"(tmp)
                : :
                );
  result = upper;
  result = result<<32;
  result = result|lower;
  return(result);
}

When the code is assembled it results in:

gcc -O3 -Wall -Wextra -mcpu=power8 -maltivec test.c -o test.exe
Assembler:
test.s: line 103: 1252-142 Syntax error.

Compiling with --save-temps and examining test.s:

$ cat -n test.s
...

   101  L..5:
   102   # 58 "test.c" 1
   103          0:
   104          mftbu   10
   105          mftb    9
   106          mftbu   8
   107          cmpw    8,10
   108          bne     0b
   109

It looks like the assembler is having trouble with the local label. Based on IBM's Use of inline assembly and local labels I believe the label and branch are being used correctly:

Only some local labels are legal in inline assembly. You might see labels, such as 0 and 1 in Code C. They are the branching target of instruction bne- 0b\n\t and bne 1f\n\t. (The f suffix for the label means the label behind the branch instruction, and b is for the one ahead)

IBM's error message for 1252-142 is not very helpful:

Cause

If an error occurred in the assembly processing and the error is not defined in the message catalog, this generic error message is used. This message covers both pseudo-ops and instructions. Therefore, a usage statement would be useless.

Action

Determine intent and source line construction, then consult the specific instruction article to correct the source line.

What is the problem and how do I fix it?


Based on @Eric's suggestions in the comments:

__asm__ __volatile__ (
  "\n0:           \n\t"
  "mftbu   %0     \n\t"
  "mftb    %1     \n\t"
  "mftbu   %2     \n\t"
  "cmpw    %2,%0  \n\t"
  "bne-    0b     \n\t"
  : "=r"(upper),"=r"(lower),"=r"(tmp)
);

Results in the problem moving one line down:

gcc -O3 -Wall -Wextra -mcpu=power8 -maltivec test.c -o test.exe
Assembler:
test.s: line 104: 1252-142 Syntax error.

But it looks like the label is in column 0:

 103
 104  0:
 105          mftbu   10
 106          mftb    9
 107          mftbu   8
 108          cmpw    8,10
 109          bne-    0b
jww
  • Does AIX's assembler support numeric label names? You're using gcc, but it's probably not configured to use `gas`. – Peter Cordes Aug 16 '18 at 05:19
  • “0:” looks indented compared to “L..5:”. Is it indented? Why? If in the `__asm__` you change `0: \n\t` to `\n0: \n\t`, does the error go away? – Eric Postpischil Aug 16 '18 at 05:35
  • @Eric - I believe it is indented with a tab (not whitespace; according to emacs). I blocked all the C text/asm left margin and it produces the same result. Let me try your suggestion. (I've tried doing a lot of unusual things, but not a leading `\n` because the IBM docs don't show one like that). – jww Aug 16 '18 at 05:49
  • I agree that IBM article does mention using `0` as a label name. Is it possible it was written for Linux on PowerPC, or a newer version of AIX? Does editing the asm (the `.s` file) by hand to replace the label with a named label fix it? (Or edit the C source if you want, but experimenting with the `.s` would cut out the step of making sure you got the asm you were expecting using inline asm.) – Peter Cordes Aug 16 '18 at 05:54
  • @Eric - It looks like the same problem. I also tried removing leading whitespace, removing trailing whitespace like `0:\n\t"`, omitting the colon, and several other random things. – jww Aug 16 '18 at 05:56
  • Eric and Peter - It looks like a non-local label is one of the solutions. That is `\nagain:` and `bne- again \n\t`. I'm less sure about the choices: (1) switch to non-local labels and be done with it; or (2) continue with local labels because the code is written using them and it is supposed to work. – jww Aug 16 '18 at 05:59
  • How much do you care about the platform you're having a problem with? Can you expect users on that platform to configure their system to use a better assembler? Obviously you can't use `again:` because this asm could inline multiple times into the same file. But the easy solution with no downside is `again%=:` / `bne- again%=`, to have the compiler auto-number the label for each instance of the inline asm. Or maybe use `L..again%=` if that makes it a local label that won't go into debug symbols. – Peter Cordes Aug 16 '18 at 06:05
  • Eric and Peter - I think I found a better solution to this problem. We lose symbolic jump targets, but we gain something that just works on both AIX and Linux. – jww Nov 27 '18 at 05:19

2 Answers


gcc doesn't emit machine-code directly; it feeds its asm output to the system assembler. You could configure gcc to use a different assembler, like GAS, but apparently the default setup on the machine you're using has GCC using AIX's assembler.

Apparently AIX's assembler doesn't support numeric labels, unlike the GNU assembler. The article you linked probably assumes Linux (accidentally or on purpose) when it mentions using labels like 0.


The easiest workaround is probably having GCC auto-number the label instead of using local labels, so the same asm block can be inlined / unrolled multiple times in the same compilation unit without symbol-name conflicts. %= expands to a unique number in every instance.

IDK if an L.. makes it a file-local label (which won't clutter up debug info or the symbol table). On Linux/ELF/x86, .L is the normal prefix, but you have a compiler-generated L.. label.

__asm__ __volatile__ (
            "L..again%=:    \n\t"
            "mftbu   %0     \n\t"
            "mftb    %1     \n\t"
            "mftbu   %2     \n\t"
            "cmpw    %2,%0  \n\t"
            "bne-    L..again%="
            : "=r"(upper),"=r"(lower),"=r"(tmp)
            : /* no inputs */
            : "cr0"         /* cmpw writes condition register field 0 */
            );

Or for this specific asm use-case, there might be a built-in function that reads the time-stamp registers which would compile to asm like this.
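A minimal sketch of that idea, assuming GCC targeting PowerPC, where `__builtin_ppc_get_timebase` is documented as a built-in. The wrapper name `read_timebase` and the `clock()` fallback for other targets are assumptions added for illustration only, and whether IBM's XLC accepts the built-in is not verified here.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical wrapper: read the 64-bit time base without inline asm,
   so no labels ever reach the system assembler. */
static uint64_t read_timebase(void)
{
#if defined(__GNUC__) && \
    (defined(__powerpc__) || defined(__PPC__) || defined(_ARCH_PPC))
    /* GCC PowerPC built-in; the compiler emits the mftbu/mftb sequence. */
    return __builtin_ppc_get_timebase();
#else
    /* Assumption: placeholder for non-PowerPC builds. clock() measures
       processor time, not time-base ticks, so the units differ. */
    return (uint64_t)clock();
#endif
}
```

Because the compiler generates the retry loop itself, the same source should build with either the AIX assembler or GAS.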

Peter Cordes
  • *"... there might be a built-in function that reads the time-stamp registers which would compile to asm like this"* - Yeah, I was looking for that. I found GCC's `__builtin_ppc_get_timebase` and `__builtin_ppc_mftb`. I don't think they work with IBM's XLC. – jww Aug 16 '18 at 06:45

In addition to @Peter's answer, I found another solution while working on How to have GCC combine “move r10, r3; store r10” into a “store r3”?. The code in that question hit the same problem on AIX.

Here's the code from the other question that caused 1252-142 on AIX:

uint32_t val;
__asm__ __volatile__ (
    "1:                            \n"  // retry label
    #if __BIG_ENDIAN__
    ".byte 0x7c, 0x60, 0x05, 0xe6  \n"  // r3 = darn 3, 0
    #else
    ".byte 0xe6, 0x05, 0x60, 0x7c  \n"  // r3 = darn 3, 0
    #endif
    "cmpwi 3,-1                    \n"  // r3 == -1?
    "beq 1b                        \n"  // again on failure
    "mr %0,3                       \n"  // val = r3
    : "=r" (val) : : "r3", "cc"
);

The solution is, don't use labels. Just use displacements:

uint32_t val;
__asm__ __volatile__ (
    // "1:                         \n"  // retry label
    #if __BIG_ENDIAN__
    ".byte 0x7c, 0x60, 0x05, 0xe6  \n"  // r3 = darn 3, 0
    #else
    ".byte 0xe6, 0x05, 0x60, 0x7c  \n"  // r3 = darn 3, 0
    #endif
    "cmpwi 3,-1                    \n"  // r3 == -1?
    // "beq 1b                     \n"  // again on failure
    "beq .-8                       \n"  // again on failure
    "mr %0,3                       \n"  // val = r3
    : "=r" (val) : : "r3", "cc"
);

In the code above, I needed to jump back 2 instructions to re-execute darn 3, 0. Each instruction is 4 bytes, so counting back from the beq itself, cmpwi is at -4 and darn is at -8. However, the jump target needs to be relocatable, so the expression .-8 was used. The dot means "here".

And it works on both AIX and Linux.
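The same displacement trick can be sketched for the time-base loop from the question. The branch has to go back from `bne-` to the first `mftbu`, which is 4 instructions or 16 bytes, so the target becomes `.-16`. The function name `cpucycles_nolabel`, the `"cr0"` clobber, and the `clock()` fallback for non-PowerPC builds are assumptions for illustration; this has not been verified against the AIX assembler here.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical label-free variant of cpucycles() from the question. */
static uint64_t cpucycles_nolabel(void)
{
#if defined(__GNUC__) && \
    (defined(__powerpc__) || defined(__PPC__) || defined(_ARCH_PPC))
    unsigned long upper, lower, tmp;
    __asm__ __volatile__ (
        "mftbu   %0     \n\t"   /* upper half of the time base        */
        "mftb    %1     \n\t"   /* lower half                         */
        "mftbu   %2     \n\t"   /* upper half again                   */
        "cmpw    %2,%0  \n\t"   /* did the upper half change?         */
        "bne-    .-16   \n\t"   /* yes: retry, 4 instructions back    */
        : "=r"(upper), "=r"(lower), "=r"(tmp)
        : /* no inputs */
        : "cr0"                 /* cmpw writes cr0 */
    );
    /* Keep only the low 32 bits of the mftb result before combining. */
    return ((uint64_t)upper << 32) | (uint32_t)lower;
#else
    /* Assumption: placeholder so the sketch builds on other targets. */
    return (uint64_t)clock();
#endif
}
```

The cost of this approach is that the displacement is hand-counted: inserting or removing an instruction between the branch and its target silently changes where the branch lands, so the count has to be updated along with the code.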

jww