
Take this code:

#include <stdio.h>

int main(void)
{
    int var$ = 3;
    printf("%d\n", var$);
}

This compiles properly (GCC, Clang, MSVC) and prints 3 upon execution as expected.

However, this code:

#include <stdio.h>

int main(void)
{
    int var@ = 8;
    printf("%d\n", var@);
}

does not compile (GCC, Clang, MSVC) and fails with the error: stray '@' in program.

According to the C/C++ Operator List (searched with Ctrl+F for @ and $), neither character is an operator.

Why is var$ valid when var@ isn't?

DEADBEEF
  • It may be an unsubstantiated rumour, but I believe that VMS (which used to be used as an o/s on DEC VAX machines) had system calls etc that embedded `$` in the names, so compilers that ran on VMS needed to allow `$` as a valid symbol in identifiers. This may have been carried over more generally. It allows you to write non-portable and obscure code. _…time passes…or it may be a [substantiated](http://stackoverflow.com/a/43955862/) [rumour](http://stackoverflow.com/a/43955894/)!…_ – Jonathan Leffler May 13 '17 at 17:13
  • @JonathanLeffler: Surely [there's proof out there somewhere](http://h41361.www4.hpe.com/docs/base_doc/DOCUMENTATION/V40F_PDF/AQTLTBTE.PDF#page=27). – l'L'l May 13 '17 at 18:32

3 Answers


Taking a look at the C11 Specification, Section 6.4.2 on Identifiers:

Semantics

2 An identifier is a sequence of nondigit characters (including the underscore _, the lowercase and uppercase Latin letters, and other characters) and digits, which designates one or more entities as described in 6.2.1. Lowercase and uppercase letters are distinct. There is no specific limit on the maximum length of an identifier.

3 Each universal character name in an identifier shall designate a character whose encoding in ISO/IEC 10646 falls into one of the ranges specified in D.1. The initial character shall not be a universal character name designating a character whose encoding falls into one of the ranges specified in D.2. An implementation may allow multibyte characters that are not part of the basic source character set to appear in identifiers; which characters and their correspondence to universal character names is implementation-defined.

(emphasis mine)

And per the GCC Manual on Implementation-defined behavior:

  • Identifier characters.

  The C and C++ standards allow identifiers to be composed of ‘_’ and the alphanumeric characters. C++ also allows universal character names. C99 and later C standards permit both universal character names and implementation-defined characters.

  GCC allows the ‘$’ character in identifiers as an extension for most targets. This is true regardless of the std= switch, since this extension cannot conflict with standards-conforming programs. When preprocessing assembler, however, dollars are not identifier characters by default.

(emphasis mine)

And again in the Tokenization section of the GCC preprocessor manual, which is also quoted in ldav1s's answer:

As an extension, GCC treats ‘$’ as a letter. This is for compatibility with some systems, such as VMS, where ‘$’ is commonly used in system-defined function and object names. ‘$’ is not a letter in strictly conforming mode, or if you specify the -$ option.

Because VMS used many system-defined functions and objects that were named with $, GCC allowed $ to be included as an implementation-specific character for compatibility on some systems.

Special characters such as $ and @ aren't explicitly allowed in identifiers by the C specification, but an implementation may allow certain characters (such as $ here). GCC, for example, allows $ in identifiers for most targets. The same goes for Clang (most of its implementation-defined behavior matches GCC's) and MSVC.
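To make the rule concrete, here is a minimal sketch of the spelling check described above. The names is_identifier, is_identifier_start, is_identifier_rest and the allow_dollar flag are made up for illustration and are not taken from any compiler. The sketch assumes the default "C" locale and deliberately ignores universal character names and keywords.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: may c start an identifier?
   Assumes the default "C" locale, where isalpha() matches only A-Z and a-z. */
static bool is_identifier_start(char c, bool allow_dollar)
{
    return isalpha((unsigned char)c) || c == '_' || (allow_dollar && c == '$');
}

/* Hypothetical helper: may c appear after the first character? */
static bool is_identifier_rest(char c, bool allow_dollar)
{
    return is_identifier_start(c, allow_dollar) || isdigit((unsigned char)c);
}

/* Returns true if name is spelled like an identifier.
   allow_dollar = false models only letters, digits and '_';
   allow_dollar = true additionally models the '$' extension.
   Universal character names and keywords are not considered. */
bool is_identifier(const char *name, bool allow_dollar)
{
    assert(name != NULL);

    /* An empty string or a bad first character is rejected here. */
    if (!is_identifier_start(name[0], allow_dollar))
        return false;

    for (size_t i = 1; name[i] != '\0'; ++i) {
        if (!is_identifier_rest(name[i], allow_dollar))
            return false;
    }
    return true;
}
```

Under these assumptions, is_identifier("var$", true) is true, while is_identifier("var$", false) and is_identifier("var@", true) are both false: no setting of the flag makes @ acceptable.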

Andrew Li
  • Re: "Special characters such as `$` and `@` aren't allowed by specification to be in an identifier". This is a bit confusing in relation to the `VMS` part you mentioned before, as I know that `DEC C` allows `$` in identifiers, but not `@`. – l'L'l May 13 '17 at 18:27
  • @I'L'l Yeah, I agree. My intention was to say that in the specification, only alphanumeric and _ characters are allowed in identifiers, but an implementation may allow additional characters. I'll edit to make it clearer – Andrew Li May 13 '17 at 18:29
  • Forgive me for not understanding, but *why* does GCC accept `$` but not `@`? – DEADBEEF May 13 '17 at 19:28
  • @0xBADC0DE GCC accepts `$` but not `@` because its implementation defines `$`, and only `$`, as an additional identifier character. GCC added `$` for compatibility with systems such as VMS; it made no such allowance for `@`. – Andrew Li May 13 '17 at 19:29

Blame VMS:

As an extension, GCC treats ‘$’ as a letter. This is for compatibility with some systems, such as VMS, where ‘$’ is commonly used in system-defined function and object names. ‘$’ is not a letter in strictly conforming mode, or if you specify the -$ option. See Invocation.

I modified your var$ snippet slightly (to eliminate warnings we don't care about) and added strictly conforming flags here:

$ gcc -Wall -std=gnu99  -ansi -pedantic -Werror  -O2 -o a.out source_file.c

Error(s):
source_file.c: In function ‘main’:
source_file.c:4:13: error: '$' in identifier or number [-Werror]
         int var$ = 3;
             ^
cc1: all warnings being treated as errors

So if gcc is given the right flags, var$ won't compile either.
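Seen from the other side, here is a minimal sketch that only compiles when the compiler treats $ as a letter, as GCC and Clang do by default on most targets. The name add$one is made up for illustration and merely imitates the VMS naming style; it is not a real system routine.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical function: the '$' in its name and in its parameter relies on
   the compiler extension, so this is NOT strictly conforming C. */
int add$one(int value$)
{
    /* Guard against signed overflow, which would be undefined behavior. */
    assert(value$ < INT_MAX);
    return value$ + 1;
}
```

In the default mode, add$one(2) returns 3. Compiled with strict flags such as -pedantic -Werror, the same file should be rejected with the '$' in identifier diagnostic.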

ldav1s

Well, strictly speaking, var$ isn't valid, either.

Some compilers let you use $ as a character in identifiers, along with a-z, A-Z, 0-9, and _. This is a nonstandard extension. It's useful because compilers for some other languages (I think including FORTRAN) permit $ in identifiers, and you might be trying to write C code that interoperates with those other languages. Evidently gcc is one of the compilers that permits this extension.

An extension permitting @ would be much more unusual. I might have seen a weird compiler once that let you use it, but since there's no use for it that I know of, and less demand for it, I guess nobody's offering it.

Steve Summit