4

In C++ class today, we discussed the maximum possible length of identifiers, and how the compiler will eventually stop treating variables as different, after a certain length. (My professor seems to have implied that really long identifiers are truncated.) I posted another question earlier, hoping to see if the limit is defined somewhere. My question here is a little different. Suppose I wanted to test either a practical or enforced limit on identifier name lengths. How would I go about doing so? Here's what I'm thinking of doing, but somehow it seems to be too simple.

  • Step 1: Generate at least two variables with really long names and print them to the console. If the identifier names are really that unlimited, I am not going to waste time typing them. My code should do it for me.
  • Step 2: Attempt to perform some operations with the variables, such as compare them, or any arithmetic. If the compiler stops differentiating, then in theory, certain arithmetic will break, such as x/(reallyLongA-reallyLongB), since reallyLongA and reallyLongB will be so long that the compiler will just treat them as the same thing. At that point, the division operation will become a division-by-zero, which should crash and burn horribly.

Am I approaching this correctly? Will I run out of memory before I "break" the compiler or "runtime"?

Moshe
  • I think the answer to your other question indicated that there was fundamentally no limit. I think you're going to crash the compiler or run out of RAM before you get your desired behaviour. – user229044 Sep 13 '11 at 04:59

4 Answers

5

I don't think you need to even generate any operations on the variables.

The following code generates a redefinition error at compile time:

int name;
int name;

If the compiler truncated identifiers, I'd expect the same error from two names that differ only after the last significant character:

int namewithlastsignificantcharacterhere_abc;
int namewithlastsignificantcharacterhere_123;

I'd use a scripting language to generate successively longer names until one of them breaks. Here's a Ruby one-liner:

C:>ruby -e "(1..2048).each{|i| puts \"int #{'variable'*i}#{i};\"}" > var.txt

When I #include var.txt in a C file and compile with VS2008, I get the error

"1>c:\code\quiz\var.txt(512) : fatal error C1064: compiler limit : token overflowed internal buffer"

and 512*8 chars is the 4096 that JRL cited.
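For anyone without Ruby at hand, here is a rough C++ sketch of the same generator. The function names `make_declarations` and `identifier_length` are made up for illustration and are not part of the original answer. The second helper shows that the name on line 512 is actually 4099 characters long: 4096 letters plus the three digits of "512".

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds the same text as the Ruby one-liner: line i declares an int whose
// name is "variable" repeated i times, followed by the decimal value of i.
// (Hypothetical helper, named here only for illustration.)
std::string make_declarations(std::size_t count) {
    std::string out;
    std::string name;  // grows by one "variable" per iteration
    for (std::size_t i = 1; i <= count; ++i) {
        name += "variable";
        out += "int " + name + std::to_string(i) + ";\n";
    }
    return out;
}

// Length of the identifier on line i (1-based):
// 8 * i letters plus the number of decimal digits in i.
std::size_t identifier_length(std::size_t i) {
    assert(i >= 1);
    return 8 * i + std::to_string(i).size();
}
```

Writing the result of `make_declarations(2048)` to var.txt with a `std::ofstream` should reproduce the file used above.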

AShelly
  • In Xcode 4, running a program with your generated output causes some really slow movement in the IDE, but the code runs. – Moshe Sep 13 '11 at 16:07
4

Your professor is wrong. § 2.11/1 of the C++ standard says: "All characters are significant". Certainly compilers may impose a limit on the allowed length, as noted in your other question. That doesn't mean they can ignore characters after that.

He's probably confusing C and C++. The two languages have similar but not identical rules. Historically, C had limits as low as six significant characters for external identifiers.

As for your test, there's a far simpler way to test your hypothesis. Note that

int a;
int a;

is illegal, because you define the same identifier twice. Now if ReallyLongNameA and ReallyLongNameB differed only in non-significant characters, then

int ReallyLongNameA;
int ReallyLongNameB;

would also be a compile-time error, because both would declare the same variable. You don't need to run the code. You can just generate test.cpp with those two lines, and try to compile it. So, write a small test program that creates increasingly long identifier names, writes them to test.cpp, and calls system("path/to/compiler -compileroptions test.cpp"); to see if it compiles.
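A minimal sketch of such a test program might look like the following. The names `make_test_source` and `compiles_with_prefix` are invented for this example, and the compiler command is a placeholder you would have to supply yourself (something like `g++ -fsyntax-only` is one possibility). It also assumes that the command returns 0 on success, which is the usual convention but is not guaranteed by the standard for `std::system`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <fstream>
#include <string>

// Source text declaring two ints whose names share a prefix of `prefix_len`
// characters and differ only in the final character ('A' vs 'B').
// (Hypothetical helper for illustration.)
std::string make_test_source(std::size_t prefix_len) {
    assert(prefix_len >= 1);
    const std::string prefix(prefix_len, 'x');
    return "int " + prefix + "A;\nint " + prefix + "B;\n";
}

// Writes the test source to `path` and runs `compile_command` on it.
// Returns true when the command reports success (result 0).
// `compile_command` is a placeholder: adjust it to your own compiler.
bool compiles_with_prefix(std::size_t prefix_len,
                          const std::string& compile_command,
                          const std::string& path = "test.cpp") {
    {
        std::ofstream out(path);
        if (!out) return false;
        out << make_test_source(prefix_len);
    }  // the file is flushed and closed here, before the compiler runs
    const std::string cmd = compile_command + " " + path;
    return std::system(cmd.c_str()) == 0;
}
```

A failed compile does not by itself say why it failed. A redefinition error would point to truncated identifiers, while a "compiler limit" style error would point to a maximum token length, so the compiler's diagnostic still needs to be read.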

MSalters
3

For Windows C++:

Only the first 2048 characters of Microsoft C++ identifiers are significant. Names for user-defined types are "decorated" by the compiler to preserve type information. The resultant name, including the type information, cannot be longer than 2048 characters.

Thus it seems you could do a pretty simple test using an MS compiler, at least.

Edit: I didn't do extensive testing, but on my Visual Studio Pro 2008 at least, a variable named aaaa... (total length 4095 characters) compiles, and beyond that (>= 4096 characters) you get Fatal Error C1064: compiler limit : token overflowed internal buffer.
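If you want to locate a cutoff like this without trying every length, a binary search is one option. The sketch below is only an illustration: `largest_accepted_length` is a made-up name, and `accepts` stands in for whatever check you use (for example, generating a file and invoking the compiler). It assumes the compiler accepts every length up to some threshold and rejects everything beyond it.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Returns the largest n in [lo, hi] for which accepts(n) is true.
// Assumes accepts is true for every length up to some threshold and false
// beyond it, and that accepts(lo) is true.
// (Hypothetical helper for illustration.)
std::size_t largest_accepted_length(
        std::size_t lo, std::size_t hi,
        const std::function<bool(std::size_t)>& accepts) {
    assert(lo <= hi);
    while (lo < hi) {
        // Round up so that mid > lo and the loop always makes progress.
        const std::size_t mid = lo + (hi - lo + 1) / 2;
        if (accepts(mid)) {
            lo = mid;      // mid works; the answer is mid or larger
        } else {
            hi = mid - 1;  // mid fails; the answer is smaller
        }
    }
    return lo;
}
```

With a predicate that mimics the behaviour reported above (lengths up to 4095 accepted), the search returns 4095 after roughly 17 probes for an upper bound of 100000.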

JRL
  • GREAT find! I happen to be on Mac OS at the moment, but I'm excited that someone, somewhere, has some hard limit defined. It's a good reference point. – Moshe Sep 13 '11 at 05:09
1

I would assume that if it still works after the length reaches some ridiculous size (say, more than 1 MB), the compiler can probably handle arbitrarily long identifiers.

Of course, there's no sure way to tell, as it is entirely possible for the identifier length limit to exceed the amount of memory you have. (A limit of 2^32 - 1 is entirely possible.)

Mysticial