What are identifiers in C exactly?

Question

Every Google search explains them as just "names for your variables", but I have a feeling there is a distinction between the identifier and the identifier's name. Is an identifier more like an object with attributes like name, scope, linkage, and an underlying object? I ask this because I ran into some trouble trying to read through the C standard. For instance, the snippet

int main(){
  int x;
  extern int x;
}

fails to compile whereas

int main(){
  int x;
  if(1){extern int x;}
}

compiles successfully. In this question, the failure of the first snippet is explained by citing 6.2.2.6 in the C standard, which states that local variables have no linkage. However, in the second snippet, the local variable still has no linkage and yet there is no conflict. Now, 6.2.2.4 states

For an identifier declared with the storage-class specifier extern in a scope in which a prior declaration of that identifier is visible, if the prior declaration specifies internal or external linkage, the linkage of the identifier at the later declaration is the same as the linkage specified at the prior declaration. If no prior declaration is visible, or if the prior declaration specifies no linkage, then the identifier has external linkage.

My explanation would have been that this rule is in effect in both snippets, but in the first one, the uniqueness of the underlying object of x triggers a constraint violation via 6.2.1.2 because the same identifier name is being used for two distinct objects with the same scope and name space. But this is not the explanation given in the answer to the question I linked earlier. In the second snippet, the linkage types are still conflicting, so does changing the scope of the extern declaration change the visibility of the local declaration? What is the best way to think about linkage from the abstract point of view of the C standard (without using actual implementations like gcc or clang as illustration)?

6.2.2.6 just says that local identifiers without the `extern` storage-class specifier have no linkage. — Barmar, Nov 15 '20 at 06:17
Linkage refers to names in a scope. The second example compiles, because the name `x` in the scope of the `if` block is different from `x` in the enclosing `main` body. For an example, see [this snippet](https://stackoverflow.com/a/45724518/5538420). — dxiv, Nov 15 '20 at 06:49
@dxiv I am trying to figure out why the declaration in the if block refers to a different entity according to the C standard. Why would this entity be different from one declared as `static int x;` outside of `main`? — GuPe, Nov 15 '20 at 07:37
@GuachoPerez For the same reason that `if(1) {int x;}` masks the outer `x`. And if you added a `static int x;` at the file level, that would once again be an error because of conflicting linkage. — dxiv, Nov 15 '20 at 07:51
@dxiv but adding the `static int x;` outside `main` in the second snippet compiles fine. How does the compiler know that `extern int x` is different from `int x` but not from `static int x`? — GuPe, Nov 15 '20 at 07:57
@GuachoPerez It doesn't compile [here](https://godbolt.org/z/PdWMdq). — dxiv, Nov 15 '20 at 08:03
@dxiv alright, so what is happening is that the if block extern is seeing the prior declaration with no linkage and therefore sets its entity to external linkage. But the static declaration sets it to internal linkage and therefore there is undefined behavior? — GuPe, Nov 15 '20 at 08:09
@GuachoPerez On a second thought, not so sure about that, see [my other comment](https://stackoverflow.com/questions/64841679/what-are-identifiers-in-c-exactly?noredirect=1#comment114642900_64841916). — dxiv, Nov 15 '20 at 08:43

M.M · Accepted Answer · 2020-11-15T07:11:24.880

"identifier" is an element of the language grammar. After preprocessing, all tokens are one of the following: keyword, identifier, constant, string-literal or punctuator.

If a token starts with a letter (or underscore) it can only be a keyword or an identifier. If it's not in the table of keywords then it is an identifier. For more technical detail on this, see Annex A of the C Standard.

In your program x and main are identifiers, int, if and extern are keywords, 1 is a constant, and everything else is a punctuator.
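The rule above can be sketched as a small classifier. This is only an illustration, not how any particular compiler works: the names `classify_word`, `token_kind`, and `keywords` are made up for this example, the keyword table is deliberately partial, and the input is assumed to be a single, already separated token.

```c
#include <ctype.h>
#include <string.h>

/* Partial keyword table, for illustration only; C has many more keywords. */
static const char *const keywords[] = { "int", "if", "extern", "static", "return" };

enum token_kind { TOK_KEYWORD, TOK_IDENTIFIER, TOK_OTHER };

/* Classify one token that has already been split out of the source text. */
enum token_kind classify_word(const char *tok)
{
    /* Only tokens starting with a letter or underscore can be words. */
    if (!(isalpha((unsigned char)tok[0]) || tok[0] == '_'))
        return TOK_OTHER;

    /* A word found in the keyword table is a keyword... */
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(tok, keywords[i]) == 0)
            return TOK_KEYWORD;

    /* ...and any other word is an identifier. */
    return TOK_IDENTIFIER;
}
```

With this sketch, `classify_word("x")` and `classify_word("main")` yield `TOK_IDENTIFIER`, `classify_word("extern")` yields `TOK_KEYWORD`, and `classify_word("1")` yields `TOK_OTHER`.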

Identifiers are used as names of entities. The same identifier can be used in different scopes to designate different entities (or the same entity). Linkage is the process by which declarations of an identifier, in different scopes or in the same scope more than once, can be made to refer to the same object or function (6.2.2/1).
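A minimal sketch of one identifier designating the same entity in one scope and a different entity in another. The names `counter`, `bump`, and `shadow` are invented for this illustration and do not come from the question.

```c
int counter = 0;            /* file scope, external linkage */

void bump(void)
{
    extern int counter;     /* block scope, but linked to the file-scope counter */
    counter++;              /* modifies the file-scope object */
}

int shadow(void)
{
    int counter = 100;      /* no linkage: a separate object that hides the outer one */
    counter++;              /* modifies only the local object */
    return counter;
}
```

Calling `bump()` twice leaves the file-scope `counter` at 2, while `shadow()` returns 101 and does not touch it.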

Sometimes the standard uses the word "identifier" to mean the entity identified by an identifier; this is covered in 6.2.1/5:

Unless explicitly stated otherwise, where this International Standard uses the term “identifier” to refer to some entity (as opposed to the syntactic construct), it refers to the entity in the relevant name space whose declaration is visible at the point the identifier occurs.

The first code is erroneous because of 6.7/3:

If an identifier has no linkage, there shall be no more than one declaration of the identifier (in a declarator or type specifier) with the same scope and in the same name space, except that: [...]

The int x; has no linkage, so there shall not be another declaration of x in the same scope. (The list of exceptions does not contain anything relevant to this case.)

In the second code, 6.7/3 is not violated because the second declaration is not in the same scope as the first one. The text you quoted explains that extern int x; names a different entity than int x; did, which is fine.
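A sketch of the second snippet extended so that the difference between the two entities is observable. It assumes a file-scope definition `int x = 4;` and a helper named `read_both`, neither of which is in the original question.

```c
int read_both(void)
{
    int x = 1;              /* no linkage: a local object */
    int sum = x;            /* reads the local x, so sum is 1 */
    {
        extern int x;       /* external linkage: hides the local x in this block */
        sum += x * 10;      /* reads the file-scope x defined below */
    }
    return sum;
}

int x = 4;                  /* external definition that the inner declaration refers to */
```

Here `read_both()` returns 41: the 1 comes from the local `x`, and the 40 comes from the externally linked `x`.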

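The second program declares an identifier with external linkage without providing a definition. As written this is harmless, because x is never used; if x were used in an expression, the behaviour would be undefined (no diagnostic required) unless some translation unit defined it (6.9/5). In that case you may or may not see an error message.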

edited Nov 15 '20 at 07:11

answered Nov 15 '20 at 06:57

M.M


Thank you, this is really helpful. I only have some questions regarding the 2nd to last paragraph. User @dxiv also mentioned that in the second code, both declarations refer to different entities. Is this due to the second declaration hiding the first as per 6.2.1.4 and therefore, the "no prior definition is visible" part of 6.2.2.4 holds, forcing the second identifier to be external? I imagine this is not the full picture since if this were true, then 6.2.2.4 would never make an external declaration have internal linkage. – GuPe Nov 15 '20 at 07:33
@GuachoPerez In the second code there is a prior declaration of the same identifier visible, which is why the text you quoted in your question applies. The prior declaration of `x` has no linkage, and the new declaration of `x` has external linkage and names a different entity to the first `x`. – M.M Nov 15 '20 at 07:43
So when I add `static int x;` outside of `main` in the second code, how does 6.2.2.4 apply? The local declaration would still be visible so the `extern` in the `if` block would cause `x` to have external linkage while the `static` would cause it to have internal linkage. Would this result in undefined behavior or would `x` adopt internal linkage and equal the entity declared in file scope? – GuPe Nov 15 '20 at 08:03
@GuachoPerez there's another clause saying the behaviour is undefined if an identifier appears with both external and internal linkage in the same translation unit – M.M Nov 15 '20 at 08:15
@M.M I'll admit that I find the wording of 6.2.2/4 rather confusing. "*For an identifier declared with the storage-class specifier extern ... if the prior declaration specifies* ***internal or*** *external linkage, the linkage of the identifier at the later declaration is* ***the same as the linkage specified at the prior declaration***". This can be read as saying that `static int x;` followed by `extern int x;` is allowed, and the latter "inherits" the internal linkage from the former. Though that doesn't appear to be the case. – dxiv Nov 15 '20 at 08:37
@dxiv I think that if `extern` were in the first nest of `main` it would catch the internal linkage from the `static` outside main, making `x` internally linked. With two nests, the intervening no linkage local variable forces the extern to choose external linkage and cause a problem with the internal linkage declared in `static`. – GuPe Nov 15 '20 at 08:50
Just to recap for my future self. The identifier in the first declaration has no linkage by 6.2.2/6 and thus refers to a unique entity. The first snippet fails because of 6.7/3. In the second snippet, the second declaration specifies an identifier with external linkage because of 6.2.2/4 and thus the same identifier designates two different entities with strictly nested scopes, which causes no conflict. Is it possible that 6.7/3 is redundant since having the same identifier refer to two distinct entities with the same scope violates 6.2.1/2? – GuPe Nov 15 '20 at 20:30
Your recap is correct. The first code doesn't violate 6.2.1.2 (if we suppose that 6.7/3 didn't exist) because 6.2.2.4 would then say that the second declaration of x declares the same entity as the first one, as you initially thought before you found out about 6.7/3. So the name is not used for two different entities. – M.M Nov 15 '20 at 20:54
@M.M Off-topic here, but interestingly enough C++ which had virtually the same wording about this and considered it ill-formed, now seems to be moving to recast it as being allowed with the interpretation that the inner `extern` references the outer `static` and so "inherits" its internal linkage (details [here](https://stackoverflow.com/a/64869151/5538420)). – dxiv Nov 17 '20 at 04:31
