How does the C/C++ compiler distinguish the uses of the * operator (pointer, dereference operator, multiplication operator)?

Question

How can a C or C++ compiler distinguish * when it is used to declare a pointer (MyClass* obj), as the multiplication operator (a * b), or as the dereference operator (*my_var)?

Wikipedia has an article on the classic method: [lexer hack](https://en.wikipedia.org/wiki/Lexer_hack). — user786653, Oct 08 '20 at 07:32
@user786653 At least in the case of GCC, that's no longer a thing. C++ is complex enough that classic lexers are inadequate. G++ uses a hand-written recursive descent parser (something similar to what was put into the basis of Google Translate) instead of a Bison-based one. But in general it depends on the compiler implementation, and only a few expose that secret — Swift - Friday Pie, Oct 08 '20 at 07:40
The same way the compiler distinguishes `a & b` and `&var`, `+a` and `a + b`, or `&&a` and `a && b`: one is a **unary** and the other is a **binary** operator. In C++/CLI there are also `type ^` vs `a ^ b` and `type %` vs `a % b` — phuclv, Oct 08 '20 at 07:51
@Swift-FridayPie gcc and clang are open source so there are hardly any secrets here — M.M, Oct 25 '20 at 12:47
There is no actual ambiguity to resolve. It is always clear from the grammar whether a unary or binary operator is intended, and it is always clear from the current parse context whether a declaration or a dereference is being written. — user207421, May 17 '21 at 07:20

Zig Razor · Accepted Answer · 2021-05-17T07:12:32.953

It depends on the context in which it is used. In the simple case, the parser looks at the tokens to the left and right of the symbol to decide what it means.
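As a rough illustration of that context dependence, the sketch below uses the same `name * name` shape twice. The names `Widget`, `star_declares_pointer`, and `star_multiplies` are hypothetical and chosen only for this example. What differs is whether the left-hand name refers to a type or to a variable.

```cpp
// Hypothetical names, for illustration only.
struct Widget {};

// `Widget` names a type here, so `Widget * w` is a declaration:
// the `*` belongs to the declarator and makes `w` a pointer.
bool star_declares_pointer() {
    Widget obj;
    Widget * w = &obj;
    return w == &obj;
}

// `a` names a variable here, so `a * b` is an expression:
// the `*` is the binary multiplication operator.
int star_multiplies() {
    int a = 6;
    int b = 7;
    return a * b;
}
```

Calling `star_multiplies()` returns 42, and `star_declares_pointer()` returns true. In both functions the parser has to know what the left-hand name refers to before it can classify the `*`.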

The language's syntax is defined by a tree of grammatical productions that inherently give certain operators priority, or "precedence", over others. This is particularly handy when an expression might otherwise be ambiguous (because, say, two of the operators used are represented by the same lexical token).

But this is just lexing and parsing. Whether any particular operation is semantically valid is not decided until later in compilation. In particular, given two pointers x and y, the expression *x *y fails to compile because you cannot multiply *x by y, not because an operator is missing from what might otherwise have been one dereference followed by another.
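The split between parsing and type checking can be sketched as follows. The function names are hypothetical, and the second function uses a `double` operand (an assumption made for this example) so that the `*x *y` token sequence is well-typed and compiles.

```cpp
// Hypothetical helper names, for illustration only.

// Parsed as (*x) * (*y): a `*` with no operand on its left is unary,
// and unary `*` binds more tightly than binary `*`.
int product_of_pointees(const int* x, const int* y) {
    return *x * *y;
}

// Same token shape as the failing case, `*x *y`, and it parses the same
// way, as (*x) * y. It compiles here only because `y` is a double, so
// the multiplication is well-typed.
double scale_pointee(const int* x, double y) {
    return *x *y;
}

// With `const int* y` instead, `return *x *y;` would still parse as
// (*x) * y, and the compiler would reject it during type checking.
```

With `int a = 3, b = 4;`, `product_of_pointees(&a, &b)` gives 12 and `scale_pointee(&a, 0.5)` gives 1.5.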

Further reading: the Wikipedia page on the [lexer hack](https://en.wikipedia.org/wiki/Lexer_hack).

Another interesting read is the Lexer hack entry on Enacademic.

Jean-Marc Volle · Answer 2 · 2020-10-08T07:52:06.563

The dereferencing * operator is a unary operator, so in trivial cases the compiler applies the rule for unary operators, e.g.:

int a;
int *ptr = &a;
*ptr = 5;

The multiplication operator * is a binary operator, so in trivial cases the compiler applies multiplication, provided the operands support it, e.g.:

int a = 2;  // initialized so that reading them below is well-defined
int b = 3;
int c = a*b;

For more complex expressions you may need parentheses when operator precedence alone does not give the grouping you want, e.g.:

  int a = 1;
  int b[2] = {2,3};
  int *aPtr = &a;
  int *bPtr = b;
  
  int c = *aPtr * *(bPtr+1);
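To see what the parentheses change, here is a small sketch with hypothetical function names. The values differ from the snippet above (an assumption for illustration) so that the two groupings produce different results. In both functions the compiler tells unary `*` from binary `*` by position alone; the parentheses only decide whether the `+ 1` applies to the pointer or to the product.

```cpp
// Values differ from the snippet above so that the two groupings give
// different results; the function names are hypothetical.

int with_parentheses() {
    int a = 4;
    int b[2] = {2, 5};
    int* aPtr = &a;
    int* bPtr = b;
    // (bPtr + 1) points at b[1], so this is 4 * 5.
    return *aPtr * *(bPtr + 1);
}

int without_parentheses() {
    int a = 4;
    int b[2] = {2, 5};
    int* aPtr = &a;
    int* bPtr = b;
    // Groups as ((*aPtr) * (*bPtr)) + 1, so this is (4 * 2) + 1.
    return *aPtr * *bPtr + 1;
}
```

`with_parentheses()` returns 20 and `without_parentheses()` returns 9.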

edited Oct 08 '20 at 07:52

answered Oct 08 '20 at 07:42


So don't say that if it's not true ;-). Typically for function pointers you need parentheses because the operator()() has precedence over operator*(), but that is not *-specific. In particular, it is not necessary in order to prevent confusion between dereferencing and multiplication. – Peter - Reinstate Monica Oct 08 '20 at 07:51
Yeah, it does; but again, there is no danger of confusion between the different semantics of the asterisk. I suppose that confusion is impossible because you cannot multiply pointers (I bet you you could in early -- pre-ANSI -- versions of C). – Peter - Reinstate Monica Oct 08 '20 at 07:56
