Is the C++ compiler really smart enough to distinguish between multiply and dereference?

Question

I have the following line of code:

double *resultOfMultiplication = new double(*num1 * *num2);

How does the compiler know which * is used for dereferencing and which * is used for multiplication?

Also, and probably more importantly: in this case, is double a primitive (like in Java) or an object? If it's a primitive, how can I create a new one?

Looks like a rewrite of your code is required to get rid of all the pointers. A simple `double result = num1 * num2;` might be good enough. — Sjoerd, Feb 12 '12 at 01:06
`How does the compiler know which * is used for dereferencing and which * is used for multiplication?` How else could it be? — Lightness Races in Orbit, Feb 12 '12 at 01:09
Please pick up a good book and learn C++ properly. I strongly recommend you unlearn everything you seem to think is C++. Real C++ should hardly have any raw pointers or `new`. — Kerrek SB, Feb 12 '12 at 01:12
Tip: forget everything you learned about objects in Java. C++ works differently. Even the term 'object' is defined differently in C++ than in Java. — Sjoerd, Feb 12 '12 at 01:14

score 4 · Answer 1 · answered Feb 12 '12 at 01:19

The compiler doesn't need to be "smart"; the language's syntax is defined by a tree of grammatical productions that inherently give a priority or "precedence" to the application of certain operators over the application of other operators. This is particularly handy when an expression might otherwise be ambiguous (because, say, two of the operators used are represented by the same lexical token).

But this is just lexing and parsing. Whether any particular operation is actually semantically valid is not decided until later in compilation; in particular, given two pointers x and y, the expression *x *y will fail to compile because you cannot multiply *x by y, not because there was a missing operator in what might otherwise have been a dereference followed by another dereference.

I shan't go into an in-depth proof that operator precedence exists in C++; for that, just take a basic course in syntax structure and you'll grok it soon enough.

score 4 · Answer 2 · answered Feb 12 '12 at 01:29

Maybe an analogy will help:

Q: How do humans tell the dot above the 'i' apart from the dot at the end of a sentence? Are they really that smart that they don't interpret each and every 'i' as the end of the sentence?

A: Because they are in different locations!

The same goes for the '*' in the compiler: the two uses appear in different positions. The multiplication operator stands in the middle of two expressions; the dereferencing operator stands in front of an expression. It may not be obvious to you, but it is obvious to a compiler.

Any decent text on parsing will tell you how compilers are able to do this. The required technology was developed about 40 years ago, and is considered to be among the most basic things in a compiler. A C++ compiler has to have many smart parts, but this is not one of them.

Note to experts: I am aware of factors, lvalues, and so on. But they will only confuse in this case.

score 3 · Answer 3 · answered Feb 12 '12 at 01:19

It's all about the grammar. There's no postfix *, so * after an identifier has to be treated as infix multiply.

answered by Ben Voigt

score 1 · Answer 4 · answered May 21 '17 at 15:54

Since the question is about “smartness”, I would like to add one point. My answer will refer to the C language, but I assume the situation is identical in C++.

The compiler does not really need to be smart in this case, simply because the language does not allow multiplication to be performed directly on addresses, so the symbol * preceding a pointer can always mean only “dereference”, while the symbol * preceding a non-pointer can only mean “multiplication”.

I will try to explain this with an example.

Let's create a small program and call it test.c. Within the main() function, let's create the two pointers first and second, and let the address stored in first be 140732806008300 and the address stored in second be 140732806008296.

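When we write first + second behind a single cast, the compiler accepts the + operator without complaint. Note that C does not allow two pointers to be added; the expression compiles only because the cast binds more tightly than +, so it converts first to an integer and the compiler then sees an integer plus a pointer, which is ordinary pointer arithmetic: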

#include <stdio.h>

int main () {

    int *first, *second, fifteen = 15, twenty = 20;

    first = &fifteen;   /* Let the address stored in `first` be 140732806008300 */
    second = &twenty;   /* Let the address stored in `second` be 140732806008296 */

    printf("first + second is: %llu\n", (long long unsigned int) first + second);

    return 0;

}

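What we did here was not a sum of two addresses. The cast applies only to first, so the expression adds an integer to the pointer second, and the result is a new pointer, which is then passed to printf() where %llu expects an unsigned integer (strictly speaking undefined behavior, although it happens to print a number on this platform). Pointer arithmetic scales the integer by sizeof(int), here 4, so the value printed is second + 4 * first rather than first + second: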

first + second is: 703664030041496

But if instead we try to multiply the two addresses using the multiplication operator *…

#include <stdio.h>

int main () {

    int *first, *second, fifteen = 15, twenty = 20;

    first = &fifteen;   /* Let the address stored in `first` be 140732806008300 */
    second = &twenty;   /* Let the address stored in `second` be 140732806008296 */

    printf("first * second is: %llu\n", (long long unsigned int) first * second);

    return 0;

}

…we get the following error:

test.c: In function ‘main’:
test.c:11:69: error: invalid operands to binary * (have ‘long long unsigned int’ and ‘int *’)
  printf("first * second is: %llu\n", (long long unsigned int) first * second);

This is because a pointer cannot be an operand of the binary * operator. Therefore we have to cast both pointers to integers before we can use the symbol * with the meaning of “multiplication operator”:

#include <stdio.h>

int main () {

    int *first, *second, fifteen = 15, twenty = 20;

    first = &fifteen;   /* Let the address stored in `first` be 140732806008300 */
    second = &twenty;   /* Let the address stored in `second` be 140732806008296 */

    printf("first * second is: %llu\n", (long long unsigned int) ((long unsigned int) first) * ((long unsigned int) second));

    return 0;

}

Now the symbol * finally stands between integers (and not between pointers). In such a context it can only mean “multiplication” (and nothing else), so the program compiles and prints a result. Note that the true product of the two addresses does not fit in 64 bits, so the printed value has wrapped around:

first * second is: 4480243502683625952

I tried to think about examples where the meaning of the symbol * cannot be disambiguated (by humans as well), but without success.

This means that the symbol * can unambiguously mean only one thing in a given context – i.e., there are cases where we have to use parentheses to change its meaning, but its meaning is always unambiguous in its context.

score 0 · Answer 5 · answered Oct 09 '20 at 13:28

It depends on the context in which the symbol is used. For a simple resolution, the compiler looks at the tokens to the left and right of a symbol to decide what it is.

Further reading: the Wikipedia article "Lexer hack".

answered Oct 09 '20 at 13:28

Zig Razor

3,381
2
15
35

score 0 · Accepted Answer · edited Feb 12 '12 at 01:20

Dereference comes first, so *num1 * *num2 gets parsed as (*num1) * (*num2), which is unambiguous. *resultOfMultiplication is not parsed as dereference because it is a variable definition. In such a context, the compiler expects a data type followed by an identifier so the asterisk is unambiguous.

Primitive data types are still objects in C++. If you use new on a primitive type, all that happens is that enough memory in the free store to hold the object is allocated and its address returned to you. This is unlike 'normal' variables (i.e. double t;), which are of automatic or static storage duration.

edited by Lightness Races in Orbit, Feb 12 '12 at 01:20
answered by Alexander Gessler, Feb 12 '12 at 01:05

Or automatic storage duration. – Lightness Races in Orbit Feb 12 '12 at 01:10
And _instances of_ primitive data types absolutely _are_ objects in C++. – Lightness Races in Orbit Feb 12 '12 at 01:13
Primitive types are objects in C++. They're not class types, of course, but they *are* objects. – jalf Feb 12 '12 at 01:19
This is just silly. The dereference operator can be applied multiple times, so operator precedence can't explain it. – Ben Voigt Feb 12 '12 at 01:21
You're right, of course. Unfortunately I can't delete my own answer now that it is accepted, but maybe the OP wants to re-consider his decision. – Alexander Gessler Feb 12 '12 at 01:22
