What does the compiler (in C) do when converting a string into an integer?

Question

Let's start with this example:

#include <stdio.h>

int main()
{
    int x = "string";
    printf("%d", x);
}

Output -> "12221232"

I know this code example is syntactically correct but semantically wrong. I just don't know exactly what happens here. The text gets converted into an integer, but how? Can someone please explain?

The string literal is actually an array (`char[7]`). In most contexts, use of an array is converted to a pointer to its 1st element (so "address of the `s`"). Assigning an address to an `int` object is UB... but usually does the "right thing" on most platforms — pmg, Jan 12 '21 at 11:44

score 3 · Answer 1 · answered Jan 12 '21 at 11:42

The value you observe is an address where "string" is stored. The literal "string" is actually cast to a pointer of type const char * that points to the actual data. This pointer (address of "string" data) is next cast to int which you observe as output of printf().

tstanisl

A cast is an explicit operator in source code, a type name in parentheses. It specifies that a conversion should be performed. When the C implementation automatically converts an array to a pointer, that is just a conversion, not a cast. Also, in C, a string literal is an array of `char`, not `const char`, so the resulting pointer is `char *`, not `const char *`, even though the behavior of modifying the elements is not defined. – Eric Postpischil Jan 12 '21 at 11:51
It is also not clear that the value printed is a representation of the address of the string, and it should be noted that this may not be relied upon. If the C implementation uses eight bytes for `char *` and four bytes for `int`, then the value printed by this code is more likely four bytes out of the full address, not the actual address, although other results are also possible because the conversion from `char *` to `int` is, per C 2018 6.3.2.3 6, undefined if the value is not representable in `int`. – Eric Postpischil Jan 12 '21 at 11:52
@EricPostpischil No conversion happens in this case since the program is ill-formed. It's a constraint violation of 6.5.16.1 simple assignment. The code doesn't satisfy the requirement "the left operand has atomic, qualified, or unqualified arithmetic type, and the right has arithmetic type;" so the code isn't valid C. – Lundin Jan 12 '21 at 15:04
@Lundin: C implementations may successfully translate the code. Since OP obtained output, they did compile and execute the program. – Eric Postpischil Jan 12 '21 at 15:51

score 2 · Accepted Answer · answered Jan 12 '21 at 12:04

In int x = "string";, two things of note happen:

The array represented by "string" is automatically converted to a pointer to its first element. So this is some address in memory, of type char *.
The address is converted to an int.

This conversion is specified by C 2018 6.3.2.3 6, which says:

Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined…

Footnote 69 tells us the implementation-defined conversion is supposed to be “consistent” with the addressing structure of the environment the program runs in. Most modern systems use a “flat” address space, so the result of converting an address to an int will be the address as you may know it from seeing addresses in the debugger. But that is provided the address fits in an int.

If your system uses eight-byte addresses and four-byte int, then many addresses will not fit in an int. In this case, the behavior is not defined. A common behavior is for a C implementation to use just four bytes of the address to make the int. Alternatively, it might set the int to all zeroes or all ones, or it might leave “garbage” in the int, or it might trap.

mhawke · Answer 3 · 2021-01-12T11:51:04.303

0

It's not converting the text to an integer; it is printing the memory location (address) of the string literal. Try this:

#include <stdio.h>

int main()
{
    int x = "string";
    printf("x = %d\n", x);
    printf("string = %d\n", "string");
}

If you run this, you'll see that x holds the same value as the address of the literal "string".

Sample output

x = 4202512
string = 4202512

Although as @pmg commented, this is not necessarily the case.

In summary, what you are seeing is that the address of the string "string" is being assigned to x, which is where x is getting this value.

edited Jan 12 '21 at 11:51

answered Jan 12 '21 at 11:43

Not necessarily. Different string literals, even if equal, can live in different memory spaces. Also, to print addresses, use `printf("%p", "string");` – pmg Jan 12 '21 at 11:45
@pmg: but for the purpose of illustration in this case they do. – mhawke Jan 12 '21 at 11:47

score 0 · Answer 4 · answered Jan 12 '21 at 15:01

int x = "string"; is a constraint violation of simple assignment, see "Pointer from integer/integer from pointer without a cast" issues.

So the code is not valid C, has never been valid C and will not compile cleanly, see "What must a C compiler do when it finds an error?". Speculating about why an invalid, non-standard C program prints something isn't very meaningful.
