
Since C# supports 8-, 16-, 32- and 64-bit integer types (SByte, Int16, Int32 and Int64), why did the designers of the language choose to define int as an alias for Int32 instead of allowing it to vary depending on what the native architecture considers to be a word?

I have not had any specific need for int to behave differently than the way it does; I am only asking out of pure encyclopedic interest.

I would think that a 64-bit RISC architecture could conceivably exist which would most efficiently support only 64-bit quantities, and in which manipulations of 32-bit quantities would require extra operations. Such an architecture would be at a disadvantage in a world in which programs insist on using 32-bit integers, which is another way of saying that C#, becoming the language of the future and all, essentially prevents hardware designers from ever coming up with such an architecture.

StackOverflow does not encourage speculative answers, so please answer only if your information comes from a dependable source. I have noticed that some members of SO are Microsoft insiders, so I was hoping that they might be able to enlighten us on this subject.

Note 1: I did in fact read all the answers and comments of SO: Is it safe to assume an int will always be 32 bits in C#?, but did not find any hint as to the "why" that I am asking about in this question.

Note 2: the viability of this question on SO is (inconclusively) discussed here: Meta: Can I ask a “why did they do it this way” type of question?

Mike Nakis
  • If you want a "machine-word-sized type", you're looking for `System.IntPtr`. – R. Martinho Fernandes Dec 24 '11 at 18:07
  • Yes, but we tend to do arithmetic with `int`s, not with `IntPtr`s. – Mike Nakis Dec 24 '11 at 18:07
  • Yeah, I hope those downvoting have already seen the discussion on meta that I linked to at the end of the question. – Mike Nakis Dec 24 '11 at 18:09
  • @Mike, your first comment was right on the money: *we tend to do arithmetic with `int`s*. The CLR designers thought it was desirable for arithmetic operations to behave the same on all supported platforms, which would arguably not be the case if the size (and, therefore, range) of `int` were platform-dependent (see the sketch after these comments). – Frédéric Hamidi Dec 24 '11 at 18:11
  • The question is rather: why did they choose to leave it system-dependent whether the integer is encoded as little or big endian? Java has a long history of just using big endian. Choosing little endian would have been a bad choice *in my opinion*, but it would at least have been a choice. – Maarten Bodewes Aug 04 '21 at 21:49
  • The libraries that come with a programming language may or may not do something with endianness, but neither the Dot Net runtime nor the Java runtime does anything with it as far as I know, besides being able to report the endianness of the CPU and being able to convert between integers of different endianness, should you, as a programmer, choose to perform such conversions. The built-in binary serialization mechanism of Java might be the only place where Java actually does something with endianness, and this is transparent to us. – Mike Nakis Aug 05 '21 at 10:54
  • The core libraries make up a large part of what makes a language, and Java's core classes always use big endian encoding by default, while the MS libraries use the platform default. It only works out because they are almost never used on big endian machines. Most larger C# applications would come crashing down otherwise, because of interop issues: sometimes little endian is used explicitly and sometimes implicitly. – Maarten Bodewes Aug 05 '21 at 11:15
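
To make the point about identical arithmetic concrete, here is a minimal sketch (the class and variable names are purely illustrative): with `int` fixed at 32 bits, the overflow below wraps to the same value on every machine, whereas a platform-sized `int` would make the very same expression produce different results on 32-bit and 64-bit hardware.

using System;

class OverflowDemo
{
    static void Main()
    {
        int a = 2000000000;     // fits comfortably in 32 bits
        long b = 2000000000;

        // With int fixed at 32 bits this wraps to the same negative value
        // on every platform the CLR runs on.
        Console.WriteLine(unchecked(a + a));   // -294967296

        // The same arithmetic done in 64 bits does not overflow, which is
        // the kind of divergence a platform-sized int would introduce.
        Console.WriteLine(b + b);              // 4000000000
    }
}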

3 Answers


I believe that their main reason was the portability of programs targeting the CLR. If they were to allow a type as basic as int to be platform-dependent, writing portable programs for the CLR would become a lot more difficult. The proliferation of typedef-ed integral types in platform-neutral C/C++ code to cover the use of the built-in int is an indirect hint as to why the designers of the CLR decided on making the built-in types platform-independent. Discrepancies like that are a big inhibitor to the "write once, run anywhere" goal of execution systems based on VMs.

Edit: More often than not, the size of an int plays into your code implicitly through bit operations rather than through arithmetic (after all, what could possibly go wrong with `i++`, right?). But the errors are usually more subtle. Consider the example below:

const int MaxItem = 20;
var item = new MyItem[MaxItem];
// Enumerate every non-empty subset of the 20 items by treating each bit
// of `mask` as an "include item i" flag. Note that 1 << MaxItem needs at
// least 21 bits, so this silently breaks if int is only 16 bits wide.
for (int mask = 1; mask != (1 << MaxItem); mask++) {
    var combination = new HashSet<MyItem>();
    for (int i = 0; i != MaxItem; i++) {
        if ((mask & (1 << i)) != 0) {
            combination.Add(item[i]);
        }
    }
    ProcessCombination(combination);
}

This code computes and processes all combinations of 20 items. As you can tell, the code fails miserably on a system with a 16-bit int, because 1 << 20 cannot be represented in 16 bits, but works fine with 32-bit or 64-bit ints.

Unsafe code would provide another source of headaches: when int is fixed at some size (say, 32 bits), code that allocates four times as many bytes as the number of ints it needs to marshal would work, even though it is technically incorrect to use 4 in place of sizeof(int). Moreover, this technically incorrect code would remain portable!
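
To illustrate the kind of code being described, here is a minimal hypothetical sketch (the helper and its names are not from any real project): the buffer size is computed with a hard-coded 4, which happens to be correct, and stays correct everywhere, only because the CLR guarantees a 32-bit `int`.

using System;
using System.Runtime.InteropServices;

static class MarshalSketch
{
    // Hypothetical helper: copies an int[] into unmanaged memory for native code.
    public static IntPtr CopyToUnmanaged(int[] values)
    {
        // Technically this should be sizeof(int) * values.Length, but the
        // hard-coded 4 works (and stays portable) only because int is
        // guaranteed to be 32 bits wide.
        IntPtr buffer = Marshal.AllocHGlobal(4 * values.Length);
        Marshal.Copy(values, 0, buffer, values.Length);
        return buffer;   // caller must eventually call Marshal.FreeHGlobal(buffer)
    }
}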

Ultimately, small things like that play heavily into the perception of a platform as "good" or "bad". Users of .NET programs do not care whether a program crashes because its programmer made a non-portable mistake or because the CLR is buggy. This is similar to the way early versions of Windows were widely perceived as unstable due to the poor quality of drivers. To most users, a crash is just another .NET program crash, not a programmer's issue. Therefore it is good for the perception of the ".NET ecosystem" to make the standard as forgiving as possible.

Sergey Kalinichenko
  • Well, I am sure that in the end it somehow boils down to portability, but I fail to see exactly how. You see, we very rarely perform arithmetic operations the outcome of which really depends on the specific bit-length of our operands, and for those rare cases when we actually do perform such operations, the language already provides us with perfectly fine SByte, Int16, Int32, Int64, and unsigned versions thereof. – Mike Nakis Dec 24 '11 at 18:17
  • Furthermore, the compiler/JIT could very easily provide us with differently compiled versions of our program for 4-byte machine word size and 8-byte machine word size, so that we can test it under both scenarios on a single development machine. – Mike Nakis Dec 24 '11 at 18:24
  • @MikeNakis That seems a good assumption, except that it breaks down if you code for 64 bit and then try to run on 32 bit: assumptions about the minimum and maximum integer can cause your code to break. It may also break custom code written to parse / format integers as strings. Remember, the idea of these environments is guaranteed portability more than anything else. – Philip Couling Dec 24 '11 at 18:44
  • @couling if you are coding for 64 bit, use Int64. Also, please read my comment beginning with "Furthermore" above. – Mike Nakis Dec 24 '11 at 18:49
  • @MikeNakis "the compiler/JIT could very easily provide us with differently compiled versions of our program for 4-byte machine word size and 8-byte machine word size, so that we can test it under both scenarios on a single development machine" This would force an additional testing effort onto the software shops using .NET. More often than not, programmers would simply ignore (gasp) this testing as optional. To put it mildly, the level to which software is tested varies to a quite large extent even today; expecting that programmers would always test for portability is rather optimistic. – Sergey Kalinichenko Dec 24 '11 at 18:59
  • I should have been more clear. `int` is in general the default data type many developers use. A developer failing to remember that it *could* be doubled to 64 bit won't hurt (as you say). Failing to remember that it *could* be halved to 32 bit IS dangerous. – Philip Couling Dec 24 '11 at 19:01
  • @couling yes, I see your point. And it also comes down to dasblinkenlight's words. Well, dasblinkenlight, this is an excellent answer. Let's sleep on it for a couple of days, and unless the question ends up getting Skeeted or Lipperted, I will accept your answer. – Mike Nakis Dec 24 '11 at 19:19

Many programmers have the tendency to write code for the platform they use. This includes assumptions about the size of a type. There are many C programs around which will fail if the size of an int were changed to 16 or 64 bits, because they were written under the assumption that an int is 32 bits. The choice for C# avoids that problem by simply defining it as that. If you define int as variable depending on the platform, you buy back into that same problem. Although you could argue that it's the programmer's fault for making wrong assumptions, it makes the language a bit more robust (IMO). And for desktop platforms a 32-bit int is probably the most common occurrence. Besides, it makes porting native C code to C# a bit easier.

Edit: I think you write code which makes (implicit) assumptions about the size of a type more often than you think. Basically anything which involves serialization (like .NET remoting, WCF, serializing data to disk, etc.) will get you in trouble if you allow variable sizes for int, unless the programmer takes care of it by using a specifically sized type like Int32. And then you end up with "we'll use Int32 always anyway, just in case" and you have gained nothing.
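
As a minimal sketch of this serialization point (the helper names are made up for illustration), the writer and reader below silently agree that an `int` occupies exactly four bytes on the wire, an assumption that is only safe because the size of `int` never varies:

using System;
using System.IO;

static class SerializationSketch
{
    public static byte[] Save(int[] values)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(values.Length);          // 4 bytes
            foreach (int v in values)
                writer.Write(v);                  // 4 bytes each, always
            writer.Flush();
            return stream.ToArray();
        }
    }

    public static int[] Load(byte[] data)
    {
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            int count = reader.ReadInt32();       // the reader assumes the same 4-byte layout
            var result = new int[count];
            for (int i = 0; i < count; i++)
                result[i] = reader.ReadInt32();
            return result;
        }
    }
}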

ChrisWue
  • Please see my first comment to @dasblinkenlight's answer. – Mike Nakis Dec 24 '11 at 18:20
  • Even serialization to/from a binary medium could be taken care of by means of attributes, so for example I could declare `i` to be a native int, and I could attach an attribute to it stating that it is meant to be serialized as a little endian 32-bit quantity. But in any case, I see merit in the last part of what you said, that "then you end up with 'we'll use int32 always anyway just in case' and you have gained nothing". So, the bottom line is, I wonder whether Microsoft conducted any kind of research and determined that this would inevitably end up being the case. – Mike Nakis Dec 24 '11 at 18:43
  • @MikeNakis: Yes, you can solve all these problems, but at the cost of added complexity for all those solutions (like serialization), and it's easy to get it wrong. The potential performance gains (which I doubt are that large) do not outweigh all the potential problems you will buy into. – ChrisWue Dec 24 '11 at 18:58

9 years after this question was asked

In November of 2020, C# version 9 was released, which introduced "native-sized integers" nint and nuint that do (almost) exactly what this question is hinting at, thus proving that int and uint were originally tied to 32 bits for no good reason whatsoever.

Furthermore

In November of 2022, C# version 11 was released, in which nint and nuint were made aliases of System.IntPtr and System.UIntPtr respectively, further supporting this.
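
For completeness, here is a minimal sketch of what these native-sized integers look like in practice (assuming C# 9 or later; the printed size depends on whether the process is 32-bit or 64-bit):

using System;

class NativeIntDemo
{
    static void Main()
    {
        nint signedNative = 42;    // platform-sized signed integer
        nuint unsignedNative = 42; // platform-sized unsigned integer

        // IntPtr.Size reports the native word size in bytes:
        // 4 in a 32-bit process, 8 in a 64-bit process.
        Console.WriteLine($"native word size: {IntPtr.Size} bytes");

        // Unlike the old IntPtr, nint supports ordinary arithmetic directly.
        nint sum = signedNative + 1;
        Console.WriteLine(sum);              // 43
        Console.WriteLine(unsignedNative);   // 42
    }
}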

Mike Nakis