As a general proposition, using size_t and ptrdiff_t is vastly preferred over using, say, plain unsigned int and int. size_t and ptrdiff_t are pretty much the only way of writing a robust and widely portable program. However: there is no such thing as a free lunch. Properly using size_t takes some work, too -- it's just that, if you know what you're doing, it takes less work than trying to achieve the same result without using size_t.
Also, size_t has the problem that you can't print it using %d or %u. Ideally you want to use %zu, but, tragically, not all implementations have supported it.
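If you have to cope with pre-C99 implementations that never learned %zu, the usual workaround is to cast to the widest unsigned type you can portably print. A minimal sketch of both approaches:

#include <stdio.h>

int main(void)
{
    size_t n = sizeof(double);

    printf("%zu\n", n);                 /* C99 and later: z matches size_t */
    printf("%lu\n", (unsigned long)n);  /* pre-C99 fallback */
    return 0;
}

The cast-to-unsigned-long trick is safe as long as the value actually fits in an unsigned long -- which it will for anything you're realistically printing, though on 64-bit Windows, where unsigned long is only 32 bits, a truly huge size_t would be truncated.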
If you have a large and badly written program that doesn't use size_t, it's probably full of bugs. Some of those bugs will have been masked or worked around. If you try to change it to use size_t, a certain number of the program's workarounds will fail, perhaps uncovering once-hidden bugs. Eventually you'll work those out and achieve the more-robust and more-reliable and more-portable program you desire, but the process will be a rocky one. I suspect that's what the author means by "it is most likely that due to this replacement, new errors will appear".
Changing a program over to use size_t is sort of like trying to add const in all the right places. You make the changes you think you need to make, and recompile, and you get a bunch of errors and warnings, and you fix those and recompile, and you get a bunch more errors and warnings, etc. It's at least a nuisance, and sometimes a ton of work. But it's generally the only way to go if you want to make the code more robust and portable.
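To make the "uncovered bugs" point concrete, here's the classic trap (the function and names here are made up for illustration). A countdown loop that was fine with a signed index breaks the moment the index becomes size_t, because an unsigned i satisfies i >= 0 forever:

#include <stddef.h>

/* Hypothetical example: zero an array from the top down. */
void clear_backwards(double *a, size_t n)
{
    /* Broken once i is unsigned: the test i >= 0 is always true,
       so i wraps around past zero and the loop never terminates.

       for (size_t i = n - 1; i >= 0; i--)
           a[i] = 0;
    */

    /* One conventional rewrite: test i > 0 and index a[i-1]. */
    for (size_t i = n; i > 0; i--)
        a[i-1] = 0;
}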
A big part of the problem is keeping the compiler happy. It's going to warn about a bunch of stuff, and you'll generally want to fix everything it complains about, even though some of what it complains about is ticky-tack and unlikely to cause a problem. But it's perilous to say, "Yeah, I can ignore this particular warning", so in the end, as I said, you'll generally want to fix everything.
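The most common of those warnings is the signed/unsigned comparison. Something like the following (a made-up function, purely for illustration) is enough to draw a signedness-comparison diagnostic from gcc or clang under -Wextra, even though it's harmless whenever count fits in an int:

#include <stddef.h>

long sum(const int *a, size_t count)
{
    long total = 0;
    for (int i = 0; i < count; i++)   /* warns: signed i compared
                                         against unsigned count */
        total += a[i];
    return total;
}

The quiet version declares i as a size_t, too -- which is exactly the sort of cascading change described above.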
The author's most eye-catching claim is that "memory size needed for the program will greatly increase as well."
I suspect this is an exaggeration -- in most cases I doubt that memory will "greatly" increase -- but it's likely to increase at least a little bit. The issue is that on a 64-bit system, size_t and ptrdiff_t are likely to be 64-bit types. If for whatever reason you have large arrays of these, or large arrays of structures containing these, and if you had been using some 32-bit type (perhaps plain int or unsigned int) before, yes, you're going to see a memory increase.
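As a rough illustration of the arithmetic (assuming a typical 64-bit platform, where unsigned int is 4 bytes and size_t is 8):

#include <stdio.h>

int main(void)
{
    /* A million stored sizes: the array doubles in size when
       each element grows from 4 bytes to 8. */
    printf("as unsigned int: %zu bytes\n", 1000000 * sizeof(unsigned int));
    printf("as size_t:       %zu bytes\n", 1000000 * sizeof(size_t));
    return 0;
}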
And then you're going to want to ask, Do I really need to be able to describe 64-bit sizes? 64-bit programming gives you two things: (a) the ability to address more than 4 GB of memory, and (b) the ability to have a single object greater than 4 GB. If you want to have a total data usage greater than 4 GB, but you don't ever need to have a single object bigger than 4 GB, and if you never want to read more than 4 GB of data at a time from a file (using a single read or fread call, that is), you don't really need 64-bit size variables everywhere.
So to avoid bloat, you might make an informed choice to use, say, unsigned int (or even unsigned short) instead of size_t in some places. As a trivial example, if you had
size_t x = sizeof(int);
printf("%zu\n", x);
you could change this to
unsigned int x = sizeof(int);
printf("%u\n", x);
without any loss in portability, because I can quite confidently guarantee your code is never going to find itself running on a machine with 34359738368-bit ints (or at least, not in our lifetimes :-) ).
But this last example, trivial as it is, also illustrates the other issues that tend to intrude. The similar code
unsigned int x = sizeof(y);
printf("%u\n", x);
is not so obviously safe, because whatever y is, there's a chance it could be so big that its size doesn't fit in an unsigned int. So if you or your compiler really care about type correctness, there may be warnings about possible data loss when assigning size_t to unsigned int. And to shut off those warnings, you may need explicit casts, as in
unsigned int x = (unsigned int)sizeof(int);
And this cast is, arguably, perfectly appropriate. The compiler is operating under the assumption that any object might be really big, and that any attempt to jam a size_t into an unsigned int might lose data. The cast says you've thought about this case: you're saying, "Yes, I know that, but in this case, I know it won't overflow, so please don't warn me about this one any more, but please do warn me about any others, that might not be so safe."
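If you'd rather have that "I know it won't overflow" claim checked at run time than merely recorded in a cast, one option is to funnel the conversions through a small checked helper (narrow_size is a made-up name, purely for illustration):

#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Narrow a size_t to unsigned int, trapping in a debug build
   instead of silently truncating if the value doesn't fit. */
unsigned int narrow_size(size_t s)
{
    assert(s <= UINT_MAX);
    return (unsigned int)s;
}

Then the example above becomes unsigned int x = narrow_size(sizeof(y)); and the no-overflow assumption is enforced rather than merely assumed.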
P.S. I'm being downvoted, so in case I've given the wrong impression, let me make clear that (as I said in my opening paragraph) size_t and ptrdiff_t are vastly preferred. In general there's every reason to use them, no good reason not to use them. (Come to that, Karpov wasn't saying not to use them, either -- merely highlighting some of the issues that might come up along the way.)