
I have a huge array of integers, none of which is greater than 0xFFFF. Therefore I would like to save some space and store them as unsigned short.

unsigned short addresses[50000 /* big number over here */];

Later I would use this array as follows

data[addresses[i]];

When I use only 2 bytes to store my integers, they are being promoted to either 4 or 8 bytes (depending on architecture) when used as array indices. Speed is very important to me, therefore should I rather store my integers as unsigned int to avoid wasting time on type promotion? This array may get absolutely massive and I would also like to save some space, but not at the cost of performance. What should I do?

EDIT: If I were to use an address-sized type for my integers, which type should I use? size_t, or maybe something else?

user3600124
  • Define "absolutely massive". – Passerby Nov 21 '21 at 21:08
  • @Passerby It may get up to 5,000,000 entries – user3600124 Nov 21 '21 at 21:12
  • Please reveal a little more about the usage than `data[addresses[i]]`. A useful answer depends heavily on the way you _use_ these indices `i`. For example, if you use small sections of `addresses` but reuse the indices `i` in these blocks heavily, you might first copy them into a smaller array of actual integers. If you only use them once, you should not copy them at all. It might even be slower, because a `short` is not on the word boundary of your architecture. And so on. – Emmef Nov 21 '21 at 21:13
  • If you are using indices between 0 and 0xFFFF, then use `uint16_t`, not `unsigned short`, even though they often are the same thing. – Ted Lyngmo Nov 21 '21 at 21:14
  • This isn't type promotion, and doesn't take any time (just a bit more space). It's because of the register size on the architecture you're using, and the fact that some larger types are more efficient than the smaller types. – Donnie Nov 21 '21 at 21:14
  • You have to differentiate language abstractions ("type is promoted") from what really happens. `What should I do?` Follow the [rules of optimization](https://wiki.c2.com/?RulesOfOptimizationClub). Inspect the generated assembly or profile the code. "Type is promoted" happens in the C++ programming language. It has no meaning in hardware; the hardware uses the same registers anyway. – KamilCuk Nov 21 '21 at 21:14
  • Don't ask us, benchmark! –  Nov 21 '21 at 21:17
  • The problem with simple tests is that I am not sure how much of the code gets optimized away. Can I somehow turn off all the optimizations? I am using Visual Studio. – user3600124 Nov 21 '21 at 21:20
  • You want optimizations enabled for the test to be of any value, but use a benchmarking library. https://quick-bench.com/ is a popular online tool based on Google Benchmark. – Ted Lyngmo Nov 21 '21 at 21:23
  • *Speed is very important to me, therefore should I rather save my integers as unsigned int to avoid wasting time on type promotion?* Good question. Try both approaches (with the compiler's optimization enabled!), and either inspect the assembly or run timing profiles. The answer (using either approach) will depend greatly on your platform. For my platform, there's no difference. (I didn't try with 5,000,000 entries... there may be a difference there due to cache misses.) – Eljay Nov 21 '21 at 21:28
  • This is a benchmark question. It depends on the implementation of the types and even more on the implementation of the underlying infrastructure. Anyway, according to others, this makes no difference. Therefore, you should benchmark. – kobi Nov 21 '21 at 21:39
  • [Here are some compilation results](https://godbolt.org/z/zb3837z6n), do what you want with them. – n. m. could be an AI Nov 21 '21 at 21:40
  • @YvesDaoust Some people do not have access to the plethora of systems that their code may execute on, in which case it is often wise to get access to the experience of (possibly very seasoned experts) on topics like this. (Nevertheless, benchmarking should not be avoided. :-) – Kröw Jan 25 '23 at 01:11
  • @Kröw: my comment remains, fully. Even those experts would have to benchmark on that plethora of systems. General intuition is not sufficient. –  Jan 25 '23 at 07:51
  • @YvesDaoust Your comment... remains? What do you mean? Do you mean to claim that expert intuition is not useful in this scenario? – Kröw Jan 25 '23 at 22:48
  • @YvesDaoust Perhaps I should clarify to avoid confusion in case you misunderstood, benchmarking is good, but is not a reason not to ask. – Kröw Jan 25 '23 at 22:50
  • @Kröw: yes, I claim the experts cannot predict what will be faster for this case. –  Jan 26 '23 at 08:12
  • @YvesDaoust "predict what will be faster" is very vague. Surely you don't mean that there are no experts that can provide an answer, or at the very least, useful insight, regarding this issue, as that is obviously wrong, but it is not clear precisely what you are claiming. – Kröw Jan 26 '23 at 09:36
  • @Kröw: faster means running in less time. –  Jan 26 '23 at 11:25
  • @YvesDaoust Please share your opinion on why experts cannot predict what will be faster for this case. – Kröw Jan 26 '23 at 21:57
  • @Kröw: because that depends on unknown factors such as the distribution of numbers. –  Jan 27 '23 at 08:05

1 Answer


Anything to do with C-style arrays usually gets compiled into machine instructions that use the native memory addressing of the architecture for which you compile, so trying to save space on array indexes will not work.

If anything, you might break whatever optimizations your compiler might want to implement.

5 million integer values, even at 8 bytes each on a 64-bit machine, come to about 40 MB of RAM.
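As a rough check on that figure, here is a minimal sketch that computes the storage needed for a given number of indices at different element widths. The helper name `footprint_bytes` is made up for illustration, and the 5,000,000 count is the upper bound the asker mentioned in the comments.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bytes needed to store `count` elements of type T.
template <typename T>
constexpr std::size_t footprint_bytes(std::size_t count) {
    return count * sizeof(T);
}

// 5,000,000 entries at 2, 4 and 8 bytes per entry.
static_assert(footprint_bytes<std::uint16_t>(5000000) == 10000000, "2-byte indices");
static_assert(footprint_bytes<std::uint32_t>(5000000) == 20000000, "4-byte indices");
static_assert(footprint_bytes<std::uint64_t>(5000000) == 40000000, "8-byte indices");
```

So the choice is between roughly 10 MB, 20 MB and 40 MB for the index array alone.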

While I am sure your code does other things, this is not so much memory that saving it is worth sacrificing performance.

Since you chose to keep all those values in RAM in the first place, presumably for speed, don't ruin it.
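Whether the narrower index type actually costs or gains anything is best measured on the target machine. The following is only a sketch of such a measurement, not a rigorous benchmark: `gather_sum`, `make_addresses` and `time_gather` are hypothetical helpers, the index pattern is arbitrary, and `data` is assumed to hold at least 0x10000 elements. Build it with optimizations enabled and call `time_gather` with both `std::uint16_t` and `std::uint32_t` index arrays.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sum data[addresses[i]] over all i. Index is the element type under test.
template <typename Index>
std::uint64_t gather_sum(const std::vector<std::uint32_t>& data,
                         const std::vector<Index>& addresses) {
    std::uint64_t sum = 0;
    for (Index a : addresses) {
        // Bounds check; compiled out when NDEBUG is defined.
        assert(a < data.size());
        sum += data[a];  // `a` is converted to the vector's size type here
    }
    return sum;
}

// Build an illustrative index array whose values stay within 0..0xFFFF.
template <typename Index>
std::vector<Index> make_addresses(std::size_t count) {
    std::vector<Index> v(count);
    for (std::size_t i = 0; i < count; ++i) {
        v[i] = static_cast<Index>((i * 40503u) & 0xFFFFu);
    }
    return v;
}

// Wall-clock seconds for one gather_sum call. The sum is written to `out`
// so that the result stays observable to the caller.
template <typename Index>
double time_gather(const std::vector<std::uint32_t>& data,
                   const std::vector<Index>& addresses,
                   std::uint64_t& out) {
    const auto start = std::chrono::steady_clock::now();
    out = gather_sum(data, addresses);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Both index widths must produce the same sum; only the timings may differ, and they can vary with cache behaviour and the distribution of the indices, so repeat the measurement several times.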

Lev M.
  • "thus trying to save space on array indexes will not work." The OP has all the indexes saved in an array called `addresses`. He is worried about the promotion of `addresses[i]` from `unsigned short` to `size_t` when doing `data[addresses[i]]`. – bolov Nov 21 '21 at 21:42