Micro-optimizations: using intptr_t for flag/bool types

Question

From what I understand, the definition of intptr_t varies by architecture -- it is guaranteed to have the capacity to represent a pointer that can access all of the uniform address space of a process.

Nginx (a popular open-source web server) defines a type that is used as a flag (boolean), and this is a typedef to intptr_t. Now using the x86-64 architecture as an example -- which has access to a plethora of instructions covering operands of all sizes -- why define the flag to be intptr_t? Surely the tradition of using a 32-bit bool type would fit the bill just as well?

I went over the 32-bit vs. 8-bit bools argument myself when I was a new developer, and the conclusion was that 32-bit bools perform better for the common case because of the intricacies of processor design. Why, then, do we need to move to 64-bit bools?

score 2 · Answer 1 · answered Aug 21 '10 at 13:02

The only people who really know why nginx uses intptr_t for a boolean type are the nginx developers.

As you say, 32-bit bools often perform better than 8-bit bools for the common case. I have done no benchmarking myself, but it does not sound unreasonable to me that in certain situations on x86-64 a 64-bit bool beats a 32-bit bool. For example, in the nginx source I noticed that most ngx_flag_t members occur in structs alongside other (u)intptr_t typedef'ed types. A 32-bit bool might not save any space there because of alignment padding.
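To make the padding point concrete, here is a minimal sketch. The struct and typedef names are made up for illustration and are not copied from the nginx source; the only assumption is that the flag sits between pointer-sized members, as described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration only (not from nginx). */
typedef intptr_t flag_wide_t;    /* pointer-sized flag */
typedef int32_t  flag_narrow_t;  /* 32-bit flag */

struct conf_wide {
    intptr_t    timeout;
    flag_wide_t enabled;
    uintptr_t   buffer_size;
};

struct conf_narrow {
    intptr_t      timeout;
    flag_narrow_t enabled;      /* padding follows if uintptr_t needs wider alignment */
    uintptr_t     buffer_size;
};

size_t wide_size(void)   { return sizeof(struct conf_wide); }
size_t narrow_size(void) { return sizeof(struct conf_narrow); }

/* Bytes of padding the compiler inserts between the narrow flag and the next member. */
size_t narrow_padding(void)
{
    return offsetof(struct conf_narrow, buffer_size)
         - offsetof(struct conf_narrow, enabled)
         - sizeof(flag_narrow_t);
}
```

On a typical x86-64 ABI I would expect both structs to come out the same size, with the narrow flag simply followed by padding bytes. The saving only shows up if several narrow flags are placed next to each other.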

I do find the choice of intptr_t a bit odd, as it is an optional C99 type intended for converting to/from void *. But as far as I can see it is never used as such. Perhaps this type gives the best approximation of a 'native' word-sized type?
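For reference, this is the use the type was designed for: a round trip through void *. It is only a small sketch with a made-up function name, not anything nginx does.

```c
#include <stdint.h>

/* Returns 1 if a pointer survives conversion to intptr_t and back.
 * C99 guarantees the result compares equal to the original pointer
 * on implementations that provide intptr_t. */
int roundtrip_ok(void)
{
    int value = 42;
    void *original = &value;

    intptr_t as_integer = (intptr_t)original;  /* pointer -> integer */
    void *restored = (void *)as_integer;       /* integer -> pointer */

    return restored == original && *(int *)restored == 42;
}
```

Because the type has to be able to hold a pointer, it ends up register-sized on the common flat-memory platforms, which is probably why it gets borrowed as a 'native word' type.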

score 1 · Answer 2 · answered Dec 27 '15 at 23:25

64bit bool sounds like a terrible idea for x86-64. I'd guess that the person who wrote it was thinking about 32bit machines with 32bit pointers at the time.

Modern x86 has very good support for unaligned loads/stores, and for unpacking a byte to fill a register on the fly. If x86 is the primary target, an 8bit boolean should be preferred, especially in cases where it saves bytes and so leads to less cache usage. In the rare case where cache is not an issue at all, 32bit is the natural size, and will maybe save an instruction in some cases where a boolean is added to or multiplied directly with an int, by allowing the boolean to be used as a memory operand instead of being loaded with a movzx.
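A rough sketch of the two cases, using hypothetical struct and function names. The comments describe what I would expect a typical optimizing compiler to emit for x86-64; check the actual output with something like `gcc -O2 -S` rather than taking it on faith.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration. */
struct opts8  { uint8_t  on; };
struct opts32 { uint32_t on; };

/* 8-bit flag: the byte usually has to be zero-extended (movzx) before the add. */
int add_flag8(int x, const struct opts8 *o)
{
    return x + o->on;
}

/* 32-bit flag: the add can usually take the flag straight from memory. */
int add_flag32(int x, const struct opts32 *o)
{
    return x + (int)o->on;
}

/* Footprint of n flags stored back to back, in bytes. */
size_t footprint8(size_t n)  { return n * sizeof(uint8_t); }
size_t footprint32(size_t n) { return n * sizeof(uint32_t); }
```

With 64 flags, the byte version occupies 64 bytes while the 32bit version takes 256, which is the cache-usage side of the trade-off.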

For the usual case of test&branch on a boolean, Intel and AMD CPUs have literally zero difference in performance between 8bit and 32bit operands, whether it's in memory or a register.
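The test&branch pattern in question looks like the sketch below (function names are made up). I would expect compilers to use a byte-sized compare for the first and a dword-sized compare for the second, followed by the same conditional branch or cmov, but that is an expectation about typical output, not a guarantee.

```c
#include <stdint.h>

/* Branch on an 8-bit flag held in memory. */
int pick8(const uint8_t *flag, int if_set, int if_clear)
{
    if (*flag)
        return if_set;
    return if_clear;
}

/* Same logic with a 32-bit flag; only the operand size of the test differs. */
int pick32(const uint32_t *flag, int if_set, int if_clear)
{
    if (*flag)
        return if_set;
    return if_clear;
}
```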
