CUDA has popcount intrinsics for 32-bit and 64-bit types: __popc() and __popcll().
Does CUDA also have intrinsics to get the parity of 32-bit and 64-bit types? (Parity refers to whether an integer has an even or odd number of 1-bits.)
For example, GCC has __builtin_parityl() for 64-bit integers (on platforms where long is 64 bits; __builtin_parityll() takes an unsigned long long).
And here's a C function that does the same thing:
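    #include <stdint.h>

    static inline unsigned parity64(uint64_t n) {
        n ^= n >> 1;  // bit i now holds b[i] ^ b[i+1]
        n ^= n >> 2;  // lowest bit of each nibble now holds that nibble's parity
        // Keep one parity bit per nibble, then sum all 16 of them into the top nibble.
        n = (n & 0x1111111111111111ull) * 0x1111111111111111ull;
        return (n >> 60) & 1;  // low bit of that sum is the overall parity
    }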
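
For comparison, parity is just the lowest bit of the popcount, so it can be sketched on top of any popcount primitive. The version below is a minimal host-side sketch that assumes a GCC or Clang compiler (it uses __builtin_popcount() and __builtin_popcountll()); the names parity32_popcount and parity64_popcount are made up for illustration. In CUDA device code, the analogous expressions would presumably be __popc(n) & 1 and __popcll(n) & 1.

```c
#include <stdint.h>

/* Parity of a 32-bit value: low bit of its population count.
   Assumes GCC/Clang, where __builtin_popcount takes an unsigned int. */
static inline unsigned parity32_popcount(uint32_t n) {
    return (unsigned)__builtin_popcount(n) & 1u;
}

/* Parity of a 64-bit value: low bit of its population count.
   Assumes GCC/Clang, where __builtin_popcountll takes an unsigned long long. */
static inline unsigned parity64_popcount(uint64_t n) {
    return (unsigned)__builtin_popcountll(n) & 1u;
}
```

Both functions return 1 when the number of 1-bits is odd and 0 when it is even, which is the same convention as the C function above.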