0
static uint32_t get_num(void);

uint32_t get_count(unsigned int mask)
{
  uint8_t i = 0;
  uint32_t count = 0;

  while (i < get_num())
  {
     if (mask & (1 << i++))
       count++;
  }

  return count;
}

In this code, what would be more safe (1L << i++) or (1UL << i++) ?

Filipe Gonçalves
  • 20,783
  • 6
  • 53
  • 70
Mark
  • 6,052
  • 8
  • 61
  • 129

2 Answers2

1

An unsigned operand is a bit safer because only then is the behavior of all the shifts defined when get_num() returns the number of bits in that operand's type. If unsigned long is wider than unsigned int then UL is slightly safer than just U, but only for accommodating get_num() results that aren't valid anyway.

Safer yet, however, is this:

uint32_t get_count(uint32_t mask)
{
  uint8_t num = get_num();

  if (num == 0) return 0;

  /* mask off the bits we don't want to count */
  mask &= ~((uint32_t) 0) >> ((num < 32) ? (32 - num) : 0);

  /* count the remaining 1 bits in mask, leaving the result in mask */
  mask = (mask & 0x55555555) + ((mask & 0xaaaaaaaa) >> 1);
  mask = (mask & 0x33333333) + ((mask & 0xcccccccc) >> 2);
  mask = (mask & 0x0f0f0f0f) + ((mask & 0xf0f0f0f0) >> 4);
  mask = (mask & 0x00ff00ff) + ((mask & 0xff00ff00) >> 8);
  mask = (mask & 0x0000ffff) + ((mask & 0xffff0000) >> 16);

  return mask;
}
too honest for this site
  • 12,050
  • 4
  • 30
  • 52
John Bollinger
  • 160,171
  • 8
  • 81
  • 157
  • For using exact matching size type in the shift: +1. Better than `U`, `UL` or `L`, etc. – chux - Reinstate Monica May 14 '15 at 22:09
  • I'm afraid there are missing parentheses: `+` binds stronger than `>>` – chqrlie May 14 '15 at 23:48
  • The int-literals should be UL (L to get at least 32 bits, U as this is unsigned). Also I would limit the result, not mask in advance: remove line `mask &= ...` and change `return mask & ((1UL << num) - 1);` safes a conditional and some arithmetic ops. num should either have no stdtype (unsigned int) or one of the fast ones (let arch choose the best). – too honest for this site May 14 '15 at 23:58
  • unsigned long (`UL` for literals) is at least 32 bits, so yes, it is safe to use! If it is 64bits, the compiler can (and will) optimize it to all 32 bits. – too honest for this site May 15 '15 at 00:04
  • @Olaf, the hex literals do not need either U or L. A hex literal's value is always unsigned, and a signed or unsigned type is chosen automatically for it that is sufficiently wide to accommodate its value. `mask will be promoted, if necessary, for the bitwise `&` operations and sums, but this is value-preserving, and the result can safely be assigned back to `mask` without modification or loss. – John Bollinger May 15 '15 at 00:32
  • @Olaf, the decimal literals (and the octal literals) even less need either `U` or `L`. Nothing in the way any of them are used risks undefined behavior, nor can it produce results different from the intended ones. – John Bollinger May 15 '15 at 00:38
  • Didin't talk bout the hexlits (allthough `UL` would give the compiler a hint which may help reporting invalid usage), but the 32 (the width-adjustment line). These will give conversion warnings (if enabled which should be) as being signed by default. Regarding risks: not for this one, but is is good practice in general and once getting used to avoids to forget it in critical situations. Called defensive programming and should be used expecially in weak-typed languages like C (in other words: use all help from the tools you can get). Anyway; not worth the discussion – too honest for this site May 15 '15 at 01:11
0

If you just want to count the 1-bits in an uint and use gcc, you should have a look at the builtin functions (here: int __builtin_popcount (unsigned int x)). These can be expected to be highly optimized and even use special instructions of the CPU where available. (one could very wenn test for gcc).

However, not sure what get_num() would yield - it just seems not to depend on mask, so its output can be used to limit the result of popcount.

The following uses a loop and might be faster than a parallel-add tree on some architectures (one should profile both versions if timing is essential).

unsigned popcount(uint32_t value, unsigned width)
{
    unsigned cnt = 0;   // actual size intentionally by arch

    if ( width < 32 )
        value &= (1UL << width) - 1;  // limit to actual width

    for ( ; value ; value >>= 1 ) {
        cnt += value & 1U;    // avoids a branch
    }
    return cnt;
}

Note the width is passed to the function.

On architectures with < 32 bits (PIC, AVR, MSP430, etc.) specialized versions will gain much better results than a single version.

too honest for this site
  • 12,050
  • 4
  • 30
  • 52