2

I am trying to compute hashes of constant C strings at compile time using macros. Here is my example code:

#include <stddef.h>
#include <stdint.h>

typedef uint32_t hash_t;

#define hash_cstr(s) ({           \
      typeof(sizeof(s)) i = 0;    \
      hash_t h = 5381;            \
      for (; i < sizeof(s) - 1; ) \
        h = h * 33 + s[i++];      \
      h;                          \
    })

/* tests */
#include <stdio.h>

int main() {
#define test(s) printf("The djb2 hash of " #s " is a %u\n", hash_cstr(#s))

  test(POST);
  test(/path/to/file);
  test(Content-Length);
}

Now I run GCC to produce an assembly listing:

arm-none-eabi-gcc-4.8 -S -O2 -funroll-loops -o hash_test.S hash_test.c

The result is as expected: all the strings are eliminated and replaced by their hashes. But I generally compile embedded applications with -Os. When I do that, hashes are computed at compile time only for strings with fewer than four characters. I also tried setting the max-unroll-times parameter and using GCC 4.9:

arm-none-eabi-gcc-4.9 -S -Os -funroll-loops \
  --param max-unroll-times=128 -o hash_test.S hash_test.c

I can't understand the reason for this behavior, or how I can raise this four-character limit.

Kayo
  • Use a `constexpr` function rather than a macro? – M.M Dec 17 '15 at 11:55
  • But `constexpr` is for C++, and the example looks like C code – Basile Starynkevitch Dec 17 '15 at 11:59
  • And the filename is also `.c`. It's not C++. Why was I summoned here? – Ivan Aksamentov - Drop Dec 17 '15 at 12:00
  • @BasileStarynkevitch but the OP explicitly said that he is using C++ by mentioning it, and not C, in the tags, right? – Revolver_Ocelot Dec 17 '15 at 12:03
  • I guess that since the code is purely C99, the file suffix is `.c`, hence the OP wants in fact C (not C++) and might have tagged his question wrongly. – Basile Starynkevitch Dec 17 '15 at 12:05
  • Of course, I use pure C. – Kayo Dec 17 '15 at 12:08
  • The [GCC documentation on optimization flags](https://gcc.gnu.org/onlinedocs/gcc-4.9.2/gcc/Optimize-Options.html#Optimize-Options) made me believe `-Os` and `-O2` only enable/disable `-fxxx` flags. I tried to compile with all optimization flags enabled and then disable them one by one until I found a combination close enough to `-Os` for the OP. To my surprise, it didn't work: `gcc -S -fauto-inc-dec -fcprop-registers -fdce -fdefer-pop ... -fomit-frame-pointer -funroll-loops -o hash_test.S hash_test.c && cat hash_test.S` : there is no hash :( – YSC Dec 18 '15 at 14:41
  • I ended up implementing a preprocessing tool, because in the general case I need static const hashes. Advanced compile-time evaluation is a weak spot of C. – Kayo Dec 22 '15 at 16:33

3 Answers

2

I suggest putting the relevant code in a separate file and compiling that file with -O2 (not with -Os). Alternatively, put a function-specific pragma such as

 #pragma GCC optimize ("-O2")

before the function, or use a function attribute such as __attribute__((optimize("O2"))) (the pure attribute is probably also relevant).
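The function attribute mentioned above could be applied roughly as follows. This is only a sketch: the names `hash_o2` and `check_hash_o2` are made up for illustration, the attribute is GCC-specific (the GCC manual describes `optimize` as intended mainly for debugging), and raising the optimization level of one function does not by itself guarantee that calls to it are folded into constants.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t hash_t;

/* Ask GCC to optimize just this function at -O2, regardless of the
   level given on the command line. "pure" tells the compiler the
   result depends only on the arguments and on memory it reads.
   hash_o2 is a hypothetical name, not part of the question's code. */
__attribute__((optimize("O2"), pure))
static hash_t hash_o2(const char *s) {
  hash_t h = 5381;
  for (; *s; s++)
    h = h * 33 + *s;
  return h;
}

/* The value must match the plain djb2 loop whatever flags are used. */
static void check_hash_o2(void) {
  assert(hash_o2("POST") == 2089437419u);
}
```

Inspecting the `-S` listing, as in the question, is still the way to see whether the call was actually replaced by a constant.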

You might be interested in __builtin_constant_p.

I would make your hashing code a static inline function (perhaps with the always_inline function attribute), e.g.

 static inline hash_t hashfun(const char*s) {
    hash_t h = 5381;
    for (const char* p = s; *p; p++) 
      h = h * 33 + *p;
    return h;
 }
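To find out whether GCC really folded such a call, the `__builtin_constant_p` builtin mentioned above can be wrapped around it. The sketch below assumes GCC or a compatible compiler; `HASH_IS_FOLDED` and `report_folding` are hypothetical names, and the result depends on the compiler version and optimization level, so treat it as a diagnostic rather than a guarantee.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t hash_t;

/* Same djb2 loop as above, with always_inline added so that the
   compiler sees the string literal at every call site. */
static inline __attribute__((always_inline))
hash_t hashfun(const char *s) {
  hash_t h = 5381;
  for (const char *p = s; *p; p++)
    h = h * 33 + *p;
  return h;
}

/* Hypothetical helper: evaluates to 1 if the compiler could prove the
   hash of this literal is a compile-time constant, and to 0 otherwise
   (typically 0 when optimization is disabled). */
#define HASH_IS_FOLDED(lit) __builtin_constant_p(hashfun(lit))

static int report_folding(void) {
  return HASH_IS_FOLDED("POST");
}

/* The hash value itself does not depend on whether folding happened. */
static void check_hashfun(void) {
  assert(hashfun("POST") == 2089437419u);
}
```

Printing `report_folding()` under `-O2` and `-Os` is a quick way to compare the two settings without reading the assembly.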

A more portable (and less brittle) alternative is to change your build procedure to generate a C file (e.g. with a simple awk or Python script, or even an ad-hoc C program) containing things like

  const char str1[]="POST";
  hash_t hash1=2089437419; // the hash code of str1

Don't forget that .c or .h files can be generated by another program (you just need to add rules to your Makefile to generate them); if your boss feels uneasy about that, show them the Wikipedia page on metaprogramming.
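As a rough idea of what such an ad-hoc generator could look like in C, here is a minimal sketch. The helper names `djb2` and `emit_entry` are invented for this example, and the string is assumed to contain no characters that need escaping inside a C literal. A real tool would read its strings from some input and write the entries to the generated file.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef uint32_t hash_t;

/* djb2 evaluated on the build host. The cast to unsigned char keeps
   the result independent of the host's char signedness (plain char is
   unsigned on ARM EABI targets, signed on many hosts). */
static hash_t djb2(const char *s) {
  hash_t h = 5381;
  for (; *s; s++)
    h = h * 33 + (unsigned char)*s;
  return h;
}

/* Hypothetical helper: formats one generated entry into buf and
   returns what snprintf returns. No escaping of s is attempted. */
static int emit_entry(char *buf, size_t size, unsigned idx, const char *s) {
  return snprintf(buf, size,
                  "const char str%u[] = \"%s\";\n"
                  "const hash_t hash%u = %" PRIu32 "u; /* djb2 of str%u */\n",
                  idx, s, idx, djb2(s), idx);
}

/* Sanity check against the value quoted above for "POST". */
static void check_generator(void) {
  char buf[128];
  emit_entry(buf, sizeof buf, 1, "POST");
  assert(strstr(buf, "hash1 = 2089437419u;") != NULL);
}
```

A Makefile rule would then run the generator and write its output to a `.c` or `.h` file that the rest of the build includes.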

Basile Starynkevitch
  • Yes, I use `__builtin_constant_p` in the real code, but this is a simplified example to test whether the hash can be calculated at compile time. – Kayo Dec 17 '15 at 12:16
  • The `static inline` function is not actually evaluated at compile time with GCC < 4.9. – Kayo Dec 17 '15 at 12:45
1

It seems I have found a workaround, although it is limited by string length. It looks like a dirty hack, but it works as expected with any GCC toolchain.

#define _hash_cstr_4(s, o)                \
  for (; i < ((o + 4) < sizeof(s) - 1 ?   \
              (o + 4) : sizeof(s) - 1); ) \
    h = h * 33 + s[i++]

#define _hash_cstr_16(s, o)           \
  _hash_cstr_4(s, o);                 \
  _hash_cstr_4(s, o + 4);             \
  _hash_cstr_4(s, o + 8);             \
  _hash_cstr_4(s, o + 12)

#define _hash_cstr_64(s, o)           \
  _hash_cstr_16(s, o);                \
  _hash_cstr_16(s, o + 16);           \
  _hash_cstr_16(s, o + 32);           \
  _hash_cstr_16(s, o + 48)

#define _hash_cstr_256(s, o)          \
  _hash_cstr_64(s, o);                \
  _hash_cstr_64(s, o + 64);           \
  _hash_cstr_64(s, o + 128);          \
  _hash_cstr_64(s, o + 192)

#define hash_cstr(s) ({                  \
      typeof(sizeof(s)) i = 0;           \
      hash_t h = 5381;                   \
      if (sizeof(s) - 1 < 256) {         \
        _hash_cstr_256(s, 0);            \
      } else                             \
        for (; i < sizeof(s) - 1; )      \
          h = h * 33 + s[i++];           \
      h;                                 \
    })

When the hashed string is shorter than 256 characters, the hash is calculated at compile time; otherwise it is calculated at run time.

This solution does not require any additional compiler tuning. It works with -Os and -O1 too.
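Whatever the compiler ends up doing with the chunked loops, the result should equal the plain djb2 loop. The sketch below is a scaled-down variant (chunks of 4, up to 16 characters, then a runtime tail) used only to check that equivalence. `HASH_CHUNK_4`, `hash_cstr16`, `hash_ref` and `check_chunked` are names invented here, and the statement expression is a GNU extension, as in the macros above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t hash_t;

/* One chunk: advance i up to min(o + 4, string length). Each loop has
   a small constant upper bound on its trip count. */
#define HASH_CHUNK_4(s, o)                                          \
  for (; i < ((o) + 4 < sizeof(s) - 1 ? (o) + 4 : sizeof(s) - 1); ) \
    h = h * 33 + (s)[i++]

/* Four chunks cover the first 16 characters; the final loop handles
   whatever is left of a longer string. */
#define hash_cstr16(s) ({            \
      size_t i = 0;                  \
      hash_t h = 5381;               \
      HASH_CHUNK_4(s, 0);            \
      HASH_CHUNK_4(s, 4);            \
      HASH_CHUNK_4(s, 8);            \
      HASH_CHUNK_4(s, 12);           \
      for (; i < sizeof(s) - 1; )    \
        h = h * 33 + (s)[i++];       \
      h;                             \
    })

/* Reference implementation: the straightforward loop. */
static hash_t hash_ref(const char *s) {
  hash_t h = 5381;
  for (; *s; s++)
    h = h * 33 + *s;
  return h;
}

static void check_chunked(void) {
  assert(hash_cstr16("POST") == hash_ref("POST"));
  assert(hash_cstr16("Content-Length") == hash_ref("Content-Length"));
  assert(hash_cstr16("a string longer than sixteen") ==
         hash_ref("a string longer than sixteen"));
}
```

Whether each chunk is actually folded at `-Os` still has to be confirmed from the assembly listing for the toolchain in use.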

Kayo
0

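If C++ is allowed, give a template function a chance, something like:

template<int I>
inline hash_t hash_rec(const char* str, hash_t h) {
  // Consume one character, then recurse on the remaining I-1 characters.
  return hash_rec<I-1>(str + 1, h * 33 + *str);
}

// The full specialization ends the recursion; without it the template
// would be instantiated without bound.
template<>
inline hash_t hash_rec<0>(const char*, hash_t h) {
  return h;
}

// sizeof(str) - 1 skips the terminating NUL, as in the C macro.
#define hash(str) hash_rec<sizeof(str) - 1>(str, 5381)

h = hash(str);  // str must be a string literal or a char array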
ufok