
I have a critical section of code which examines each char in many strings to ensure it falls in an acceptable range.

Is there any way I can perform such filtering without branching?

...
int i, c;
int sl = strnlen(s, 1023);
for( i = 0; i < sl; i++ ) {
    c = s[i];
    if( c < 68 || c > 88 )
        return E_INVALID;
}
if( 0 == i )
    return E_INVALID;
... do something with s ...

I was thinking some kind of filtering using bitwise operations might be possible, but in practice I can't see how to make this work. Bitwise AND with 95 trims the range down to 0-31 and 64-95. I can't see how to progress from there without introducing an if test, which defeats the point of avoiding the branch.

CraigJPerry
  • What about using regex? http://stackoverflow.com/questions/1085083/regular-expressions-in-c-examples It only works on a POSIX system, however. – Zagorax Oct 10 '12 at 19:24

2 Answers


Assuming your strings are really unsigned chars, not ints, you could use a 256-byte lookup table that flags the unacceptable characters, which would make your test if(table[s[i]]) { return E_INVALID; }
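A minimal sketch of the lookup-table idea, under some assumptions: the names reject_table, init_reject_table and validate_with_table are made up for illustration, and the values of E_OK and E_INVALID are placeholders because the question does not show them. The table is filled once from the 68..88 range in the question, and the caller passes in the length it already has.

```c
#include <stddef.h>

#define E_OK      0     /* hypothetical success code */
#define E_INVALID (-1)  /* placeholder; the question does not give its value */

/* reject_table[c] is 1 for an unacceptable byte, 0 for an acceptable one. */
static unsigned char reject_table[256];

/* Fill the table once, before the first validation call. */
static void init_reject_table(void)
{
    int c;
    for (c = 0; c < 256; c++)
        reject_table[c] = (unsigned char)(c < 68 || c > 88);
}

/* Validate len bytes of s; an empty string is rejected, as in the question. */
static int validate_with_table(const unsigned char *s, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++) {
        if (reject_table[s[i]])
            return E_INVALID;
    }
    return len == 0 ? E_INVALID : E_OK;
}
```

Note that the early return is still a branch. The table replaces the two comparisons per character with one load, which also makes it easy to accept a non-contiguous set of characters later.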

However, if you are trying to speed up a critical function, there are other changes with a much bigger payoff. To start, you can skip the strnlen entirely and terminate the loop on a 0 char. That alone will probably get you a factor of 2. Next, unroll the loop by a factor of 10 or so, which ought to get another factor of 2.
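As a rough sketch of the first suggestion, the loop below drops strnlen and stops on the terminating 0 char instead, so the string is walked once rather than twice. It keeps the 1023-character cap from the question as a loop bound. The function name validate_single_pass and the values of E_OK and E_INVALID are assumptions for illustration, and unrolling is left out (an optimizing compiler may or may not do it for you).

```c
#include <stddef.h>

#define E_OK      0     /* hypothetical success code */
#define E_INVALID (-1)  /* placeholder; the question does not give its value */
#define MAX_LEN   1023  /* same cap the question passes to strnlen */

/* Check each char and find the end of the string in the same pass. */
static int validate_single_pass(const char *s)
{
    size_t i;
    for (i = 0; i < MAX_LEN && s[i] != '\0'; i++) {
        int c = s[i];
        if (c < 68 || c > 88)
            return E_INVALID;
    }
    /* i == 0 means the string was empty, which the question rejects. */
    return i == 0 ? E_INVALID : E_OK;
}
```

Whether this actually yields a factor of 2 depends on the strings and the compiler, so it is worth measuring before and after.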

ddyer
  • The lookup table idea is really useful. The code as presented isn't exactly as used (I had to condense it for clarity), but this idea fits in brilliantly. – CraigJPerry Oct 11 '12 at 19:31

It is possible to filter using bitwise operations. Try...

c & 68 & ~88;

This should always return zero for values outside the boundary and a non-zero value for values inside your boundary.

The order is necessary too...

CHAR & LowerBound & ~UpperBound

Flipping the boundaries would result in wrong behaviours

Igbanam
  • I can't seem to replicate this: unsigned char t; for( t=0; t<255; t++ ) { printf( "%d = %d\n", t, t & 68 & ~88 ); } Results in: 0 = 0 1 = 0 2 = 0 3 = 0 4 = 4 5 = 4 6 = 4 7 = 4 8 = 0 – CraigJPerry Oct 11 '12 at 19:27
  • It worked for select cases from 50 - 100 when I did it. Let me look through and see what's causing issues – Igbanam Oct 11 '12 at 19:39
  • Because the bits are not sequential, different numbers may have ones that overlap the required segment. I still believe this could be done, but the amount of time it takes to come up with a solution is, at the moment, boundless. – Igbanam Oct 12 '12 at 18:23
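The output in the first comment can be explained by working out the constant part of the mask: 68 is 01000100 in binary and ~88 ends in 10100111, so 68 & ~88 is 4 and the whole expression reduces to c & 4. The sketch below shows that, plus one commonly used alternative that is not from this thread: subtract the lower bound as an unsigned value and compare against the width of the range. The names masked, in_range and validate_accumulate are made up for illustration, and E_OK and E_INVALID are placeholder values.

```c
#include <stddef.h>

#define E_OK      0     /* hypothetical success code */
#define E_INVALID (-1)  /* placeholder; the question does not give its value */

/* The suggested expression; 68 & ~88 == 4, so this only tests bit 2 of c. */
static unsigned masked(unsigned c)
{
    return c & 68u & ~88u;
}

/* 1 if 68 <= c <= 88, else 0. Values below 68 wrap around to very large
   unsigned numbers, so a single comparison covers both bounds. */
static unsigned in_range(unsigned c)
{
    return (c - 68u) <= 20u;
}

/* OR the per-character failures together and test once after the loop. */
static int validate_accumulate(const unsigned char *s, size_t len)
{
    unsigned bad = 0;
    size_t i;
    for (i = 0; i < len; i++)
        bad |= in_range(s[i]) ^ 1u;
    return (len == 0 || bad) ? E_INVALID : E_OK;
}
```

This removes the per-character if from the source, at the cost of always scanning the whole string. Whether the generated machine code is actually branch-free is up to the compiler, so it is worth checking the assembly.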