
Given two unsigned integers, what is the fastest way to count the number of matching digits in their base 4 representation?

Example 1:

A = 13 = (31) in base 4

B = 15 = (33) in base 4

The number of matching digits in base 4 is 1.

Example 2:

A = 163 = (2203) in base 4

B = 131 = (2003) in base 4

The number of matching digits in base 4 is 3.

The first step, I guess, is to compute the bitwise XOR of the two integers; then we have to count the number of 00 bit pairs, one pair per base-4 digit. What is the most efficient way to do that?

Note: assume that A and B have a fixed number of digits in base 4, say exactly 16 digits.
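As a baseline for comparing the faster approaches, here is a minimal sketch of the digit-by-digit count and of the XOR idea described above. The function names are made up for illustration, and both assume 32-bit unsigned inputs holding exactly 16 base-4 digits. Note that with a fixed width of 16 digits, matching leading zeros are counted too, so the inputs of example 1 give 15 rather than 1.

```c
#include <stdint.h>

/* Hypothetical reference version: compare the 16 base-4 digits one by one. */
unsigned count_matching_digits_naive(uint32_t a, uint32_t b)
{
    unsigned count = 0;
    for (int i = 0; i < 16; i++) {
        if ((a & 3u) == (b & 3u))   /* lowest base-4 digit of each */
            count++;
        a >>= 2;                    /* move on to the next digit */
        b >>= 2;
    }
    return count;
}

/* Same result via XOR: a digit matches exactly when its 2-bit pair in a^b is 00. */
unsigned count_matching_digits_xor(uint32_t a, uint32_t b)
{
    uint32_t c = a ^ b;
    unsigned count = 0;
    for (int i = 0; i < 16; i++) {
        if (((c >> (2 * i)) & 3u) == 0)
            count++;
    }
    return count;
}
```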

Gangnus
mghandi

2 Answers


Suppose your ints are 4 bytes each, i.e. 32 bits, so each one holds 16 base-4 digits.

The more understandable way uses a constant helper array of masks, one per base-4 digit:

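unsigned int h[16];
h[0]=3;                       // mask for the lowest digit (binary 11)
for (int i=1; i<16; i++){
  h[i]=h[i-1]*4;              // shift the mask up by one digit
}

Later, for the check, if c is the (unsigned) integer after the bitwise XOR:

int count=0;
for (int i=0; i<16; i++){
  if((c&h[i])==0)count++;     // parentheses needed: == binds tighter than &
}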

Another solution, faster but a bit less understandable, looks up each 2-bit digit of c in a small table:

int h[4]={1,0,0,0};

int count=0;
for (int i=0; i<16; i++){
  count+=h[c&3];
  c=c>>2;
}

A further speedup comes from unrolling the loop (16 terms, one per digit):

count= h[c&3] + h[(c>>=2)&3] + h[(c>>=2)&3] + h[(c>>=2)&3]
     + h[(c>>=2)&3] + h[(c>>=2)&3] + h[(c>>=2)&3] + h[(c>>=2)&3]
     + h[(c>>=2)&3] + h[(c>>=2)&3] + h[(c>>=2)&3] + h[(c>>=2)&3]
     + h[(c>>=2)&3] + h[(c>>=2)&3] + h[(c>>=2)&3] + h[(c>>2)&3];

Going even further, process two digits (4 bits) per lookup:

int h[16]={2,1,1,1, 1,0,0,0, 1,0,0,0, 1,0,0,0};
count= h[c&15] + h[(c>>=4)&15] + h[(c>>=4)&15] + h[(c>>=4)&15] + h[(c>>=4)&15] + h[(c>>=4)&15] + h[(c>>=4)&15] + h[(c>>4)&15];

If you really need to call the function that many (10^10) times, precompute h[256] (you have already seen how) and use:

count= h[c&255] + h[(c>>=8)&255] + h[(c>>=8)&255] + h[(c>>8)&255];
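The 256-entry table can be generated with a short loop rather than typed in by hand. The following is only a sketch: `h256`, `init_h256` and `count_matching_digits_lut8` are hypothetical names, and 32-bit unsigned inputs are assumed. It uses constant shifts instead of shifting `c` inside the sum, so the result does not depend on the order in which the terms are evaluated.

```c
#include <stdint.h>

/* h256[v] = number of zero base-4 digits among the 4 digits of byte v. */
static uint8_t h256[256];

/* Hypothetical one-time setup; must run before the lookup function is used. */
void init_h256(void)
{
    for (unsigned v = 0; v < 256; v++) {
        unsigned zeros = 0;
        for (unsigned shift = 0; shift < 8; shift += 2) {
            if (((v >> shift) & 3u) == 0)
                zeros++;
        }
        h256[v] = (uint8_t)zeros;
    }
}

/* Count matching base-4 digits with four byte-wide table lookups. */
unsigned count_matching_digits_lut8(uint32_t a, uint32_t b)
{
    uint32_t c = a ^ b;   /* a 00 pair in c marks a matching digit */
    return (unsigned)h256[c & 255u]
         + h256[(c >> 8) & 255u]
         + h256[(c >> 16) & 255u]
         + h256[c >> 24];
}
```

Whether this beats the smaller tables depends on the machine, so it is worth benchmarking each variant on the target hardware.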

I think a helper array h[256*256] would still be usable. Then

count= h[c&(256*256-1)] + h[(c>>16)&(256*256-1)];

An array of 2^16 ints (256 KB with 4-byte ints) should fit entirely in the processor cache (third level, though), so the speed should be really good.

BenMorel
Gangnus
  • That is a correct solution. However, I want to do this ~10^10 times, and I was wondering whether more efficient ways exist. Thank you anyway. – mghandi Feb 11 '12 at 22:18
  • The quickest solution would use a huge array that gives the count for every possible c. – Gangnus Feb 11 '12 at 22:30
  • I also thought a lookup table would be faster, but in practice, given current computer architectures, it looks more efficient to compute it locally (in the CPU) without going to main memory. – mghandi Feb 11 '12 at 22:43
  • An array of 2^16 ints should fit entirely in the processor cache (third level, though), so the speed should be really good. – Gangnus Feb 11 '12 at 22:48
  • You need to split those sums up a bit. At the moment, the result is undefined, because the order of evaluation is not guaranteed; `(c>>=8)&255` could potentially be evaluated *before* `c&255`. – Nick Barnes Feb 12 '12 at 00:51
  • Regarding cache latency, L3 cache is ~5 times slower than L2, and ~10 times slower than L1. So you may be better off with twice as many memory accesses on a smaller array. But the only way to know for sure is to benchmark them all. – Nick Barnes Feb 12 '12 at 00:55
  • @Nick Barnes Without any doubt. I meant the text as an algorithm, not real code. Anyway, the language wasn't mentioned. – Gangnus Feb 12 '12 at 17:31

One solution is to use a set-bit counting algorithm, as suggested by Oli. To adapt it to base 4, we can do, for example:

d = x^y;

d = (d | (d>>1)) & 1431655765; // 1431655765 = 0x55555555 = (01010101010101010101010101010101) in base 2

Then count the number of set bits in d. This gives the number of mismatches, so with 16 digits the number of matches is 16 minus that count.
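Put together, the steps above might look like the sketch below. The names `popcount32` and `count_matching_digits_bits` are illustrative only, 32-bit unsigned inputs with 16 base-4 digits are assumed, and the simple clear-lowest-bit loop stands in for whichever set-bit counting method is preferred (for instance, GCC and Clang provide `__builtin_popcount`).

```c
#include <stdint.h>

/* Portable set-bit count: clears the lowest set bit on each iteration. */
static unsigned popcount32(uint32_t x)
{
    unsigned n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Hypothetical wrapper: number of matching base-4 digits out of 16. */
unsigned count_matching_digits_bits(uint32_t a, uint32_t b)
{
    uint32_t d = a ^ b;                 /* nonzero 2-bit pair = mismatching digit */
    d = (d | (d >> 1)) & 0x55555555u;   /* fold each pair into its low bit */
    return 16u - popcount32(d);         /* mismatches subtracted from 16 digits */
}
```

For the inputs of example 1 (13 and 15), exactly one digit differs, so this returns 15 once the 14 matching leading zero digits are included.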

Is this the most efficient way, though?

mghandi