Several optimizations are possible:
You can trade the modulo for a multiply, which is often faster: q= a / 10; m= a - 10 * q;
You can avoid the final counting loop by packing all flags in a single integer, say mask. Initialize it with mask= 0; every time you find a digit (m), flag it with mask|= (1 << m); in the end, the count is given by bits[mask], where bits is an array containing the precomputed bit counts for all integers from 0 to 1023 = 2^10-1.
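    int distinct(long long int a)
    {
        int mask= 0; // Bit d is set once digit d has been seen
        while (a)
        {
            long long int q= a / 10; // As wide as a, so large values are not truncated
            int m= (int)(a - 10 * q); // Last decimal digit, obtained without a modulo
            mask|= 1 << m;
            a= q;
        }
        static short bits[1024]= { 0, 1, 1, 2, 1, 2, 2, 3, ...}; // Number of bits set
        return bits[mask];
    }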
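The initializer of bits is elided above. One way to avoid typing the 1024 entries by hand is to fill the table once at first use. The following is only a minimal sketch under a few assumptions: the names make_bits and distinct_digits are hypothetical, the input is assumed non-negative, and the table uses the recurrence "bit count of i = bit count of i/2 plus the lowest bit of i".

```cpp
#include <array>

// Hypothetical helper: bits[i] = number of set bits in i, for i in 0..1023.
static std::array<short, 1024> make_bits()
{
    std::array<short, 1024> bits{};
    for (int i = 1; i < 1024; ++i)
        bits[i] = static_cast<short>(bits[i >> 1] + (i & 1));
    return bits;
}

// Same loop as above, with the table generated instead of spelled out.
// Assumes a >= 0; a negative value would produce a negative shift count.
int distinct_digits(long long a)
{
    static const std::array<short, 1024> bits = make_bits(); // built once
    int mask = 0;
    while (a)
    {
        long long q = a / 10;
        int m = static_cast<int>(a - 10 * q); // last decimal digit
        mask |= 1 << m;
        a = q;
    }
    return bits[mask];
}
```

Note that with this loop an input of 0 yields 0, because the loop body never runs.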
Even better, you can work with digits in groups, say of three. Instead of converting to base 10, convert to base 1000. For every base-1000 "digit", look up a precomputed mask that flags its constituent decimal digits (for instance, 535 yields the mask 1<<5 | 1<<3 | 1<<5 = 40).
This should be about three times faster. Leading zeroes need some care, though: in the most significant triple, zeroes that are only padding must not be flagged. One way is to provide a separate array of masks for the leading triple (..1 vs 001).
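    int distinct(long long int a)
    {
        int mask= 0;
        while (true)
        {
            long long int q= a / 1000; // As wide as a, so large values are not truncated
            int m= (int)(a - 1000 * q); // Last three decimal digits, 0..999
            if (q == 0)
            {
                // Most significant triple: padding zeroes are not flagged
                static short leading[1000]= { 1, 2, 4, 8, 16, 32, 64, ...}; // Mask for the leading triples
                mask|= leading[m];
                break;
            }
            else
            {
                // Any other triple: all three digits count, zeroes included
                static short triple[1000]= { 1, 3, 5, 9, 17, 33, 65, ...}; // Mask for the ordinary triples
                mask|= triple[m];
                a= q;
            }
        }
        static short bits[1024]= { 0, 1, 1, 2, 1, 2, 2, 3, ...}; // Number of bits set
        return bits[mask];
    }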
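The two mask tables are also elided above. As a rough sketch of how they could be generated rather than written out, the code below builds both from one helper. The names digit_mask, make_table, make_popcount_table and distinct_digits3 are hypothetical, and the input is assumed non-negative.

```cpp
#include <array>

// Hypothetical helper: mask of the decimal digits of v (0..999).
// pad == true : v is read as exactly three digits (7 -> "007").
// pad == false: leading zeroes are dropped (7 -> "7", 0 -> "0").
static short digit_mask(int v, bool pad)
{
    int mask = 0;
    int count = 0;
    do
    {
        mask |= 1 << (v % 10);
        v /= 10;
        ++count;
    } while (pad ? count < 3 : v != 0);
    return static_cast<short>(mask);
}

static std::array<short, 1000> make_table(bool pad)
{
    std::array<short, 1000> t{};
    for (int v = 0; v < 1000; ++v)
        t[v] = digit_mask(v, pad);
    return t;
}

// popcount[i] = number of set bits in i, for i in 0..1023.
static std::array<short, 1024> make_popcount_table()
{
    std::array<short, 1024> t{};
    for (int i = 1; i < 1024; ++i)
        t[i] = static_cast<short>(t[i >> 1] + (i & 1));
    return t;
}

// Base-1000 variant of the loop above. Assumes a >= 0.
int distinct_digits3(long long a)
{
    static const std::array<short, 1000> triple = make_table(true);
    static const std::array<short, 1000> leading = make_table(false);
    static const std::array<short, 1024> bits = make_popcount_table();
    int mask = 0;
    while (true)
    {
        long long q = a / 1000;
        int m = static_cast<int>(a - 1000 * q);
        if (q == 0)
        {
            mask |= leading[m]; // most significant triple
            break;
        }
        mask |= triple[m];      // ordinary triple, zeroes count
        a = q;
    }
    return bits[mask];
}
```

The first entries produced this way agree with the ones listed above (leading starts 1, 2, 4, ... and triple starts 1, 3, 5, ...). Unlike the digit-by-digit version, this loop reports one distinct digit for an input of 0, since leading[0] flags the digit 0.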
Use static arrays to make sure they are initialized only once, rather than on every call.