Given a string, and a fixed length l, how can I count the number of distinct substrings whose length is l? The size of character set is also known. (denote it as s) For example, given a string "PccjcjcZ", s = 4, l = 3, then there are 5 distinct substrings: “Pcc”; “ccj”; “cjc”; “jcj”; “jcZ”
I try to use hash table, but the speed is still slow. In fact I don't know how to use the character size. I have done things like this
int diffPatterns(const string& src, int len, int setSize) {
int cnt = 0;
node* table[1 << 15];
int tableSize = 1 << 15;
for (int i = 0; i < tableSize; ++i) {
table[i] = NULL;
}
unsigned int hashValue = 0;
int end = (int)src.size() - len;
for (int i = 0; i <= end; ++i) {
hashValue = hashF(src, i, len);
if (table[hashValue] == NULL) {
table[hashValue] = new node(i);
cnt ++;
} else {
if (!compList(src, i, table[hashValue], len)) {
cnt ++;
};
}
}
for (int i = 0; i < tableSize; ++i) {
deleteList(table[i]);
}
return cnt;
}