How could I do frequency analysis on a string without using a switch

Question

I am working on a school project to implement Huffman coding on text. The first part, of course, requires a frequency analysis of the text. Is there a better way to do it than a giant switch and an array of counters?

i.e.:

int counters[/* one per character */];

for (int i = 0; i < inString.length(); i++)
{
    switch (inString[i])
    {
    case 'A':
        counters[0]++;
        break;
    .
    .
    .
    }
}

I would like to cover all alphanumeric characters and punctuation. I am using C++.

score 8 · Accepted Answer · answered Feb 28 '10 at 04:14


Why not:

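int counters[256] = {0};
for (int i = 0; i < inString.length(); i++) {
    // Cast so that a character above 127 cannot produce a negative index
    // on platforms where char is signed.
    counters[(unsigned char)inString[i]]++;
}

std::cout << "Count occurrences of 'a': " << counters['a'] << std::endl;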


Alexander Gessler


score 6 · Answer 2 · answered Feb 28 '10 at 04:14


You can use an array indexed by character:

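int counters[256] = {0};  // every count must start at zero
for (int i = 0; i < inString.length(); i++) {
    counters[(unsigned char)inString[i]]++;
}

The counters array has to be initialised to zero, as done here. The cast to unsigned char matters where char is signed: without it, a character above 127 would give a negative index.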
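For a Huffman project it may be handy to wrap the array approach in a function. The following is only a sketch: the name `countBytes` and the choice of `std::array` with `std::size_t` counters are assumptions of mine, not part of the answer, and it assumes an 8-bit `char`.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Hypothetical helper: returns one counter per possible byte value.
// Assumes char is 8 bits wide, so 256 slots cover every value.
std::array<std::size_t, 256> countBytes(const std::string& text) {
    std::array<std::size_t, 256> counters{};  // value-initialised: all zeros
    for (unsigned char c : text) {            // conversion keeps the index in 0..255
        ++counters[c];
    }
    return counters;
}
```

With this, `countBytes("hello")[static_cast<unsigned char>('l')]` is 2, and any character that never occurs keeps a count of 0.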


Greg Hewgill


And for those of us playing the optimization game at home for fun, `for (int i = inString.length()-1; i >= 0 ; i--)` instead. – Amber Feb 28 '10 at 04:16

@Dav: if you want to optimize, lift the call to `inString.length()` out of the loop instead. Counting backwards is more often counterproductive, simply because your cache may not expect that -- and a single cache miss will cost more than a lot of comparisons. – Jerry Coffin Feb 28 '10 at 05:45
It's more the fact that moving it from the conditional to the initializer results in fewer function calls to `.length()`. But yes, moving it out of the loop also works fine. – Amber Feb 28 '10 at 06:51
I usually write that as `for (int i = 0, imax = inString.length(); i < imax; i++)`. – Roland Illig Jun 13 '10 at 09:29
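The hoisting suggested in these comments can be sketched as follows. The function name `tally` and its signature are hypothetical, chosen only for illustration. Since `std::string::length()` is a trivial member that compilers can often inline, the change may make no measurable difference, so it is worth measuring before relying on it.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts into a caller-supplied array of 256 counters.
// The caller is responsible for zeroing the array first.
void tally(const std::string& inString, int (&counters)[256]) {
    // length() is evaluated once, in the loop initializer, and the loop
    // still walks forwards through the string.
    for (std::size_t i = 0, imax = inString.length(); i < imax; i++) {
        counters[(unsigned char)inString[i]]++;
    }
}
```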

score 2 · Answer 3 · answered Feb 28 '10 at 04:38


Using a map seems completely applicable:

std::map<char, int> chcount;
for (int i = 0; i < inString.length(); i++) {
    char t = inString[i];
    chcount[t]++;  // operator[] inserts a zero count for a new key
}
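As a self-contained sketch of the map approach (the function name `countChars` and the `std::size_t` counter type are assumptions for illustration, not part of the answer):

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: counts only the characters that actually occur.
std::map<char, std::size_t> countChars(const std::string& text) {
    std::map<char, std::size_t> counts;
    for (char c : text) {
        ++counts[c];  // operator[] value-initialises a missing entry to 0
    }
    return counts;
}
```

For the input "abca" the map holds three entries: 'a' with 2, 'b' with 1 and 'c' with 1. Because only characters that occur are stored, the result can be iterated directly when creating the leaves of the Huffman tree, at the cost of a slower lookup than a plain array.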


dagoof



This is particularly true if you venture beyond the world of nationalized character sets into the big, wide world of Unicode. – Jerry Coffin Feb 28 '10 at 05:46
