
I'm trying to count the frequency of letters in the lines of a text file and store the counts in an array the size of the entire alphabet. I hope I've designed it so that upper and lower case don't matter. After this, I want to treat the most frequent letter as the 'e' of that alphabet (since e is the most frequent letter in many languages) and find the difference between the most frequent letter and e. It seems to make sense in my mental walkthrough, but for some reason my compiler gives me a breakpoint and doesn't allow me to check it at all, so I'm not sure what's wrong. So please forgive me for not posting an SSCCE. Thanks in advance for helping!

#include <iostream>
#include <fstream> 

using namespace std;

int main()
{
    int alpharay[26]; 
    for (int i = 0; i < 26; i++) 
    {
        alpharay[i] = 0;
    }
    ifstream input; 
    cout << "File name (.txt): ";
    string fileName;
    cin >> fileName;
    input.open(fileName.c_str()); 
    while (!input.eof())
    {
        string newLine;
        getline (input, newLine); 
        for (int i = 0; i < newLine.length(); i++)
        {
            if (isalpha(newLine[i]))
            {
                int index;
                if (isupper(newLine[i]))
                {
                    index = newLine[i] - 'A';
                    alpharay[index]++; 
                }
                else if (islower (newLine[i]))
                {
                    index = newLine[i] - 'a'; 
                    alpharay[index]++; 
                }

            }

        }
    }
    //To find the largest value in array
    int largest = 0;
    char popular;
    for (int i = 0; i < 26; i++)
    {
        if (alpharay[i]>=largest)
        {
            largest = alpharay[i]; 
            popular = 'a' + i; 
        }
    }
    //To find the size of the shift
    int shift = popular - 'e';
    cout << "Shift size: " << shift << endl;
    return 0;
}
shoestringfries

2 Answers

1

Problem 1:

input.open(fileName.c_str()); 
while (!input.eof())

Need a check to see if the file opened at all. If the file does not open, you will never get an EOF.

input.open(fileName.c_str()); 
if (input.is_open())
{
    while (!input.eof())
    // rest of your code
}
else
{
    cout << "Couldn't open file " << fileName << endl;
}

But this only bandages the problem. There is a lot more that can happen to a file than just EOF that you need to watch out for.

Problem 2:

while (!input.eof())
{
    string newLine;
    getline (input, newLine); 
    for (int i = 0; i < newLine.length(); i++)

So what if getline hits EOF and reads nothing? The program processes the empty result as it would a valid line and only then tests for EOF. Again, a simple fix:

string newLine;
while (getline (input, newLine))
{
    for (int i = 0; i < newLine.length(); i++)
    // rest of loop code
}

As long as a line was read, keep going. If no line was read, regardless of why, the loop exits.

Problem 3:

If there are no alpha characters in the file, this loop will leave popular set to 'z':

for (int i = 0; i < 26; i++)
{
    if (alpharay[i]>=largest)
    {
        largest = alpharay[i]; 
        popular = 'a' + i; 
    }
}

A simple solution is to run the loop as it is, then test for largest == 0 and print a suitable "No letters found" message.
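As a rough sketch of that check, the loop can be wrapped in a function that reports whether any letter was seen at all. The name `mostFrequentLetter` and its signature are assumptions made for this example; the loop body is the one from the question, including the `>=` comparison.

```cpp
// Hypothetical helper (not from the original post).
// Returns false if every count is zero; otherwise stores the most
// frequent letter in 'popular' and returns true.
bool mostFrequentLetter(const int (&counts)[26], char& popular)
{
    int largest = 0;
    for (int i = 0; i < 26; i++)
    {
        if (counts[i] >= largest)   // same comparison as the original loop
        {
            largest = counts[i];
            popular = static_cast<char>('a' + i);
        }
    }
    // With no letters at all, 'popular' ends up as 'z' with a count of 0,
    // so the caller must not trust it in that case.
    return largest != 0;
}
```

The caller would then print the shift only when the function returns true, and a "No letters found" message otherwise.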

user4581301
  • thanks for the detailed debugging. Why is it that the Problem 3 loop returns z? I specifically put it inside the main isalpha loop to prevent miscounting, please explain. – shoestringfries May 09 '15 at 06:01
  • If there are no letters in the file, all `alpharay[i]` will be 0. Largest is initialized to 0. Every `alpharay[i]>=largest` will be `0>=0` which always succeeds. The last iteration of the loop will set `popular = 'a' + 25`. z. – user4581301 May 09 '15 at 06:27
0

In C++ we should prefer the standard library containers over C-style arrays. There are many containers available for all kinds of purposes.

For example for counting elements.

There is a more or less standard approach for counting something in a container, or in general.

We can use an associative container like a std::map or a std::unordered_map. And here we associate a "key", in this case the letter to count, with a value, in this case the count of the specific letter.

And luckily the maps have a very nice index operator[]. It looks for the given key and, if found, returns a reference to the value. If not found, it creates a new entry with the key (the value is initialized to 0 for an int) and returns a reference to the new value. So, in both cases, we get a reference to the value used for counting. And then we can simply write:

std::unordered_map<char,int> counter{};
counter[c]++;

And that looks really intuitive.
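To make the idiom concrete, here is a small sketch that counts the letters of a string this way. The function name `countLettersInText` is made up for this example; the only technique used is the `counter[c]++` pattern shown above.

```cpp
#include <cctype>
#include <string>
#include <unordered_map>

// Hypothetical helper (not from the original answer): counts letters
// case-insensitively using operator[] on an unordered_map.
std::unordered_map<char, int> countLettersInText(const std::string& text)
{
    std::unordered_map<char, int> counter{};
    for (char ch : text)
    {
        // <cctype> functions need a value representable as unsigned char
        const unsigned char uc = static_cast<unsigned char>(ch);
        if (std::isalpha(uc))
        {
            // operator[] creates the entry with value 0 on first use
            counter[static_cast<char>(std::tolower(uc))]++;
        }
    }
    return counter;
}
```

Letters that never occur simply have no entry in the map, so there is no fixed-size array to initialize.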

Additionally, getting the biggest counter value out of a map can be achieved simply by using a max-heap. A max-heap can be implemented in C++ with a std::priority_queue. You can use its range constructor to fill it with the values from the std::unordered_map, which is a typical one-liner. After that, you can immediately access the topmost value.

With that, we can get a very compact piece of code.

#include <iostream>
#include <fstream>
#include <utility>
#include <unordered_map>
#include <queue>
#include <vector>
#include <iterator>
#include <string>
#include <cctype>

// Some Alias names to ease up typing work and to make code more readable
using Counter = std::unordered_map<char, int>;
struct Comp { bool operator ()(const std::pair<char, int>& p1, const std::pair<char, int>& p2) const { return p1.second < p2.second; }};
using MaxHeap = std::priority_queue<std::pair<char, int>, std::vector<std::pair<char, int>>, Comp>;

int main() {

    // Get filename, open file and check, if it could be opened
    if (std::string fileName{}; std::getline(std::cin, fileName)) {
        if (std::ifstream fileStream{ fileName }; fileStream) {

            Counter counter{};

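            // Read all characters from the source file and count their occurrence
            for (char c{}; fileStream >> c;) {

                // Get lower case of letter (cast to unsigned char first: passing a negative char to std::tolower is undefined)
                const unsigned char letter = static_cast<unsigned char>(std::tolower(static_cast<unsigned char>(c)));

                // Count occurrence, if letter is an alpha value
                if (std::isalpha(letter)) counter[static_cast<char>(letter)]++;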
            }
            // Build a Max-Heap
            MaxHeap maxHeap(counter.begin(), counter.end());

            // Show result
            std::cout << "\nShift size: " << maxHeap.top().first-'e' << '\n';
        }
        else std::cerr << "\nError: Could not open file '" << fileName << "'\n";
    }
}

To be compiled with C++17

For accessing all elements in sorted order easily, you can also use a std::multiset instead of a std::priority_queue.
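A minimal sketch of the std::multiset variant, assuming the same `std::unordered_map<char, int>` counter as above. The names `ByCountDescending` and `sortedByCount` are invented for this example.

```cpp
#include <set>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical comparator: orders entries by count, highest first.
struct ByCountDescending {
    bool operator()(const std::pair<char, int>& a, const std::pair<char, int>& b) const {
        return a.second > b.second;
    }
};

// Hypothetical helper: returns all (letter, count) pairs sorted by count.
std::vector<std::pair<char, int>> sortedByCount(const std::unordered_map<char, int>& counter)
{
    // The range constructor inserts every entry; the multiset keeps them ordered
    std::multiset<std::pair<char, int>, ByCountDescending> sorted(counter.begin(), counter.end());
    return { sorted.begin(), sorted.end() };
}
```

A multiset rather than a set is needed here because two letters with the same count compare as equivalent under this comparator, and a std::set would drop one of them.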

And if you want to have only the n top-most elements, you can use std::partial_sort_copy in conjunction with a std::vector.
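A sketch of the std::partial_sort_copy approach, again assuming the counter is a `std::unordered_map<char, int>`. The function name `topN` is an assumption for illustration, and the generic lambda requires C++14 or later.

```cpp
#include <algorithm>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: returns the n entries with the highest counts,
// ordered from highest to lowest count.
std::vector<std::pair<char, int>> topN(const std::unordered_map<char, int>& counter, std::size_t n)
{
    // The destination size decides how many elements are kept
    std::vector<std::pair<char, int>> result(std::min(n, counter.size()));

    std::partial_sort_copy(counter.begin(), counter.end(),
                           result.begin(), result.end(),
                           [](const auto& a, const auto& b) { return a.second > b.second; });
    return result;
}
```

Only the destination range is sorted, so this avoids ordering the whole map when just a few top entries are needed.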

A M