I have to read "testdata.txt" word by word and look up each word in another file, "dictionary.txt". I have already implemented the ReadDictionary() function, which reads "dictionary.txt". Now I have to implement the ReadTxtFile() public member function, which reads "testdata.txt" into the private members KnownWords and UnknownWords. Only words that appear in the dictionary ("known" words) should go into KnownWords; the rest go into UnknownWords. I have to use map and pair, but I have no idea how to use them in my program. Can someone please help me figure this out so that I get this output:

89 known words read.
49 unknown words read.

int main():

WordStats ws;
ws.ReadTxtFile();

HeaderFile:

#include <map>
#include <set>
#include <string>
#include <vector>

using namespace std;
typedef map<string, vector<int> > WordMap;     
typedef WordMap::iterator WordMapIter;        

class WordStats
{
public:
    WordStats();
    void ReadDictionary();
    void DisplayDictionary();
    void ReadTxtFile();
private:
    WordMap KnownWords;
    WordMap UnknownWords;
    set<string> Dictionary;
    char Filename[256];
};

This is my program:

WordStats::WordStats(){
    strcpy(Filename,"testdata.txt");
}

// Reads dictionary.txt into Dictionary
void WordStats::ReadDictionary(){
    string word;    
    ifstream infile("dictionary.txt");
    if(!infile)
    {
        cerr << "Error Opening file 'dictionary.txt'. " <<endl;
        exit(1);
    }
    while(getline(infile,word))
    {       
        transform (word.begin(), word.end(), word.begin(), ::tolower);
        Dictionary.insert(word); 
    }
    infile.close();
    cout << endl;
    cout << Dictionary.size() << " words read from dictionary. \n" <<endl;

}
// Reads textfile into KnownWords and UnknownWords
void WordStats::ReadTxtFile(){
    string words;
    vector<string> findword;
    vector<int> count;
    ifstream ifile(Filename);
    if(!ifile)
    {
        cerr << "Error Opening file '" << Filename << "'. " <<endl;
        exit(1);
    }
    while(!ifile.eof())
    {
        getline(ifile,words);
        //KnownWords.insert( pair<string,int>( KnownWords, words ) );
        findword.push_back(words);
        Paragraph = KnownWords.find(words);
        //stuck here
    }
}
muzzi
  • What `std::vector` values are you associating with your known and unknown words? – Caleth May 08 '18 at 09:44
  • As per the requirements, we have to use a vector and push_back the position where we find the word in the dictionary. – muzzi May 08 '18 at 11:10

2 Answers

First of all, WordMap is the wrong data type. In my humble opinion it should be just map<string, int>, because you want to count how many times each word occurs in your text.

Secondly, you should read individual words from the file instead of whole lines of text. You can do that with the following code:

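std::string word;
while (ifile >> word) {
    // Dictionary entries were lowercased, so lowercase the word before the lookup
    transform(word.begin(), word.end(), word.begin(), ::tolower);
    if (Dictionary.find(word) != Dictionary.end()) {
        // WordMap::value_type(word, 0) creates a std::pair object; this only
        // compiles if WordMap is map<string, int>, as suggested above.
        // insert() leaves an existing entry untouched and returns an iterator to it.
        auto it = KnownWords.insert(KnownWords.end(), WordMap::value_type(word, 0));
        it->second++;
    } else {
        auto it = UnknownWords.insert(UnknownWords.end(), WordMap::value_type(word, 0));
        it->second++;
    }
}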
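
Since the question asks how map and pair fit together, here is a minimal sketch of the counting idea in isolation. It assumes the map<string, int> type suggested above rather than the WordMap typedef from the header, and CountWords is a hypothetical helper name that is not part of the assignment. It spells out the iterator type instead of using auto, so it should also compile without C++11 support.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: counts how often each whitespace-separated word occurs.
std::map<std::string, int> CountWords(std::istream& in)
{
    std::map<std::string, int> counts;
    std::string word;
    while (in >> word) {
        // insert() takes a std::pair of (key, value) and returns a
        // std::pair of (iterator, bool); the bool is false if the key existed.
        std::pair<std::map<std::string, int>::iterator, bool> result =
            counts.insert(std::make_pair(word, 0));
        // result.first points at the new or existing element; bump its count.
        ++result.first->second;
    }
    return counts;
}
```

Writing counts[word]++ inside the loop would have the same effect in one line, because operator[] creates a missing entry with the value 0 before incrementing it.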
zdenek
  • What are you using "it" for? – muzzi May 08 '18 at 11:17
  • It gives me the warning " 'it' does not name a type". I tried using 'Paragraph' as an iterator and it gave me the same error message. – muzzi May 08 '18 at 11:24
  • `it` is an iterator. And if your compiler doesn't support the `auto` keyword, you should upgrade it or enable C++11 support. – zdenek May 08 '18 at 11:56
  • I tried, and now the compiler gives me this error: no matching function for call to 'std::pair<const std::string, std::vector<int> >::pair(std::string&, int)' – muzzi May 08 '18 at 18:39

You need to check whether Dictionary contains each word you read, and then choose which of KnownWords and UnknownWords to modify.

void WordStats::ReadTxtFile(){
    std::ifstream ifile(Filename);
    if(!ifile)
    {
        std::cerr << "Error Opening file " << Filename << std::endl;
        exit(1);
    }

I have cleaned up your local declarations, so variables are alive for as little time as necessary.

Assuming that the file contains words separated by spaces and newlines, read each word

    for (std::string word; ifile >> word; )
    {

Make it lowercase

        transform (word.begin(), word.end(), word.begin(), ::tolower);

Then look to see if it is in Dictionary

        if (Dictionary.count(word))
        {

Record the position in KnownWords[word].

            KnownWords[word].push_back(ifile.tellg());
        }
        else
        {

Or in UnknownWords[word].

            UnknownWords[word].push_back(ifile.tellg()); 
        }
    }

Then display the sizes of the two maps to get the desired output.

    std::cout << KnownWords.size() << " known words read." << std::endl;
    std::cout << UnknownWords.size() << " unknown words read." << std::endl;
}
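
The loop above treats every whitespace-separated token as a word, so "Hello," and "hello" would be looked up as different strings. If the test data contains punctuation (an assumption, since the file is not shown in the question), a small helper could normalize each token before the lookup. NormalizeWord is a hypothetical name. The cast to unsigned char avoids undefined behaviour in the <cctype> functions when char is signed and the text contains non-ASCII bytes.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: lowercases a token and drops leading/trailing punctuation.
std::string NormalizeWord(const std::string& token)
{
    std::string::size_type first = 0;
    std::string::size_type last = token.size();
    // Skip punctuation at the front and at the back, keeping inner
    // characters such as the apostrophe in "don't".
    while (first < last && std::ispunct(static_cast<unsigned char>(token[first])))
        ++first;
    while (last > first && std::ispunct(static_cast<unsigned char>(token[last - 1])))
        --last;

    std::string word = token.substr(first, last - first);
    for (std::string::size_type i = 0; i < word.size(); ++i)
        word[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(word[i])));
    return word;
}
```

A token made up only of punctuation comes back as an empty string, so the caller would probably want to skip empty results rather than record them as unknown words.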

You could replace the conditional statement, which duplicates the action in both branches, with a conditional expression. Note the reference type in the declaration of Words:

WordMap & Words = (Dictionary.count(word) ? KnownWords : UnknownWords);
Words[word].push_back(ifile.tellg()); 
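
The & matters here: without it, Words would be a copy of the selected map, and the push_back would be lost when the copy goes out of scope. A minimal sketch of the idiom on its own, using a hypothetical free function Record and a position supplied by the caller instead of tellg():

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

typedef std::map<std::string, std::vector<int> > WordMap;

// Hypothetical helper: files `word` under `known` or `unknown`, depending on
// whether the dictionary contains it, and records `position` for it.
void Record(const std::set<std::string>& dictionary,
            WordMap& known, WordMap& unknown,
            const std::string& word, int position)
{
    // Both operands are lvalues of type WordMap, so the conditional
    // expression is itself an lvalue and can bind to a non-const reference.
    WordMap& words = dictionary.count(word) ? known : unknown;
    words[word].push_back(position);  // modifies `known` or `unknown` directly
}
```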

As a complete function:

void WordStats::ReadTxtFile(){
    std::ifstream ifile(Filename);
    if(!ifile)
    {
        std::cerr << "Error Opening file " << Filename << std::endl;
        exit(1);
    }

    for (std::string word; ifile >> word; )
    {
        transform (word.begin(), word.end(), word.begin(), ::tolower);
        WordMap & Words = (Dictionary.count(word) ? KnownWords : UnknownWords);
        Words[word].push_back(ifile.tellg()); 
    }

    std::cout << KnownWords.size() << " known words read." << std::endl;
    std::cout << UnknownWords.size() << " unknown words read." << std::endl;
}
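
One caveat about tellg(): it reports the stream offset after the word has been read, and it can return -1 once the stream has reached end-of-file (for example when the last word is not followed by a newline). If "position" in the assignment means the word's ordinal position in the text (an assumption, since the requirement is not quoted in full), a counter is simpler. The sketch below uses two hypothetical free functions, Classify and PrintPositions, that work on any std::istream and std::ostream, so they can be tried with string streams instead of files.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <ostream>
#include <set>
#include <sstream>
#include <string>
#include <vector>

typedef std::map<std::string, std::vector<int> > WordMap;

// Hypothetical helper: reads words from `in` and records each word's ordinal
// position (0 for the first word read) in `known` or `unknown`.
// Lowercasing is left out here to keep the sketch short.
void Classify(std::istream& in, const std::set<std::string>& dictionary,
              WordMap& known, WordMap& unknown)
{
    int index = 0;
    for (std::string word; in >> word; ++index)
    {
        WordMap& words = dictionary.count(word) ? known : unknown;
        words[word].push_back(index);
    }
}

// Hypothetical helper: writes one line per word, e.g. "the: 0 3".
void PrintPositions(const WordMap& words, std::ostream& out)
{
    for (WordMap::const_iterator it = words.begin(); it != words.end(); ++it)
    {
        out << it->first << ":";                         // the word (map key)
        const std::vector<int>& positions = it->second;  // its recorded positions
        for (std::size_t i = 0; i < positions.size(); ++i)
            out << ' ' << positions[i];
        out << '\n';
    }
}
```

Inside the class, the same printing loop could run over KnownWords and UnknownWords with std::cout as the output stream. A map iterates its keys in sorted order, so the words come out alphabetically rather than in the order they were read.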
Caleth
  • As an aside, I have a *particular hatred* of functions like `void MyClass::doAction()`, because they don't use the type system to tell you anything about what is going on. Ask your instructor why they chose those as the members of `WordStats`, over something like `WordStats ReadFromStream(std::istream & source, std::set<std::string> Dictionary)` – Caleth May 08 '18 at 11:25
  • I tried the code, but it's not giving me the exact output. What should I write inside count() in cout << KnownWords.count() and cout << UnknownWords.count()? The compiler is giving me this error: `[Error] no matching function for call to 'std::map<std::string, std::vector<int> >::count()'`. – muzzi May 08 '18 at 18:45
  • The output is zero for both if I put "word" in it, like KnownWords.count(word) or UnknownWords.count(word). – muzzi May 08 '18 at 19:01
  • oops, I meant 'size' – Caleth May 08 '18 at 21:33
  • And how do I print out the positions stored in KnownWords and UnknownWords? – muzzi May 09 '18 at 01:35