0

I am trying to implement a concurrent program to count occurrences of bigrams of words and letters in text files. The core are the two functions that calculate the bigrams. In the main I start threads with one of the two functions. There is a main for that starts threads and pushes them in a vector.

std::vector<std::thread *> threads;
std::pair<std::string, int> current_job;


for (unsigned int i = 0; i < num_threads; i++) {
    current_job = filenames.front();// ex. <baskerville.txt, 1>
    filenames.pop();
    if (current_job.second == 0) {
        threads.push_back(new std::thread(sentence_bigrams_letters, std::cref(current_job.first), std::ref(v),
                                          std::ref(lock_chars)));
    } else if (current_job.second == 1) {
        threads.push_back(
                new std::thread(sentence_bigrams, std::cref(current_job.first), std::ref(m), std::ref(lock_words)));
    }

} 

for (unsigned int k = 0; k < threads.size(); k++) {
        threads.at(k)->join();
        delete (threads.at(k));
    }

There are two text files called baskerville.txt and dorian.txt so filenames has a total of 4 elements. The program runs but the output that prints the threads make no sense. Each thread prints the count of a particular bigram with the bigram itself, the name of file he is reading and his id. This is the output:

199 y,t .\Text\dorian.txt 4 /letters
398 y,t .\Text\dorian.txt 2 /letters
33 a, few .\Text\dorian.txt 3 /words
66 a, few .\Text\dorian.txt 5 /words

It doesn't make any sense to me so if someone has a clue about what is going on it would be really helpful.

Jacob
  • 2,212
  • 1
  • 12
  • 18
  • how big is the program? could you post all of it? – london-deveoper Jan 27 '17 at 11:29
  • the program is not so big and the other parts of the code work because the program works in the sequential version so i think the error is not there – Gianmarco Biscini Jan 27 '17 at 11:45
  • 1
    You tagged this C++11... why don't you use more C++11? Like `for(auto &thread: threads) { thread->join(); delete thread; }`. Also, you could make the vector hold the thread objects themselves instead of just the pointers, use `threads.emplace_back(stentence_bigrams_letters, ...);` – G. Sliepen Jan 27 '17 at 12:02
  • 1
    @G.Sliepen: And using the `vector` of actual thread objects, rather than pointers, also means you don't need to worry about manual memory management; you need to `join`, but you can't forget (or accidentally bypass) memory release. Using `std::move` for stuff you'll throw away anyway would also save some cycles/memory churn. – ShadowRanger Jan 27 '17 at 18:01
  • @ShadowRanger So I don't need to delete the threads afterwards if I use an array of actual threads? – Gianmarco Biscini Jan 28 '17 at 16:55
  • @GianmarcoBiscini: Well, I'd recommend a `vector`, but yeah, when you didn't call `new`, you don't call `delete` (barring weird APIs that do the `new` for you, but those aren't part of the standard lib). – ShadowRanger Jan 28 '17 at 21:58

1 Answers1

0

I solved the issue. The problem was that i am passing the string current_job.first as a reference, and the reference was being changed in the for loop iterations. So instead of:

threads.push_back(new std::thread(sentence_bigrams_letters, std::cref(current_job.first), std::ref(v),
                                      std::ref(lock_chars)));

it is:

threads.push_back(new std::thread(sentence_bigrams_letters, current_job.first, std::ref(v),
                                          std::ref(lock_chars)));

and now it works