0

I'm new to C++ and to threads too. So I'm pretty confused.

I'm trying to write a class that wraps a generic queue and provides two methods: push and pop. (thread safe)

The push method will use a lock_guard and push the elem received as param into the queue. The pop method will wait until there's at least one elem to be read from the queue without consuming CPU (I used condition_variable and the "wait" method on it).

I think I got no problem with the implementation of the class itself. But I got a problem using this class with threads.

I use two threads("producer" and "consumer"):

-"producer" reads a .txt file (that contains one string per line) and calls the push method for each string.

-"consumer" instead calls the pop method and store the result into a string vector.

I don't know how to manage the termination of the consumer loop in which it calls the pop method of the object bf. The temporary solution I've adopted is about pushing into the queue the string "EOF" to warn the consumer thread that producer has finished his task but I don't like it.

I tought about defining a boolean flag (initialized to false) into the Buffer class (protected by a mutex) and a method to set it to true (we can call it "setFlagTrue").

The producer will call the setFlagTrue method at when has finished his loop. The consumer instead will check somehow the flag to stop the iteration.

I'm confused about its correctness because I want to be sure that these situations won't happen at 100%:

  1. The producer pushes every string into the queue and sets flag to true. The consumer is so fast that pops every single string from the queue before the flag has been swapped to true and so it starts waiting again for elems to be popped out without be notified anymore.
  2. The consumer reads the flag value true but he has not finished to pop strings yet. (I think I can solve this checking the flag and if there are still strings into the queue at the same time. Am I right?)
  3. Any other possible problem about relative speeds of "producer" and "consumer" threads.

There's the code I wrote:

#include <iostream>
#include <mutex>
#include <fstream>
#include <condition_variable>
#include <queue>
#include <string>
#include <thread>
#include <vector>

template <typename T>
class Buffer{

private:
    std::mutex m;
    std::condition_variable cv;
    std::queue<T> buffer;

public:
    void push(T elem){
        std::lock_guard lg(m);
        buffer.push(elem);
        cv.notify_one();
    }

    T pop(){
        T elemToReturn;
        std::unique_lock ul(m);
        cv.wait(ul, [this](){return !this->buffer.empty();});
        elemToReturn = buffer.front();
        buffer.pop();
        ul.unlock();
        return elemToReturn;
    }
};

int main() {
    Buffer<std::string> bf;
    std::string filename("../file.txt");
    std::vector<std::string> results;

    std::thread producer ([&bf, filename](){
        std::ifstream inputFile(filename);
        std::string str;

        while(getline(inputFile, str)){
            bf.push(str);
        }

            bf.push("EOF");
    });

    std::thread consumer ([&bf, &results](){
        std::string elem;
        for(elem=bf.pop(); elem!="EOF"; elem=bf.pop()){
            results.push_back(elem);
        }
    });

    producer.join();
    consumer.join();

    for(auto &elem: results){
       std::cout<<elem<<std::endl;
    }

    return 0;
}

If you can explain me how to avoid these problems and post your solution code I'll be eternally grateful to you. And give me a precise definition of thread safe please.

Thank you for your time.

Zanarkand
  • 3
  • 8
  • 1
    Just a suggestion: How do you know that you are at the end of a file? A similar interface can be implemented for the consumer side. Further, how do you signal that you are done with writing to a file? A similar interface can be implemented for the producer side. That said, writing a so-called signal value ("EOF" string) is a common approach. – Ulrich Eckhardt Jun 30 '20 at 20:07
  • If you need a way to shut down the Buffer, then the class should implement a way to shut down the buffer. That can mean a "signalEOF" function that sets a flag and signals the condition variable and a way for the `pop` function to indicate that the buffer has been shut down. – David Schwartz Jun 30 '20 at 20:11
  • May I recommend a book called "concurrency in action", it's great for learning thread safety. – rustyx Jun 30 '20 at 20:36
  • @DavidSchwartz So in practice I should define a method that set the flag and calls the cv.notify_one()? That's all? – Zanarkand Jun 30 '20 at 21:50
  • @rustyx Thank you. I'll read it for sure but I'm gonna try to sustain an exam in 10 days. I don't think I have enough time to read it all. – Zanarkand Jun 30 '20 at 21:51
  • @UlrichEckhardt it's a common approach but is it the most correct one? Like the approach explained by David beneath your comment: is it better or they are at the same level? – Zanarkand Jun 30 '20 at 21:52
  • @Zanarkand That's what I recommend. And if the queue is empty, before waiting, check if the flag is set. – David Schwartz Jun 30 '20 at 21:54

1 Answers1

0

I would suggest 2 possible simple solutions:

  1. Use std::optional<std::string> as T and push empty optional as a flag for eof. Using special string is error prone solution. This method requires smallest changes in your code.

  2. Add a boolean flag into Buffer and check it in pop after you make sure that buffer is empty. This requires more code change - your predicate needs to be extended and check for the flag as well. Also you need to decide what to return to caller to notify the end condition (or through exception). Of course you need to set the flag under mutex locked and call notify.

So I would say method 1 would be much simpler and easy to understand ie more readable and it would be not less efficient if you add move semantics to your code:

void push(T elem){
    std::lock_guard lg(m);
    if( buffer.empty() )
        cv.notify_one();
    buffer.push( std::move(elem) );
}

T pop(){
    std::unique_lock ul(m);
    cv.wait(ul, [this](){return !this->buffer.empty();});
    auto elemToReturn = std::move( buffer.front() );
    buffer.pop();
    return elemToReturn;
}
Slava
  • 43,454
  • 1
  • 47
  • 90
  • Thank you so much for your answer. – Zanarkand Jul 02 '20 at 18:13
  • I've used the first suggestion. Now there's a problem: I need to implement the same functionality but with N producer. I think the first method it's kinda difficult to use. So I prefer using a flag that will be increased every time a producer thread has finished his job. My question now is: should this int flag be wrapped into an std::atomic and why? Or is enough using a int variable protected with a mutex? – Zanarkand Jul 02 '20 at 18:16