I have a downloader application that processes download items in multiple threads. Some threads analyze input data, others download, extract, save state, and so on. Each type of thread operates on certain data members, and some of these threads may run simultaneously. A download item could be described like this:
#include <atomic>
#include <mutex>
#include <string>
#include <vector>

class File;

class Download
{
public:
    enum State
    {
        Parsing, Downloading, Extracting, Repairing, Finished
    };

    Download(const std::string &filePath): filePath(filePath) { }

    void save()
    {
        // TODO: save data consistently
        StateFile f; // state file for this download
        // save general download parameters
        f << filePath << state << bytesWritten << totalFiles << processedFiles;
        // Now we are to save the parameters of the files which belong to this download,
        // (!) but assume the downloading thread kicks in, downloads some data and
        // changes the state of a file. That causes "bytesWritten", "processedFiles"
        // and "state" to be different from what we have just saved.
        // When we finally save the state of the files their parameters don't match
        // the parameters of the download (state, bytesWritten, processedFiles).
        for (File *file : files) // renamed so it no longer shadows the StateFile "f"
        {
            // save the file...
        }
    }

private:
    std::string filePath;
    std::atomic<State> state{Parsing};   // brace-init: "= Parsing" is ill-formed before C++17
    std::atomic<int> bytesWritten{0};
    int totalFiles = 0;
    std::atomic<int> processedFiles{0};
    std::mutex fileMutex;
    std::vector<File*> files;
};
I wonder how to save this data consistently. For instance, the state and the number of processed files might already have been saved while the list of files is still to be written. Meanwhile, another thread may alter the state of a file, and consequently the number of processed files or the state of the download, making the saved data inconsistent.
The first idea that comes to mind is to add a single mutex for all data members and lock it every time any of them is accessed. But that would probably be inefficient, as most of the time the threads access different data members and saving takes place only once every few minutes.
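To make the single-mutex idea concrete, here is a minimal sketch of one variation: every update takes the same lock, and saving only holds that lock long enough to copy the members into a plain snapshot, so the slow file I/O could happen afterwards with the lock released. This is an illustration under assumptions, not a tested design. The names `FileInfo`, `Snapshot`, `addFile`, `onBytesWritten`, `onFileFinished` and `snapshot` are hypothetical, `File` is replaced by a simple struct, and the atomics are dropped because the mutex already guards every member.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the per-file state (the real File class is not shown).
struct FileInfo {
    std::string name;
    long long bytesWritten;
    bool done;
};

class Download {
public:
    enum State { Parsing, Downloading, Extracting, Repairing, Finished };

    // Plain copy of everything that has to be saved together.
    struct Snapshot {
        std::string filePath;
        State state;
        long long bytesWritten;
        std::size_t processedFiles;
        std::vector<FileInfo> files;
    };

    explicit Download(std::string path)
        : filePath(std::move(path)), state(Parsing),
          bytesWritten(0), processedFiles(0) {}

    void addFile(const std::string& name) {
        std::lock_guard<std::mutex> lock(mutex);
        FileInfo info;
        info.name = name;
        info.bytesWritten = 0;
        info.done = false;
        files.push_back(info);
    }

    // Worker threads change all related members under the same lock.
    void onBytesWritten(std::size_t index, long long n) {
        std::lock_guard<std::mutex> lock(mutex);
        assert(index < files.size());
        files[index].bytesWritten += n;
        bytesWritten += n;
        state = Downloading;
    }

    void onFileFinished(std::size_t index) {
        std::lock_guard<std::mutex> lock(mutex);
        assert(index < files.size());
        if (!files[index].done) {
            files[index].done = true;
            ++processedFiles;
        }
        if (processedFiles == files.size()) state = Finished;
    }

    // Held only for the duration of the copy; the caller writes the
    // returned snapshot to disk without holding the lock.
    Snapshot snapshot() const {
        std::lock_guard<std::mutex> lock(mutex);
        Snapshot s;
        s.filePath = filePath;
        s.state = state;
        s.bytesWritten = bytesWritten;
        s.processedFiles = processedFiles;
        s.files = files;
        return s;
    }

private:
    const std::string filePath;
    mutable std::mutex mutex;
    State state;
    long long bytesWritten;
    std::size_t processedFiles;
    std::vector<FileInfo> files;
};
```

With this shape, `save()` would call `snapshot()` and then serialize the returned copy, so the saved download totals and the saved per-file values come from the same instant. Whether the per-update locking is cheap enough depends on how often the workers call the update functions, which I have not measured.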
It seems to me that this task is rather common in multithreaded programming, so I hope experienced people can suggest a better way.