I have mmapped a huge file into a char buffer and made a C++ string out of it. I need to split this string on a delimiter, which is a space character, and store the values in a matrix. I can do it from one thread, but I need to optimize it, so I'm using multiple threads to read tokens from this stringstream (matrix_str) and store them in the matrix. Based on the thread ID, each thread can store its parsed data into the matrix without conflicts, but how do I synchronize the parsing itself, since any thread can get scheduled at any time and read the next token? Here is my code:
    void* parseMappedString(void* args)
    {
        char temp[BUFFSIZE];
        long int threadID = *((long int*)args);
        if (threadID < 0)
            threadID = 0;
        // Each thread fills its own block of 160 rows.
        for (int i = ((threadID) * 160); i < ((threadID+1) * 160); i++)
        {
            for (int j = 0; j < 4000; j++)
            {
                // The stream is shared, so reading the next token is locked.
                pthread_mutex_lock(&ParseMatrixMutex);
                if ((matrix_str.getline(temp, BUFFSIZE, ' ')))
                {
                    pthread_mutex_unlock(&ParseMatrixMutex);
                    matrix[i][j] = parseFloat((temp));
                }
                else
                {
                    pthread_mutex_unlock(&ParseMatrixMutex);
                }
            }
        }
        return NULL;
    }
    void create_threads_for_parsing(void)
    {
        long int i;
        for (i = 0; i < 5; i++)
            pthread_create(&Threads[i], NULL, parseMappedString, (void*)&i);
    }
As you can see in the code, there are five threads in total, and each thread processes 160 * 4000 elements. Each thread stores its values in a unique region of the matrix based on its thread ID, so the writes are synchronized. But getline can be called by any thread at any time, so thread 5 can end up parsing data that belongs to the first thread. How do I avoid this?
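To make the problem concrete: with a shared stream, the token a thread receives depends on scheduling order, not on its thread ID. One way this is sometimes avoided is to drop the shared stream and give each thread its own starting offset in the buffer. The following is only a minimal sketch under several assumptions: the text is null-terminated (as a std::string is), the matrix is stored as one flat row-major array, the row count divides evenly by the thread count, and std::thread is used in place of pthreads for brevity. The names findChunkStarts, parseChunk, and parseParallel are hypothetical and not part of the original code.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: one cheap sequential scan that records the byte
// offset of the first token belonging to each thread.
std::vector<std::size_t> findChunkStarts(const std::string& text,
                                         std::size_t tokensPerThread,
                                         std::size_t numThreads)
{
    std::vector<std::size_t> starts;
    std::size_t tokenIndex = 0;
    bool inToken = false;
    for (std::size_t pos = 0;
         pos < text.size() && starts.size() < numThreads; ++pos) {
        bool isSpace = std::isspace(static_cast<unsigned char>(text[pos])) != 0;
        if (!isSpace && !inToken) {          // a new token begins here
            if (tokenIndex % tokensPerThread == 0)
                starts.push_back(pos);
            ++tokenIndex;
        }
        inToken = !isSpace;
    }
    return starts;
}

// Parses up to `count` floats starting at `p` into `out`.
// No lock is needed: each thread reads and writes only its own range.
void parseChunk(const char* p, std::size_t count, float* out)
{
    for (std::size_t k = 0; k < count; ++k) {
        char* end = nullptr;
        float value = std::strtof(p, &end);  // skips leading whitespace
        if (end == p)
            break;                           // no more numbers
        out[k] = value;
        p = end;
    }
}

// Assumption: rows is a multiple of numThreads (e.g. 800 rows, 5 threads).
void parseParallel(const std::string& text, std::size_t rows,
                   std::size_t cols, std::size_t numThreads,
                   std::vector<float>& matrix)
{
    assert(numThreads > 0 && rows % numThreads == 0);
    matrix.assign(rows * cols, 0.0f);
    const std::size_t tokensPerThread = (rows / numThreads) * cols;
    const std::vector<std::size_t> starts =
        findChunkStarts(text, tokensPerThread, numThreads);

    std::vector<std::thread> threads;
    for (std::size_t t = 0; t < starts.size(); ++t) {
        threads.emplace_back(parseChunk, text.c_str() + starts[t],
                             tokensPerThread,
                             matrix.data() + t * tokensPerThread);
    }
    for (std::thread& th : threads)
        th.join();
}
```

With the sizes from the question, this would be called as parseParallel(text, 800, 4000, 5, matrix). The scan for chunk starts is still sequential, but it only looks for token boundaries, so the float conversion itself can run in parallel without a mutex.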
I had to add the following check because I receive thread IDs 1-4 in args but never 0. Instead of 0 I always get some junk negative value, so I had to hardcode it like this:
    if (threadID < 0) threadID = 0;
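A likely explanation for the junk value is that every thread receives the address of the same loop variable i, which keeps changing while the threads start and no longer exists once create_threads_for_parsing returns. A minimal sketch of one common alternative, giving each thread its own argument slot that stays alive until the thread is joined, could look like this. WorkerArg, worker, and runWorkers are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>
#include <vector>

// Hypothetical per-thread argument: each thread gets its own copy of the ID.
struct WorkerArg {
    long id;    // thread ID assigned by the creator
    long seen;  // ID the thread actually observed, written by the thread
};

void* worker(void* raw)
{
    WorkerArg* arg = static_cast<WorkerArg*>(raw);
    arg->seen = arg->id;  // stable: nobody else modifies this slot
    return nullptr;
}

// Starts numThreads workers and returns the ID each one observed.
std::vector<long> runWorkers(std::size_t numThreads)
{
    std::vector<pthread_t> threads(numThreads);
    // Sized once up front, so element addresses stay valid for all threads.
    std::vector<WorkerArg> args(numThreads);

    for (std::size_t i = 0; i < numThreads; ++i) {
        args[i].id = static_cast<long>(i);
        args[i].seen = -1;
        int rc = pthread_create(&threads[i], nullptr, worker, &args[i]);
        assert(rc == 0);
        (void)rc;
    }
    // Join before args goes out of scope, so no thread reads a dead object.
    for (std::size_t i = 0; i < numThreads; ++i)
        pthread_join(threads[i], nullptr);

    std::vector<long> seen;
    for (const WorkerArg& a : args)
        seen.push_back(a.seen);
    return seen;
}
```

Under these assumptions, runWorkers(5) yields the IDs 0 through 4 in order, including 0, so the threadID < 0 workaround would not be needed.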