I'm studying this problem and there is one part I don't understand. The assignment is not in English, so translating it would be tedious, but the basic task is to have a thread read a given text file and search it for a given word. Each file gets its own thread. The last two parts require that all occurrences in the same file are printed together, like:

file1: line 1
file1: line 2
file2: line 1

and so on. I solved this with a global 2D array and a structure that passes each thread its "id" and the name of the text file it has to search. I used pthread_join, which is pretty intuitive. The trouble is the next part: solve the same problem without pthread_join, and with no busy waiting if possible. If I don't use pthread_join, nothing I print in the thread function shows up, which I wasn't expecting. Is there a reason why this happens?

This is the code I used to solve the problem with pthread_join. Without pthread_join, using a mutex and printing the output in the thread function, I got no output.

#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
#include<string.h>
#include<errno.h>
#include<fcntl.h>
#include <pthread.h>
#define k 4     // maximum number of files (one thread per file)
#define l 100   // maximum number of matches stored per file

int match_line(int fd, char *str);
void *ocorre(void *);

char string[100];
int b[k][l];
int max;

struct args{
    char str[256];
    int id;
};
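int main(int argc, char *argv[]){

    int i=0;
    int j=0;

    if(argc<3 || argc-2>k){   // need the word plus 1..k file names
        fprintf(stderr,"usage: %s word file1 [file2 ...]\n",argv[0]);
        return 1;
    }
    max=argc-1;               // word + files, so there are max-1 files

    struct args arg[max];
    pthread_t a[max];

    strncpy(string,argv[1],sizeof string-1); // global; stays NUL-terminated because it starts zeroed

    for(i=0;i<max-1;i++){   // create the threads
        arg[i].id=i;
        strncpy(arg[i].str,argv[i+2],sizeof arg[i].str-1);
        arg[i].str[sizeof arg[i].str-1]='\0';
        pthread_create(&a[i],NULL,ocorre,&arg[i]);
    }

    for(i=0;i<max-1;i++){ // join, then print this file's matches together
        pthread_join(a[i],NULL);
        for(j=0;j<l && b[i][j]!=0;j++){
            printf("%s : %d\n",arg[i].str,b[i][j]);
        }
    }
    return 0;
}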
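void *ocorre(void *arg) {

    int fd;
    int j=0;
    struct args func;
    func=*(struct args*)arg;

    fd=open(func.str,O_RDONLY);
    if(fd==-1){              // could not open the file: report it, leave b[id] all zeros
        perror(func.str);
        return NULL;
    }
    while(j<l-1){            // keep the last slot 0 so the print loop always finds a terminator
        b[func.id][j]=match_line(fd,string);
        if(b[func.id][j]==0)
            break;
        j++;
    }
    close(fd);

    return NULL;
}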

This is what I did to try to solve it without pthread_join. To get the first output, I have to add sleep(1) after creating the thread.

  • We can't find the bug in code we can't see. But you can use some other synchronization mechanism (such as a mutex, condition variable, and predicate variable) to know when the work is done. – David Schwartz Apr 13 '17 at 17:57
  • I think I understand what's happening; it's not a bug. If the thread is not joinable, will I lose the information that was altered in the thread function? – Francisco Fonseca Apr 13 '17 at 18:01
  • No, that's not how things work at all. You just need to make sure the *work* is done before you attempt to use its results. (And using the termination of threads to indicate when work is done is a poor practice.) – David Schwartz Apr 13 '17 at 18:02
  • It is suggested to use pthread_join() – Francisco Fonseca Apr 13 '17 at 18:06
  • It is poor practice to use the termination of a thread as a way to know that work has been done. There is no reason code that wants to wait until work has completed should care what the thread that does the work goes on to do after it has finished with the work. If you want to wait until work is done, wait for the work. Maybe it was suggested to make sure you learn how `pthread_join` works. – David Schwartz Apr 13 '17 at 18:07
  • I added the code I used for the part where I could use pthread_join. If I take the join out, use a mutex in the thread function and try to print there, I get no output at all. – Francisco Fonseca Apr 13 '17 at 18:16
  • My first comment could be expanded into an answer. I don't have time to do that just now, but I might in a bit if nobody else does. Use some proper form of synchronization such as a semaphore or barrier. – David Schwartz Apr 13 '17 at 18:19
  • Somehow or another, each thread is going to have to save the results so that some thread (possibly the per-file thread, possibly the main thread) can write all the information for one file in one operation. You can use [`flockfile()`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/flockfile.html) to ensure synchronization on the output stream. A thread locks the output stream, writes, and unlocks. You could use mutexes and condition variables per file/thread without using `pthread_join()` — but don't forget to start the threads as detached threads if you're not going to join them. – Jonathan Leffler Apr 13 '17 at 18:43
  • The way I learned was with condition variables or mutexes. But the problem I'm having is that with detached threads I get no output if I try to printf something in the thread function. I don't think the function even runs: the b array is not altered at all, it is all 0s when I print it in main. – Francisco Fonseca Apr 13 '17 at 18:48
  • Okay, I was detaching the thread right after I created it. If I delay that a little, I get output. Could that be the problem? I had to call sleep between creating and detaching to get output. I don't understand why, though. – Francisco Fonseca Apr 13 '17 at 19:03
  • Use the second parameter of `pthread_create` to create the thread as detached, instead of creating it as joinable (the default) and then separately detaching. – bta Apr 13 '17 at 19:13
  • That way it won't print anything. The code is just the same, only using detached threads, no pthread_join, and printing in the thread function instead of in main, using mutexes. – Francisco Fonseca Apr 13 '17 at 19:29
  • I added a picture showing the code and outputs. – Francisco Fonseca Apr 13 '17 at 19:34
  • With pthread_exit in main I get output without using sleep. Why does this happen? – Francisco Fonseca Apr 13 '17 at 19:44

1 Answer

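Returning from main terminates the process because it's roughly equivalent to calling exit, and that ends every thread whether or not it has finished its work. That is why your detached threads print nothing unless main is delayed. You have two choices: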

  1. Make sure main doesn't return or otherwise end until all work is done, or

  2. Call pthread_exit in main rather than returning.

Ideally, you would create all your threads as joinable and some shutdown code, that runs only after all useful work is done, arranges for the threads to terminate and joins them.

David Schwartz