
I have an MPI program in which multiple processes read from a file containing a list of file names; based on the names read, each process reads the corresponding files and counts the frequency of the words in them.

If one of the processes completes this and returns to block at MPI_Barrier(), the other processes hang as well. On debugging, it could be seen that readFile() is not entered by the processes still inside process_files(). I am unable to figure out why this happens. Please find the code below:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <ctype.h>
#include <string.h>
#include "hash.h"

void process_files(char*, int* , int, hashtable_t* );

void initialize_word(char *c,int size)
{
    int i;
    for(i=0;i<size;i++)
        c[i]=0;

    return;
}



char* readFilesList(MPI_File fh, char* file,int rank, int nprocs, char* block, const int overlap, int* length)
{
    char *text;
    int blockstart,blockend;

    MPI_Offset size;
    MPI_Offset blocksize;
    MPI_Offset begin;
    MPI_Offset end;
    MPI_Status status;

    MPI_File_open(MPI_COMM_WORLD,file,MPI_MODE_RDONLY,MPI_INFO_NULL,&fh);
    MPI_File_get_size(fh,&size);

    /*Block size calculation*/
    blocksize = size/nprocs;
    begin = rank*blocksize;
    end = begin+blocksize-1;

    end+=overlap;

    if(rank==nprocs-1)
        end = size;

    blocksize = end-begin+1;

    text = (char*)malloc((blocksize+1)*sizeof(char));
    MPI_File_read_at_all(fh,begin,text,blocksize,MPI_CHAR, &status);
    text[blocksize+1]=0;

    blockstart = 0;
    blockend = blocksize;

    if(rank!=0)
    {
        while(text[blockstart]!='\n' && blockstart!=blockend) blockstart++;
        blockstart++;
    }

    if(rank!=nprocs-1)
    {

        blockend-=overlap;
        while(text[blockend]!='\n'&& blockend!=blocksize) blockend++;
    }



    blocksize = blockend-blockstart;

    block = (char*)malloc((blocksize+1)*sizeof(char));
    block = memcpy(block, text + blockstart, blocksize);
    block[blocksize]=0;
    *length = strlen(block);

    MPI_File_close(&fh);
    return block;
}

void calculate_term_frequencies(char* file, char* text, hashtable_t *hashtable,int rank)
{
    printf("Start File %s, rank %d \n\n ",file,rank);
    fflush(stdout);
    if(strlen(text)!=0||strlen(file)!=0)
    {

        int i,j;
        char w[100];
        i=0,j=0;
        while(text[i]!=0)
        {
            if((text[i]>=65&&text[i]<=90)||(text[i]>=97&&text[i]<=122))
            {
                w[j]=text[i];
                j++; i++;
            }

            else
            {

                w[j] = 0;
                if(j!=0)
                {
                    //ht_set( hashtable, strcat(strcat(w,"#"),file),1);
                }
                j=0;
                i++;
                initialize_word(w,100);
            }

        }
    }
    return;
}

void readFile(char* filename, hashtable_t *hashtable,int rank)
{
    MPI_Status stat;
    MPI_Offset size;
    MPI_File fx;
    char* textFromFile=0;

    printf("Start File %d, rank %d \n\n ",strlen(filename),rank);
    fflush(stdout);

    if(strlen(filename)!=0)
    {
        MPI_File_open(MPI_COMM_WORLD,filename,MPI_MODE_RDONLY,MPI_INFO_NULL,&fx);
        MPI_File_get_size(fx,&size);

        printf("Start File %s, rank %d \n\n ",filename,rank);
        fflush(stdout);

        textFromFile = (char*)malloc((size+1)*sizeof(char));
        MPI_File_read_at_all(fx,0,textFromFile,size,MPI_CHAR, &stat);
        textFromFile[size]=0;
        calculate_term_frequencies(filename, textFromFile, hashtable,rank);

        MPI_File_close(&fx);

    }

    printf("Done File %s, rank %d \n\n ",filename,rank);
    fflush(stdout);
    return;   
}

void process_files(char* block, int* length, int rank,hashtable_t *hashtable)
{

    char s[2];
    s[0] = '\n';
    s[1] = 0;

    char *file;
    if(*length!=0)
    {
        /* get the first file */
        file = strtok(block, s);

        /* walk through other tokens */
        while( file != NULL ) 
        {
            readFile(file,hashtable,rank);
            file = strtok(NULL, s);
        }
    }
    return;
}

void execute_process(MPI_File fh, char* file, int rank, int nprocs, char* block, const int overlap, int * length, hashtable_t *hashtable)
{

    block = readFilesList(fh,file,rank,nprocs,block,overlap,length);
    process_files(block,length,rank,hashtable);
}


int main(int argc, char *argv[]){

    /*Initialization*/
    MPI_Init(&argc, &argv);
    MPI_File fh=0;
    int rank,nprocs,namelen;
    char *block=0;
    const int overlap = 70;
    char* file = "filepaths.txt";
    int *length = (int*)malloc(sizeof(int));

    hashtable_t *hashtable = ht_create( 65536 );

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    char processor_name[MPI_MAX_PROCESSOR_NAME];
    MPI_Get_processor_name(processor_name, &namelen);
    printf("Rank %d is on processor %s\n",rank,processor_name);
    fflush(stdout);

    execute_process(fh,file,rank,nprocs,block,overlap,length,hashtable);

    printf("Rank %d returned after processing\n",rank);
    MPI_Barrier(MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;

}

The filepaths.txt is a file that contains the absolute file names of normal text files, e.g.:

/home/mpiuser/mpi/MPI_Codes/code/test1.txt
/home/mpiuser/mpi/MPI_Codes/code/test2.txt
/home/mpiuser/mpi/MPI_Codes/code/test3.txt
  • This readFilesList looks rather complicated; are you sure it is producing correct block sizes? I don't think you will gain a lot from parallelizing this part of your code. Reading a single text file (which supposedly is relatively small in comparison to the actual data you want to read from these files) is easier to do on a single process and might even be faster. So I would read that list on a single process and broadcast or scatter the resulting list of files. – haraldkl Sep 06 '15 at 08:59
  • It appears to me that subsequently you have each process reading one of the files, not all of them reading parts of all files. If this is the case, you are not to use MPI-IO here! The MPI_File_read_at_all operation requires all processes to participate in the call for this file. – haraldkl Sep 06 '15 at 09:03
  • 1
    The readFilesList are producing non-overlapping chunks of the file lists. However, I would try the suggestion of single process to read this and use scatter to assign those to the processes. – Dhanashree Sep 06 '15 at 12:52

1 Answer


Your readFilesList function is pretty confusing, and I believe it doesn't do what you want it to, but maybe I just don't understand it correctly. I believe it is supposed to collect a bunch of file names out of the list file, a different set for each process. It does not do that, but that is not the problem here: even if it did what you want, the subsequent MPI-IO would not work.

When reading the files, you use MPI_File_read_at_all with MPI_COMM_WORLD as the communicator. This is a collective operation that requires all processes in the communicator to participate in reading this file. Now, if each process should read a different file, this obviously is not going to work.
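
If you do want to keep MPI-IO for such independent reads, each rank can open its file on MPI_COMM_SELF instead (one of the comments below makes the same point): that communicator contains only the calling process, so no other rank has to participate in the call. A minimal sketch of how the body of readFile could look, reusing the question's filename parameter and omitting error checking:

MPI_File fx;
MPI_Offset size;
MPI_Status stat;
char *text;

/* MPI_COMM_SELF contains only the calling rank, so nobody else
   needs to take part in opening or reading this file. */
MPI_File_open(MPI_COMM_SELF, filename, MPI_MODE_RDONLY, MPI_INFO_NULL, &fx);
MPI_File_get_size(fx, &size);

text = malloc(size + 1);
/* MPI_File_read_at is the non-collective variant of MPI_File_read_at_all. */
MPI_File_read_at(fx, 0, text, size, MPI_CHAR, &stat);
text[size] = 0;

MPI_File_close(&fx);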

So there are several issues with your implementation. Though I cannot really explain the behavior you describe, I would rather start off by fixing these before debugging in detail what might go wrong.

I am under the impression that you want an algorithm along these lines:

  • Read a list of file names

  • Distribute that list of files equally to all processes

  • Have each process work on its own set of files

  • Do something with the data from this processing

And I would suggest trying this with the following approach (sketched in code below):

  • Read the list on a single process (no MPI IO)

  • Scatter the list of files to all processes, such that all get around the same amount of work

  • Have each process work on its list of files independently and in serial (serial file access and processing)

  • Some data reduction with MPI, as needed

I believe this would be the best (easiest and fastest) strategy in your scenario. Note that no MPI-IO is involved here at all. I don't think a complicated distributed reading of the file list in the first step would result in any advantage, and in the actual processing it would actually be harmful. The more independent your processes are, the better they usually scale.
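
As a rough illustration of the first three steps, a sketch along these lines might work. The round-robin split and the surrounding variable names are my assumptions, not code from the question, and readFile is assumed to have been rewritten to use plain fopen/fread internally:

long listlen = 0;
char *list = NULL;

/* Step 1: only rank 0 reads the list, with plain stdio. */
if (rank == 0) {
    FILE *fp = fopen("filepaths.txt", "r");
    fseek(fp, 0, SEEK_END);
    listlen = ftell(fp);
    rewind(fp);
    list = malloc(listlen + 1);
    fread(list, 1, listlen, fp);
    list[listlen] = 0;
    fclose(fp);
}

/* Step 2: distribute the list; here simply broadcast it and let each
   rank pick every nprocs-th line (round robin). */
MPI_Bcast(&listlen, 1, MPI_LONG, 0, MPI_COMM_WORLD);
if (rank != 0)
    list = malloc(listlen + 1);
MPI_Bcast(list, listlen + 1, MPI_CHAR, 0, MPI_COMM_WORLD);

/* Step 3: each rank works through its share of files serially. */
int i = 0;
char *file;
for (file = strtok(list, "\n"); file != NULL; file = strtok(NULL, "\n"), i++)
    if (i % nprocs == rank)
        readFile(file, hashtable, rank);   /* serial file access inside */

free(list);

With a broadcast, every rank ends up holding the whole list, which is harmless for a small file of paths; MPI_Scatterv would avoid the duplication but requires fixed-size or explicitly counted records.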

haraldkl
  • I guess I had misunderstood the point of MPI-IO to some extent and used it for totally independent file processing. I changed the part of readFile() that uses MPI-IO to serial file access and it worked. Thank you for the detailed explanation. – Dhanashree Sep 06 '15 at 12:54
  • An MPI-IO approach can still work with MPI_COMM_SELF. You no longer get collective I/O optimizations, but you do get an abstraction layer between your program and the underlying file system -- perhaps you wish to port between Windows and Unix, for example. – Rob Latham Sep 09 '15 at 19:07