I have a program which is designed to be highly parallelizable. I suspect that some processors are finishing this Python script sooner than others, which would explain behavior I observe upstream of this code. Is it possible that this code allows some MPI processes to finish sooner than others?

from mpi4py import MPI
import numpy as np
import pandas as pd

dacout = 'output_file.out'
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
nam = 'lcoe.coe'
csize = 10000
with open(dacout) as f:
    for i,l in enumerate(f):
        pass
numlines = i
dakchunks = pd.read_csv(dacout,  skiprows=0, chunksize = csize, sep='there_are_no_seperators')
linespassed = 0
vals = {}
for dchunk in dakchunks:
    for line in dchunk.values:
        linespassed += 1
        # Skip the leading and trailing lines of the file.
        if linespassed < 49 or linespassed > numlines - 50: continue
        split_line = ''.join(str(s) for s in line).split()
        # Keep only "<value> <name>" lines for the quantity of interest.
        if len(split_line) != 2: continue
        if split_line[0] == 'nan' or split_line[0] == '-nan': continue
        if split_line[1] != nam: continue
        # float() raises ValueError (not NameError) on unparsable text.
        try: value = float(split_line[0])
        except ValueError: continue
        vals.setdefault(split_line[1], []).append(value)
# Calculate mean and x s.t. Percentile_x(coe_dat)<threshold_coe
self.coe_vals = sorted(vals[nam])
self.mean_coe = np.mean(self.coe_vals)
self.p90 = np.percentile(self.coe_vals, 90)
self.p95 = np.percentile(self.coe_vals, 95)

count_vals = 0.00
for i in self.coe_vals:
    count_vals += 1
    if i > coe_threshold: break
self.perc = 100 * (count_vals/len(self.coe_vals))
if rank==0:
    print>>logf, self.rp, self.rd, self.hh, self.mean_coe
    print self.rp, self.rd, self.hh, self.mean_coe, self.p90, self.perc
kilojoules
  • What do you mean by behavior you observe upstream? A bug? – Chiel Apr 07 '16 at 20:14
  • Please describe more clearly the expected, desired and observed behavior and provide an [mcve]. – Zulan Apr 07 '16 at 21:26
  • In the code you posted, all processes read the same file and compute the same thing, but only process 0 prints the result. This is not parallel computing; it is doing the same thing multiple times! Some processes can finish this script before others because the script does not end with a barrier. Use `comm.barrier()` to synchronize all processes of the communicator `comm`. Do this only if it is necessary: barriers can hurt performance... – francis Apr 08 '16 at 17:17
  • @francis Could you post your comment as an answer please? You answered my question. – kilojoules Apr 08 '16 at 17:31

1 Answer

In the code you posted, all processes read the same file and compute the same thing, but only process 0 prints the result. This is not parallel computing; it is doing the same thing multiple times!
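To make the point concrete, here is a minimal sketch of one way the parsing could be divided among ranks instead of repeated on each one. It is not the original program: the helper names `parse_share`, `collect_on_root` and `main` are hypothetical, the round-robin split is just one possible choice, and the skipping of leading and trailing lines from the question is left out for brevity. It assumes mpi4py's lowercase `comm.gather`, which collects arbitrary Python objects on the root rank.

```python
def parse_share(lines, rank, size, name):
    """Parse only this rank's share of the lines (round-robin split)."""
    values = []
    for line in lines[rank::size]:
        fields = line.split()
        # Keep only "<value> <name>" lines for the requested name.
        if len(fields) != 2 or fields[1] != name:
            continue
        try:
            value = float(fields[0])
        except ValueError:
            continue
        if value == value:  # NaN is the only float not equal to itself
            values.append(value)
    return values


def collect_on_root(comm, local_values):
    """Gather the per-rank lists on rank 0; every other rank gets None."""
    gathered = comm.gather(local_values, root=0)
    if comm.Get_rank() != 0:
        return None
    return sorted(v for part in gathered for v in part)


def main():
    # Imported here so the helpers above can be exercised without MPI.
    from mpi4py import MPI
    comm = MPI.COMM_WORLD
    with open('output_file.out') as f:
        lines = f.readlines()
    local = parse_share(lines, comm.Get_rank(), comm.Get_size(), 'lcoe.coe')
    merged = collect_on_root(comm, local)
    if merged is not None:
        print(len(merged))
```

Call `main()` at the end of the script and launch it with something like `mpiexec -n 4 python script.py`. Each rank still reads the file in this sketch, but only parses its own share of the lines; whether that is worth it depends on how expensive the parsing is compared with the communication.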

Some processes can finish this script before others because the script does not end with a barrier. Use `comm.barrier()` to synchronize all processes of the communicator `comm`. Do this only if it is necessary: barriers can hurt performance, since every process waits for the slowest one.
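As a rough illustration of the barrier, the sketch below runs some per-rank work and then waits for all ranks before continuing. The names `run_then_sync` and `main`, and the uneven `time.sleep` workload, are made up for the example; only `comm.barrier()`, `MPI.COMM_WORLD` and `Get_rank()` come from mpi4py.

```python
import time


def run_then_sync(comm, work):
    """Run work() on this rank, then wait until every rank has finished."""
    result = work()
    comm.barrier()  # returns only once all ranks in comm have reached it
    return result


def main():
    # Imported here so run_then_sync can be exercised without MPI.
    from mpi4py import MPI
    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()
    # Hypothetical uneven workload: higher ranks take longer.
    run_then_sync(comm, lambda: time.sleep(0.1 * rank))
    if rank == 0:
        print("all ranks have finished")
```

Without the barrier, rank 0 would print as soon as its own work is done; with it, the message appears only after the slowest rank has caught up.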

francis