I have 28 input files and 28 available CPUs. I wrote a Python script that uses subprocess to parse the input files with QualityAnalysisMain_v2.py. Right now it works fine on one CPU. What I would like to do is run each input file on its own CPU in parallel: 28 runs at the same time.
I have tried the approach from here: python spreading subprocess.call on multiple CPU cores
especially this code:
import threading
import subprocess

def worker():
    """Thread worker function: run the external script in its own subprocess."""
    print('Worker')
    subprocess.call("python mycode.py", shell=True)

threads = []
for i in range(5):
    t = threading.Thread(target=worker)
    threads.append(t)
    t.start()
I adapted it to my script here:
for pliki in os.listdir(input_data):
    nazwa = pliki.split(".")[0]
    subprocess.call("mkdir " + output_data + nazwa, shell=True)

def SubprocessFiles():
    for pliki in os.listdir(input_data):
        print("python QualityAnalysisMain_v2.py " + input_data + pliki)
        subprocess.call("python QualityAnalysisMain_v2.py " + input_data + pliki, shell=True)

threads = []
for i in range(28):
    t = threading.Thread(target=SubprocessFiles)
    threads.append(t)
    t.start()
But it ended up parsing the first input 28 times: as far as I can tell, every one of the 28 threads runs the whole loop over the same file list, so they all start on the first file.
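If I understand threading.Thread correctly, each thread would need to be handed its own file, e.g. through the args parameter, instead of every thread running the whole loop. A rough, untested sketch of what I mean (worker is just a name I made up; input_data is the same variable as in my script):

import os
import subprocess
import threading

def worker(pliki):
    """Parse a single input file in its own subprocess."""
    subprocess.call("python QualityAnalysisMain_v2.py " + input_data + pliki, shell=True)

threads = []
for pliki in os.listdir(input_data):
    # one thread per file, each thread gets its own argument
    t = threading.Thread(target=worker, args=(pliki,))
    threads.append(t)
    t.start()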
Here is my current code. The script runs on every file in the input directory, but sequentially, on one CPU:
for pliki in os.listdir(input_data):
    nazwa = pliki.split(".")[0]
    print("python QualityAnalysisMain_v2.py " + input_data + pliki)
    subprocess.call("mkdir " + output_data + nazwa, shell=True)
    subprocess.call("python QualityAnalysisMain_v2.py " + input_data + pliki, shell=True)
Many thanks for any suggestions.
best, Agata