I have a single task to complete X number of times in Python, and I will be using LSF to speed that up. Is it better to submit a job containing several Python scripts which can be run separately in parallel, or one Python script that utilizes the multiprocessing module?
My issue is that I don't trust LSF to split the Python code into several processes on its own (I'm not sure how LSF handles this). However, I also don't want several Python scripts floating around, as that seems inefficient and disorganized.
The task at hand involves parsing six very large ASCII files and saving the output in a Python dict for later use. I want to parse the six files in parallel (they take about 3 minutes each). Does LSF allow Python to tell it something like "Hey, here's one script, but you're going to split it into these six processes"? Does LSF need Python to tell it that or does it already know how to do that?
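To make the multiprocessing option concrete, here's roughly what I have in mind — a sketch, not working code: the file names and the line format inside `parse_file` are placeholders, since my real parsing logic is more involved.

```python
from multiprocessing import Pool

def parse_file(path):
    """Parse one ASCII file into a dict (placeholder logic: 'key value' per line)."""
    result = {}
    with open(path) as f:
        for line in f:
            key, _, value = line.partition(" ")
            result[key] = value.strip()
    return result

def parse_all(paths):
    """Parse files in parallel, one worker process per file, then merge the dicts."""
    with Pool(processes=len(paths)) as pool:
        # map() returns the per-file dicts in the same order as paths.
        per_file = pool.map(parse_file, paths)
    combined = {}
    for d in per_file:
        combined.update(d)
    return combined

if __name__ == "__main__":
    # Placeholder names -- in my case these would be the six large ASCII files.
    combined = parse_all([f"data_{i}.txt" for i in range(6)])
```

With this approach I'd request six slots for a single job and let Python fan out the work itself, rather than relying on LSF to discover the parallelism.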
Let me know if you need more info. I have trouble balancing between "just enough" and "too much" background.