I have a Perl program that takes over 13 hours to run. I think it could benefit from multithreading, but I have never done this before and I'm at a loss as to how to begin.
Here is my situation: I have a directory of hundreds of text files. I loop through every file in the directory using a basic for loop and do some processing (text processing on the file itself, calling an outside program on the file, and compressing it). When one file is complete I move on to the next, so the files are handled strictly one after the other. The files are completely independent of each other and the process returns no values (other than success/failure codes), so this seems like a good candidate for multithreading.
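To make the structure concrete, here is a rough sketch of the serial loop described above. It is not my actual code: the directory argument and the `process_file` sub are hypothetical placeholders, and the sub's body only stands in for the real text processing, external program call, and compression.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical placeholder: directory to scan (defaults to the current one).
my $dir = shift @ARGV // '.';

# Hypothetical stand-in for the per-file work (text processing, calling
# the outside program, compressing). Returns 1 on success, 0 on failure.
sub process_file {
    my ($path) = @_;
    return -r $path ? 1 : 0;
}

# Collect the plain files in the directory.
opendir(my $dh, $dir) or die "Cannot open $dir: $!";
my @files = sort grep { -f "$dir/$_" } readdir($dh);
closedir($dh);

# Serial loop: each file is handled only after the previous one finishes.
my ($ok, $failed) = (0, 0);
for my $file (@files) {
    if (process_file("$dir/$file")) { $ok++ } else { $failed++ }
}
print "processed=$ok failed=$failed\n";
```

Each iteration depends only on its own file, which is why I suspect the loop body could be handed to separate workers.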
My questions:
- How do I rewrite my basic loop to take advantage of threads? There appear to be several modules for threading out there.
- How do I control how many threads are currently running? If I have N cores available, how do I limit the number of threads to N or N - n?
- Do I need to manage the thread count manually or will Perl do that for me?
Any advice would be much appreciated.