I am working through the GNU Parallel tutorial. In the "More than one argument" section, there is the following example (note: num30000 is a text file with numbers 1 to 30,000 on sequential lines):
For better parallelism GNU Parallel can distribute the arguments between all the parallel jobs when end of file is met.
Running 4 jobs in parallel will split the last line of arguments into 4 jobs resulting in a total of 5 jobs:
cat num30000 | parallel --jobs 4 -m echo | wc -l
Output:
5
My question is: why do we expect 5 total jobs? I am clearly missing a point, although I don't know if it's important. I expected 4 jobs since 30,000 is divisible by 4. I decided to post this question after running the following:
cat num30000 | parallel --jobs 4 -m echo | colrm 12
which results in:
1 2 3 4 5 6
23696 23697
25273 25274
26850 26851
28427 28428
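To sanity-check the split, here is a small sketch that derives the argument count per job from the first number on each output line. It assumes each line above is one job and that the arguments within a job are consecutive; the variable names (`starts`, `total`, `counts`) are only for illustration.

```shell
# First argument of each of the 5 output lines (copied from the colrm output)
starts="1 23696 25273 26850 28427"
total=30000   # num30000 holds the numbers 1..30000

set -- $starts
counts=""
while [ $# -gt 0 ]; do
  start=$1
  shift
  # A job ends just before the next job's first argument,
  # or at the last number in the file for the final job.
  if [ $# -gt 0 ]; then
    end=$(( $1 - 1 ))
  else
    end=$total
  fi
  counts="${counts:+$counts }$(( end - start + 1 ))"
done

echo "$counts"
```

On a machine with GNU Parallel installed, the same counts could presumably be read directly from the real output by replacing `colrm 12` with `awk '{print NF}'`, which prints the number of fields on each line.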
This looks to me like the first echo command is passed the first 23,695 arguments. Then, the remaining are split into 4 more jobs with argument counts of 1577, 1577, 1577, and 1574. Am I misunderstanding what the call to parallel is supposed to do? Thank you!