
I'm having two problems with GNU parallel. First, the more interesting one:

I have a file in which each line contains two arguments separated by a space. These arguments should be passed to the command together, in a way that lets the command recognize them as separate arguments.

i.e.

/path/to/A1 /path/to/A2  
/path/to/B1 /path/to/B2  
/path/to/C1 /path/to/C2

Additionally, I have a second variable that is an array. I would like parallel to combine every argument pair from the file above with every array value.

I'm almost there, my code is shown below.

parallel  -a $tmpdir/inputfiles.txt $instaldir/ribotagger.pl  \
                    -in {1}         \
                    -region {2}     \
                    -out $exitdir/$folder/ribotag.{2} \
                    ::: ${regions[@]}

In this instance however, parallel interprets {1} not as

/path/to/A1 /path/to/A2

but as

/path/to/A1\ /path/to/A2

Consequently the ribotagger script interprets it as one long argument, causing an immediate halt.

Second problem, I'd like to have the folder parameter differ for every instance of the script that parallel creates. I thought of something like

-out $exitdir/$(echo {1} | cut -d "/" -f 4)/ribotag.{2}

However, it appears that {1} is not recognized within $(stuff). The script requires an output parameter to run.

Laura

1 Answer


I think you need this:

parallel --colsep ' ' -a inputfiles.txt echo 1={1} 2={2} 3={3} ::: france germany | cat -vet
1=/path/to/C1 2=/path/to/C2 3=france$
1=/path/to/C1 2=/path/to/C2 3=germany$
1=/path/to/B1 2=/path/to/B2 3=germany$
1=/path/to/B1 2=/path/to/B2 3=france$
1=/path/to/A1 2=/path/to/A2 3=germany$
1=/path/to/A1 2=/path/to/A2 3=france$
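
Applied to the command in the question, the region moves from {2} to {3}, because the two file columns now take positions 1 and 2. A sketch that uses --dry-run to print the generated command lines without running anything; passing both paths after -in is an assumption about how ribotagger.pl takes its inputs, and out/ and the region names v4 and v6 are placeholders, not values from the question:

```shell
# Demo input: one pair of paths per line, separated by a single space.
tmp=$(mktemp -d)
printf '%s\n' '/path/to/A1 /path/to/A2' '/path/to/B1 /path/to/B2' > "$tmp/inputfiles.txt"

# --dry-run prints each command parallel would run instead of executing it.
# -k keeps the output in input order. The -out path is kept simple here, so
# every input pair shares it; that is the separate "second problem".
cmds=$(parallel --dry-run -k --colsep ' ' -a "$tmp/inputfiles.txt" \
    ribotagger.pl -in {1} {2} -region {3} -out out/ribotag.{3} ::: v4 v6)

printf '%s\n' "$cmds"
rm -rf "$tmp"
```

Once the printed lines look right, dropping --dry-run runs the real jobs.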

For the output file, you may be able to use {#} (which is the job number) to formulate something you like.
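
On the $(...) attempt in the question: when the command line is unquoted, the calling shell runs the command substitution once, before parallel starts, so cut only ever sees the literal text {1}. A minimal sketch of two ways around that, assuming GNU parallel is on the PATH; the out/ prefix and the region names v4 and v6 are placeholders, not values from the question:

```shell
# Demo input: one pair of paths per line, separated by a single space.
tmp=$(mktemp -d)
printf '%s\n' '/path/to/A1 /path/to/A2' '/path/to/B1 /path/to/B2' > "$tmp/inputfiles.txt"

# What the calling shell does with the unquoted substitution: cut receives
# the literal text "{1}", finds no "/" in it, and passes it through unchanged.
early=$(echo {1} | cut -d "/" -f 4)
echo "expanded before parallel runs: $early"

# Option 1: single-quote the command so the substitution runs once per job,
# inside the shell that parallel starts for that job.
quoted=$(parallel -k --colsep ' ' -a "$tmp/inputfiles.txt" \
    'echo out/$(echo {1} | cut -d / -f 4)/ribotag.{3}' ::: v4 v6)

# Option 2: let parallel do it with {1/}, the basename of column 1.
basename_rs=$(parallel -k --colsep ' ' -a "$tmp/inputfiles.txt" \
    echo out/{1/}/ribotag.{3} ::: v4 v6)

printf '%s\n' "$quoted" "$basename_rs"
rm -rf "$tmp"
```

For paths shaped like /path/to/A1, both options yield a per-input folder such as out/A1; the replacement string avoids the extra quoting and the extra processes.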

Ole Tange
Mark Setchell
  • Wow, you are totally right. I could have sworn that --colsep treats every column as a separate argument and so generates all possible combinations of arguments 1 to 3. However, it does not! As you stated, it combines parameters 1 and 2 (kept as a pair) on the one hand with parameter 3 on the other. – Laura Mar 06 '15 at 14:39
  • Another addition: instead of using {#} for the job number (which is not easy to relate back to the original input), I chose {1//}, which is the directory part of the first input path. `-out $exitdir/{1//}/file.extension` – Laura Mar 08 '15 at 10:37
  • If ribotagger.pl can output to stdout (maybe `-out -` ?), then `--results outdir` may be interesting to use instead, as that will create a standardized hierarchy of subdirs. – Ole Tange Mar 09 '15 at 00:04
  • I totally agree @Ole, GNU parallel has implemented that nicely and it is my method of preference. Using `-out - ` did not cross my mind; I will check later if it works. – Laura Mar 09 '15 at 12:41