Questions tagged [gnu-parallel]

GNU parallel is a shell tool for executing jobs in parallel using one or more computers.

A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU parallel can then split the input and pipe it into commands in parallel.
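The two modes described above (one job per input line, and a job that reads from a pipe) might look roughly like this. It is a minimal sketch, not taken from the tag wiki: the file names and the `echo` and `paste` commands are placeholders chosen for illustration, and the script simply skips the demo if GNU parallel is not installed.

```shell
#!/usr/bin/env bash
# Illustrative sketch; the input values and commands are made up for the demo.

# Skip quietly if the installed "parallel" is missing or is not GNU parallel.
if ! parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
    echo "GNU parallel not found; skipping demo" >&2
    exit 0
fi

# Mode 1: one job per input line. {} is replaced by the line.
# -k keeps the output in input order instead of completion order.
per_line=$(printf '%s\n' a.txt b.txt c.txt | parallel -k echo "processing {}")
echo "$per_line"

# Mode 2: the job reads from a pipe. --pipe splits stdin into chunks,
# and -N2 puts two records (lines) into each chunk.
piped=$(seq 1 6 | parallel -k --pipe -N2 paste -sd, -)
echo "$piped"
```

Without `-k`, the lines may come back in whatever order the jobs finish, which is usually fine for independent jobs but makes the output harder to compare between runs.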

733 questions
3 votes, 1 answer

Is it possible to parallelize awk writing to multiple files through GNU parallel?

I am running an awk script which I want to parallelize through GNU parallel. This script demultiplexes one input file to multiple output files depending on a value on each line. The code is the following: #!/usr/bin/awk -f BEGIN{ FS=OFS="\t" } { …
gc5 • 9,468 • 24 • 90 • 151
3 votes, 1 answer

parallelizing nested for loop with GNU Parallel

I am working in Bash. I have a series of nested for loops that iteratively look for the presence of three lists of 96 barcode sequences. My goal is to find each unique combination of barcodes; there are 96x96x96 (884,736) possible combinations.…
Paul • 656 • 1 • 8 • 23
3 votes, 1 answer

How to run curl commands parallel using GNU parallel

I just recently started programming in bash. I want to run a curl command 'n' times, where n is user input, and I want to run them in parallel. I came across GNU Parallel. The curl code for running n times is ip1="some ip" ip2="some ip" ip3="some ip" for…
Lucky • 359 • 1 • 8
3 votes, 1 answer

Parallel: How to reference multiple arguments from a function

I have this function and I need it to reference multiple arguments from a function using GNU parallel. foo () { cd ${HOME}/sh/xxx/xxx/xxx/folder_with_scripts bash -H $1 #replace with echo in test run {echo $1 is being echoed} bash -H $2…
3 votes, 1 answer

Optimising my script code for GNU parallels

I have a script which successfully queries an API, but it is very slow. It will take around 16 hours to get all the resources. I looked at how I could optimise it, and I thought that using GNU parallel (installed on macOS via Brew, version 20180522)…
maqueiouseur • 89 • 1 • 7
3 votes, 1 answer

PBS: GNU parallel: hosts allocated vary, multi CPU job, multiple jobs to some hosts

With PBSpro I can request resources to run my job. My parallel cluster job boils down to running the same file multiple times, each time with a different index / job ID. Each task spawns its own sub-processes and each task in total uses 4 CPUs. This…
Jurgen Strydom • 3,540 • 1 • 23 • 30
3 votes, 3 answers

Why does zsh expand globs for me in a bash script using GNU parallel?

In a bash script, I have a command using rsync: #!/usr/bin/bash -e ... parallel rsync --exclude '*to?be?deleted*' ... --files-from some_file /auto $instance_ip:/somewhere_else/ According to rsync's documentation, their --exclude field has a…
OneRaynyDay • 3,658 • 2 • 23 • 56
3 votes, 1 answer

Why does GNU parallel affect script speed?

I have some Fortran script. I compile with gfortran and then run as time ./a.out. My script completes, and outputs the runtime as, real 0m36.037s user 0m36.028s sys 0m0.004s i.e. ~36 seconds Now suppose I want to run this script multiple times, in…
3 votes, 1 answer

npm install subdirectories using gnu parallel

I am trying to install a set of sub-directories from the parent dir using GNU parallel. I'd like to run certain commands for all directories. Installing ls -d -- */ | grep -v 'node_modules' | parallel "npm i" Removing node_modules ls -d -- */ |…
ThomasReggi • 55,053 • 85 • 237 • 424
3 votes, 1 answer

increment var in gnu parallel

How can I use an increment variable within parallel? Note the $int variable in the output filename (prefix). I realize order can change, and that is fine, but what's useful is to have integers prefixing the output for downstream work (the length of…
lcb • 87 • 6
3 votes, 2 answers

GNU Parallel: Argument list too long when calling function

I created a script to verify a (big) number of items and it was doing the verification in a serial way (one after the other) with the end result of the script taking about 9 hours to complete. Looking around about how to improve this, I found GNU…
oswcab • 105 • 6
3 votes, 3 answers

Use GNU parallel to parallelise a bash for loop

I have a for loop which runs a Python script ~100 times on 100 different input folders. The Python script is most efficient on 2 cores, and I have 50 cores available. So I'd like to use GNU parallel to run the script on 25 folders at a time. Here's…
roblanf • 1,741 • 3 • 18 • 24
3 votes, 1 answer

can't execute gnu parallel when called from cron, but works from command line

I have this command in cron on an Amazon EC2 AMI Linux (CentOS) instance: 10 9 * * * php -f /var/scripts/schtask/schtask.php Inside this program, it runs this line using the exec() function calling GNU parallel: parallel -j-1 <…
raphael75 • 2,982 • 4 • 29 • 44
3 votes, 1 answer

Gnu Parallel and --link argument

Hi, I'm very new to Linux and learning to use the terminal and bash. Currently I'm running through the GNU Parallel Tutorial. I've come to the section that talks about linking arguments with the --link :::+ If I try using link, the terminal says…
adam Wadsworth • 774 • 1 • 8 • 26
3 votes, 1 answer

use GNU parallel to parallelize a multi-threaded command

I just wrote a Python script which involves multi-threading, something like: python myScript.py -cpu_n 5 -i input_file To run the command for my hundreds of input files, I am generating a list (commands.list) of commands for each one: …
Ezekiel Kuo • 155 • 7