Questions tagged [gnu-parallel]

GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU parallel can then split the input and pipe it into commands in parallel.
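As a rough illustration of the two modes described above (one job per input line, and splitting piped input across jobs), here is a minimal sketch. The inputs (a, b, c and the numbers 1 to 1000) and the variable names per_line and total are made up for the example, and the 1k block size is an arbitrary choice to force several blocks; it assumes GNU parallel (not the moreutils program of the same name) is on the PATH and does nothing otherwise.

```shell
#!/usr/bin/env bash
# Sketch only: inputs and variable names are illustrative assumptions.

# Make sure the installed "parallel" is GNU parallel before using its options.
if parallel --version 2>/dev/null | grep -q 'GNU parallel'; then
  # Mode 1: run one job per input line. {} is replaced by the line,
  # and -k (--keep-order) prints the output in input order.
  per_line=$(printf '%s\n' a b c | parallel -k echo "item: {}")

  # Mode 2: --pipe splits stdin into blocks (here roughly 1 kB each, cut at
  # line boundaries) and pipes each block into its own "wc -l" job.
  # awk then sums the per-block line counts.
  total=$(seq 1 1000 | parallel --pipe --block 1k wc -l |
          awk '{ s += $1 } END { print s }')

  printf '%s\n' "$per_line"
  printf 'total lines: %s\n' "$total"
else
  echo "GNU parallel is not installed; skipping the example" >&2
fi
```

The first form suits lists of files, hosts, or URLs; the second suits a single large stream where each job reads its share from stdin.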

733 questions
8 votes
1 answer

How to feed a large array of commands to GNU Parallel?

I'm evaluating if GNU Parallel can be used to search files stored on a system in parallel. There can be only one file for each day of year (doy) on the system (so a maximum of 366 files per year). Let's say there are 3660 files on the system (about…
Say No To Censorship
  • 537
  • 1
  • 15
  • 32
7 votes
1 answer

How do xargs and gnu parallel differ when parallelizing code?

Here's a basic question. I'm curious how xargs and GNU parallel differ when parallelizing code. And are there use cases in which you'd use one over the other? I ask this because I have seen answers to parallelization questions where using…
Kleber Noel
  • 303
  • 3
  • 9
7 votes
3 answers

Installing GNU-Parallel: How to enter "will cite" from docker build?

In the Dockerfile:
from debian:latest
RUN apt-get install parallel
RUN parallel --citation <<< "will cite"
The docker build never completes because of this entry step. How can I install parallel?
Chris
  • 28,822
  • 27
  • 83
  • 158
7 votes
2 answers

Is there a way to run one job many times using GNU parallel?

I can see how easy it is to run a parallel job on multiple inputs, but is there no way to run the same job in parallel multiple times other than putting the command in a file and repeating it many times? parallel -j+0 ::: './dosomejob.sh' but tell…
Jon Scobie
  • 490
  • 4
  • 10
7 votes
1 answer

Add more cores to the parallel running processes in GNU parallel

I'm using GNU parallel to run several jobs in parallel. I was wondering whether GNU parallel includes a command that allows adding n more cores to the processes that are already running in parallel. Do you have any suggestions?
CafféSospeso
  • 1,101
  • 3
  • 11
  • 28
7 votes
2 answers

GNU Parallel: split file into children

Goal: Use GNU Parallel to split a large .gz file into children. Since the server has 16 CPUs, create 16 children. Each child should contain, at most, N lines. Here, N = 104,214,420 lines. Children should be in .gz format. Input: File name:…
fire_water
  • 1,380
  • 1
  • 19
  • 33
7 votes
1 answer

Split up text and process in parallel

I have a program that generates lots (terabytes) of output and sends it to stdout. I want to split that output and process it in parallel with a bunch of instances of another program. It can be distributed in any way, as long as the lines are left…
Craden
  • 145
  • 6
7 votes
1 answer

GNU parallel - keep output colored

I'm parallelizing some zero-argument commands (scripts or similar) that produce colored output, but when parallel prints the output it's colorless (unless I use the -u option, but then it's unordered). Is there a way to change that? The line I'm using…
elad
  • 349
  • 2
  • 11
7 votes
1 answer

GNU parallel with variable sequence?

I want to run a program prog in parallel using GNU's parallel, with an argument that takes a value in a sequence. For example: parallel prog ::: {1..100} However, I don't know the upper bound of the sequence in advance, so I would like to be able…
a06e
  • 18,594
  • 33
  • 93
  • 169
7 votes
1 answer

GNU parallel with rsync

I'm trying to run some instances of rsync in parallel using ssh with GNU parallel. The command I'm running is like this: find /tmp/tempfolder -type f -name 'chunck.*' | sort | parallel --gnu -j 4 -v ssh -i access.pem user@server echo {}\; rsync…
Daivid
  • 627
  • 3
  • 12
  • 22
7 votes
2 answers

split STDIN to multiple files (and compress them if possible)

I have a program (gawk) that outputs a stream of data to its STDOUT. The data processed is literally tens of GBs. I don't want to persist it in a single file but rather split it into chunks and potentially apply some extra processing (like compression)…
msciwoj
  • 772
  • 7
  • 23
7 votes
2 answers

How can I stop gnu parallel jobs when any one of them terminates?

Suppose I am running N jobs with the following gnu parallel command: seq $N | parallel -j 0 --progress ./job.sh How can I invoke parallel to kill all running jobs and accept no more as soon as any one of them exits?
Ant Man
  • 83
  • 1
  • 4
6 votes
1 answer

How to Install GNU Parallel on Windows 10 using git-bash

Has anyone been able to successfully use GNU Parallel on Windows 10 with git-bash? Is it possible? - If so, how? Background: I'm having trouble installing GNU Parallel and using it, and it got me thinking - maybe git-bash is holding me back? I'm…
Jeremy Iglehart
  • 4,281
  • 5
  • 25
  • 38
6 votes
2 answers

How to lint all files recursively while printing only the files that have an error?

I want to lint all the files in the current directory (recursively) while printing out only the files that have an error, and set a variable to 1 to be used after the linting is finished. #!/bin/bash lint_failed=0 find . -path ./vendor -prune -o -name…
Michael Delle
  • 73
  • 1
  • 7
6 votes
2 answers

GNU parallel: execute one command in parallel for all files in a folder

I am trying to parallelize particle simulations with different parameters to save some time. I therefore wanted to use GNU parallel to run a bash script for the different parameters. The script reads a file and then performs the simulation, e.g.: $bash…
Physicus
  • 61
  • 1
  • 3