Questions tagged [gnu-parallel]

GNU parallel is a shell tool for executing jobs in parallel using one or more computers. A job can be a single command or a small script that has to be run for each of the lines in the input. The typical input is a list of files, a list of hosts, a list of users, a list of URLs, or a list of tables. A job can also be a command that reads from a pipe. GNU parallel can then split the input and pipe it into commands in parallel.
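The two modes described above can be sketched roughly as follows. This is a minimal illustration, assuming GNU parallel is installed and on the PATH; the function names `per_line_demo` and `pipe_demo` and the sample inputs are made up for the example and do not come from any question on this page.

```shell
#!/usr/bin/env bash
# Assumes GNU parallel is installed; the inputs below are illustrative only.

# Mode 1: one job per input line. {} is replaced by the line.
# -k keeps the output in the same order as the input.
per_line_demo() {
  printf '%s\n' alpha beta gamma | parallel -k echo "processing {}"
}

# Mode 2: pipe mode. stdin is split into blocks of 250 records (lines),
# and each block is piped into its own `wc -l` process.
pipe_demo() {
  seq 1 1000 | parallel --pipe -N 250 wc -l
}

per_line_demo
pipe_demo
```

In the first function each input line becomes an argument to a separate `echo`; in the second, no arguments are passed at all and each `wc -l` just reads its share of the stream, so the per-block counts should add up to the 1000 lines fed in.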

733 questions
6
votes
1 answer

bash append file from multiple thread

I'm working with big data and trying to parallelize my processing functions. I can use several threads and process every user in a different thread (I have 200k users). Every thread should append the first n lines of a file that it produces, in an output…
Progeny
  • 672
  • 1
  • 11
  • 25
6
votes
3 answers

How to run grep in parallel on single lines from a list

I am a beginner with bash. I need some help in making this job more efficient. while read line do echo "$line" file="Species.$line" grep -A 1 "$line"…
Danny
  • 63
  • 4
6
votes
3 answers

GNU Parallel - redirect output to a file with a specific name

In bash I am running GnuPG to decrypt some files and I would like the output to be redirected to a file having the same name, but a different extension. Basically, if my file is named file1.sc.xz.gpg the file which comes out after running the…
Crista23
  • 3,203
  • 9
  • 47
  • 60
6
votes
1 answer

How to execute one command in multiples directories using "GNU parallel"?

Every day I have to update a bunch of repositories and also execute another command in some of them (from Carton, the Perl module dependency manager). I always use a loop to do that, but I want to do it in parallel with GNU parallel if possible, but I…
Noob_Number_1
  • 725
  • 5
  • 20
6
votes
2 answers

Parallel Iterating IP Addresses in Bash

I'm dealing with a large private /8 network and need to enumerate all webservers which are listening on port 443 and have a specific version stated in their HTTP HEADER response. First I was thinking to run nmap with connect scans and grep myself…
skrskrskr
  • 83
  • 1
  • 5
6
votes
2 answers

parallel grep pattern multiple files

I'm searching successfully with this command: search for a list of suspicious IPs from a text file ips.txt in a logs directory (compressed files). root@yop# find /mylogs/ -exec zgrep -i -f ips.txt {} \; > ips.result.txt I now want to use parallel…
mastarah
  • 61
  • 1
  • 2
6
votes
2 answers

GNU parallel output progress while output to file

I have a simple bash script to run: cat full_path.csv | parallel --progress -j +0 'echo -n {},; pdfgrep -c [^_] {};' > path_count.csv Parallel's progress indicator "--progress", writes into the file path_count.csv. I only want echo {} and pdfgrep…
Alvin Das
  • 95
  • 2
  • 2
  • 6
5
votes
1 answer

GNU parallel: assign one thread for each node (directories and sub* directories) of an entire tree from a start directory

I would like to benefit from the full potential of the parallel command on macOS (it seems there exist 2 versions, GNU and Ole Tange's version, but I am not sure). With the following command: parallel -j8 find {} ::: * I will have a big performance if…
user1773603
5
votes
2 answers

Running bash jobs in parallel with predefined order prioritization

I want to run 3 jobs (A, B, C) on 2 cores of a machine with >2 cores. I know that: runtime(A)>runtime(C) runtime(B)>runtime(C) It is unknown in advance if runtime(A)>runtime(B) or runtime(A)…
ddrichel
  • 53
  • 3
5
votes
1 answer

Grepping a single regular expression for a very large file

file.xml is a large 74G file, I have to grep a single regular expression against it as fast as possible. I'm trying to do this by using GNU parallel: parallel --pipe --block 10M --ungroup LC_ALL=C grep -iF "test.*pattern" < file.xml How can I…
5
votes
3 answers

zsh and parallel: How to use functions. It says command not found

I have a script file filename: test_sem_zsh.sh main() { echo "Happy day" } export -f main sem --id testing --fg main I am trying to run it using zsh $ zsh test_sem_zsh.sh test_sem_zsh.sh:export:4: invalid option(s) zsh:1: command not found:…
Santhosh
  • 9,965
  • 20
  • 103
  • 243
5
votes
1 answer

GNU Parallel - multiple commands

I'd like to run several long-running processes on several inputs. E.g.: solver_a problem_1 solver_b problem_1 ... solver_b problem_18 solver_c problem_18 I know how to run multiple arguments for the same command - that is the core use case. This is…
Leo
  • 2,775
  • 27
  • 29
5
votes
1 answer

GNU Parallel: how to pass job id to command

Suppose I am running gnu parallel on an array of items received from standard in, and split according to some criteria: cat content | parallel -j 4 my_command How do I access the job number such that I can pass into command, as an argument, the job…
Chris
  • 28,822
  • 27
  • 83
  • 158
5
votes
1 answer

R and GNU Parallel - How to limit number of cores used

(New to GNU Parallel) My aim is to run the same Rscript, with different arguments, over multiple cores. My first problem is to get this working on my laptop (2 real cores, 4 virtual), then I will port this over to one with 64 cores. Currently: I…
Hector Haffenden
  • 1,360
  • 10
  • 25
5
votes
1 answer

Parallel executing of commands with pipe by GNU Parallel?

Given a task with several commands combined by a pipe: cat input/file1.json | jq '.responses[0] | {labelAnnotations: .labelAnnotations}' > output/file1.json Now, there are thousands of input JSON files, and I'd like to leverage GNU Parallel to…
Drake Guan
  • 14,514
  • 15
  • 67
  • 94