
I'm trying to parallelize some tasks that need to be processed in real time, so I was using --line-buffer. I was processing very long strings, but then I noticed that they sometimes hit the line length limit, causing a "command line too long" error, so I decided to pipe them instead.

But when I use the --pipe option, --line-buffer stops working.
I tested with simpler commands, and the issue still occurs:

# Returns instantly, but passes the data as arguments
(echo 1; echo 2; sleep 100) | parallel -j1 --lb cat
# Passes the data on STDIN, but only after 100 seconds
(echo 1; echo 2; sleep 100) | parallel -j1 --lb --pipe cat

I'm using parallel 20190422 on Arch Linux.

Andre Augusto
  • I'm getting a different result for your first example. I get `cat: 1: No such file or directory` immediately, but I don't get `cat: 2: No such file or directory` until the sleep is over. Can you confirm that you get both immediately? – that other guy Jun 05 '19 at 16:29
  • Yep, I get both immediately – Andre Augusto Jun 05 '19 at 16:31

1 Answer

# Passes the data on STDIN, but only after 100 seconds
(echo 1; echo 2; sleep 100) | parallel -j1 --lb --pipe cat

This happens because, with --pipe, GNU Parallel reads its input in blocks of 1 MB by default, so it keeps waiting for more input. Only after 100 seconds is STDIN closed, and only then does GNU Parallel get an EOF.

You can probably do something like this:

(echo 1; echo 2; echo 3; sleep 100) | parallel -j1 --block 1 -N1 --lb --pipe 'date;cat'

But if the lines are much longer, then increase --block.
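
For comparison, the line-at-a-time behavior the question expects can be sketched in plain shell, without GNU Parallel. This is only an illustration of reading one record at a time instead of waiting for a full block: `feed_lines` is a hypothetical helper name, it runs the jobs sequentially (similar to -j1), and it assumes every record is a single newline-terminated line.

```shell
#!/bin/sh
# feed_lines is a hypothetical helper, not part of GNU Parallel.
# It reads standard input one line at a time and pipes each line to
# the standard input of the given command as soon as the line arrives,
# so the data never appears on the command line.
feed_lines() {
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "$line" | "$@"
    done
}

# Each line is handed to cat as soon as it is read; the loop then
# waits for the producer to close its end of the pipe.
(echo 1; echo 2; sleep 2) | feed_lines cat
```

Because each line travels over a pipe rather than as an argument, the command line length limit does not apply, but nothing runs in parallel here.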

Ole Tange
  • Your example didn't work; it outputs `parallel: Warning: A record was longer than 1. Increasing to --blocksize 3.` and waits for 100 s – Andre Augusto Jun 06 '19 at 01:51
  • Your edited answer prints only the first one and then waits 100 seconds... I need to process lines between 10kb and 6Mb – Andre Augusto Jun 09 '19 at 19:38
  • You need to change your question then. The above starts jobs 1 and 2 immediately. It of course waits for job 3 (as it is waiting for standard input to close). It outputs 1 immediately. 2 will only be output when 3 is started (unless you use `--ungroup`, but I assume you need to process the output further). – Ole Tange Jun 10 '19 at 07:41