
The best I can explain is by example.

  1. Create named pipe: mkfifo pipe
  2. Create 5 text files, a.txt, b.txt, c.txt, d.txt, e.txt (they can hold any contents for this example)
  3. cat [a-e].txt > pipe

Of course, because nothing has opened the pipe for reading yet, cat blocks and the terminal appears to hang.

  4. In another terminal, tail -fn +1 pipe

All content is fed through the pipe (consumed and printed out by tail) as expected.

But instead of simply printing the consumed content, I would like the content of each text file to be handed to its own instance of a command (5 separate processes), because the command can only handle one file at a time.

Something like python some-script.py < pipe, but creating 5 separate instances (one instance per text file).

Is there any way for the consumer to differentiate between objects coming in? Or does the data get appended and read all as one stream?

dnk8n
  • The data comes out as one continuous undifferentiated stream. *IF* you can trust the writer to behave correctly then you could have it insert additional data into the stream in such a way that the reader can discern the boundaries between the different files and then treat them as separate entities. – ottomeister May 02 '20 at 23:37
  • Thanks. I came to this conclusion by trial and error, but it really helped me that someone more knowledgeable was able to confirm my findings. I am thinking along the lines of getting the writer to base64 encode the content into a single line with `base64 -w 0`, so that the reader can decode line by line and pass each decoded line to a process. Hitting some brick walls but getting there slowly, I think. – dnk8n May 04 '20 at 06:54

1 Answer


Here is a potential solution that might be generally applicable (I look forward to hearing whether there are more efficient alternatives).

First, an example python script that the question describes:

some-script.py:

import sys

lines = sys.stdin.readlines()
print('>>>START-OF-STDIN<<<')
print(''.join(lines))
print('>>>END-OF-STDIN<<<')

The goal is for the consumer to be able to tell apart the individual pieces of text arriving through the pipe.

An example of the producers:

cat a.txt | echo $(base64 -w 0) | cat > pipe &
cat b.txt | echo $(base64 -w 0) | cat > pipe &
cat c.txt | echo $(base64 -w 0) | cat > pipe &
cat d.txt | echo $(base64 -w 0) | cat > pipe &
cat e.txt | echo $(base64 -w 0) | cat > pipe &

A description of the producers:

  • cat reads the entire file and writes it to the next stage of the pipeline
  • echo does not read stdin itself; base64, running inside the command substitution $(base64 -w 0), inherits it. echo then prints the encoded text followed by a newline and pipes it to cat
  • base64 -w 0 encodes the full file contents into a single line (-w 0 disables line wrapping)
  • the final cat forwards the full line to pipe. Without it, the consumer doesn't work properly (try for yourself)
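For comparison, the same producer idea can be sketched in Python. This is only an illustration, not a replacement for the shell commands above: the names encode_record and send_file are made up for this example, and the FIFO path "pipe" is taken from the question. Each file becomes one base64 line, handed to the pipe in a single write call.

```python
import base64
import os


def encode_record(data: bytes) -> bytes:
    """Return data as a single base64 line terminated by a newline."""
    # b64encode never emits line breaks, which mirrors `base64 -w 0`.
    return base64.b64encode(data) + b"\n"


def send_file(path: str, fifo_path: str = "pipe") -> None:
    """Send one file to the FIFO as one newline-terminated record."""
    with open(path, "rb") as src:
        record = encode_record(src.read())
    # Opening a FIFO for writing blocks until a reader has opened it.
    fd = os.open(fifo_path, os.O_WRONLY)
    try:
        # One write call per record; large records may still be split
        # by the kernel or only partially written.
        os.write(fd, record)
    finally:
        os.close(fd)
```

Calling send_file("a.txt") would block until a consumer opens the pipe, just like the shell producers.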

An example of the consumer:

tail -fn +1 pipe | while read line ; do (echo $line | base64 -d | cat | python some-script.py) ; done

A description of the consumer:

  • tail -fn +1 pipe follows (-f) pipe from the beginning (-n +1) without exiting process and pipes content to read within a while loop
  • while there are lines to be read (assuming base64 encoded single lines coming from producers), each line is passed to a sub-shell
  • In each subshell:
    • echo pipes the line to base64 -d (-d stands for decode)
    • base64 -d pipes the decoded line (which now spans multiple lines potentially) to cat
    • cat concatenates the lines and pipes it as one to python some-script.py
  • Finally, the example python script is able to read line by line in exactly the same way as cat example.txt | python some-script.py
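The consumer loop can also be sketched in Python. The function names split_records and run_per_record are hypothetical, chosen for this example, and the sketch assumes every line on the stream is one complete base64 record. Unlike tail -f, iterating over a FIFO opened with open("pipe", "rb") stops at end of file once all writers have closed it.

```python
import base64
import subprocess
import sys
from typing import BinaryIO, Iterator, List


def split_records(stream: BinaryIO) -> Iterator[bytes]:
    """Yield the decoded payload of each non-empty base64 line."""
    for line in stream:
        line = line.strip()
        if line:
            yield base64.b64decode(line)


def run_per_record(stream: BinaryIO, argv: List[str]) -> List[int]:
    """Start one process per record, feeding the payload on its stdin."""
    codes = []
    for payload in split_records(stream):
        # Each record gets its own process, run one after another.
        result = subprocess.run(argv, input=payload)
        codes.append(result.returncode)
    return codes


if __name__ == "__main__":
    with open("pipe", "rb") as fifo:
        run_per_record(fifo, [sys.executable, "some-script.py"])
```

Because each record is decoded before the process starts, some-script.py sees exactly the bytes of one original file on its stdin.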

The above was useful to me when a host process did not have Docker permissions but could write to a FIFO (named pipe) file mounted as a volume into a container. Multiple instances of the consumer could potentially run in parallel. I think the above successfully differentiates the incoming content, so that an isolated process can handle each piece of content arriving through the named pipe.

An example of the Docker command involving pipe symbols, etc:

"bash -c 'tail -fn +1 pipe | while read line ; do (echo $line | base64 -d | cat | python some-script.py) ; done'"

dnk8n
  • Note that a write to the pipe is only atomic when it does not exceed PIPE_BUF bytes, so larger records from concurrent producers may be interleaved. See what is said about PIPE_BUF here - http://man7.org/linux/man-pages/man7/pipe.7.html – dnk8n May 04 '20 at 22:33
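To make the PIPE_BUF caveat concrete, here is a rough sketch that estimates whether a file of a given size still fits into one atomic write after base64 encoding. The helper names are invented for this example. It assumes the record is the base64 text plus one trailing newline, and it falls back to 4096 bytes (the Linux value given in the man page linked above) if the platform does not expose select.PIPE_BUF.

```python
import select

# select.PIPE_BUF exists on Unix; 4096 is the Linux value, used as a fallback.
PIPE_BUF = getattr(select, "PIPE_BUF", 4096)


def encoded_record_size(raw_len: int) -> int:
    """Size in bytes of a base64 line (plus newline) for raw_len input bytes."""
    # base64 turns every group of up to 3 input bytes into 4 output characters.
    return 4 * ((raw_len + 2) // 3) + 1


def fits_atomic_write(raw_len: int, pipe_buf: int = PIPE_BUF) -> bool:
    """True if the encoded record is no larger than pipe_buf bytes."""
    return encoded_record_size(raw_len) <= pipe_buf
```

With a 4096-byte limit, files up to 3069 bytes fit; anything larger would need a different framing or a lock around the write.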