A potential solution that might be generally applicable (I'm looking forward to hearing about more efficient alternatives).
First, an example Python script of the kind the question describes:

some-script.py:

```python
import sys
lines = sys.stdin.readlines()
print('>>>START-OF-STDIN<<<')
print(''.join(lines))
print('>>>END-OF-STDIN<<<')
```
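A quick check of what the script prints, assuming it is saved as some-script.py in the current directory:

```bash
printf 'hello\nworld\n' | python some-script.py
# >>>START-OF-STDIN<<<
# hello
# world
#
# >>>END-OF-STDIN<<<
```

The blank line before the end marker comes from the trailing newline of the input plus the newline that `print()` itself appends.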
The goal is for each producer's chunk of text coming through the pipe to remain distinguishable from the others.
An example of the producers:

```bash
cat a.txt | echo $(base64 -w 0) | cat > pipe &
cat b.txt | echo $(base64 -w 0) | cat > pipe &
cat c.txt | echo $(base64 -w 0) | cat > pipe &
cat d.txt | echo $(base64 -w 0) | cat > pipe &
cat e.txt | echo $(base64 -w 0) | cat > pipe &
```
A description of the producers:

- `cat` concatenates the entire file and then pipes it to `echo`
- `echo` displays the text coming from the sub-command `$(base64 -w 0)` and pipes it to `cat`
- `base64 -w 0` encodes the full file contents into a single line
- the final `cat` concatenates the full line before redirecting its output to the pipe. Without it, the consumer doesn't work properly (try it for yourself)
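One thing the producer commands assume is that the named pipe already exists. A minimal setup sketch, assuming the FIFO is created with `mkfifo` in the working directory:

```bash
# Assumed setup (not shown above): create the FIFO once before starting
# any producers or the consumer.
mkfifo pipe

# A single producer; the write to the pipe blocks until a reader
# (the consumer below) opens it.
cat a.txt | echo $(base64 -w 0) | cat > pipe &
```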
An example of the consumer:

```bash
tail -fn +1 pipe | while read line ; do (echo $line | base64 -d | cat | python some-script.py) ; done
```
A description of the consumer:

- `tail -fn +1 pipe` follows (`-f`) the pipe from its beginning (`-n +1`) without exiting the process, and pipes the content to `read` within a `while` loop
- while there are lines to be read (assuming base64-encoded single lines coming from the producers), each line is passed to a sub-shell
- in each sub-shell:
  - `echo` pipes the line to `base64 -d` (`-d` stands for decode)
  - `base64 -d` pipes the decoded line (which now potentially spans multiple lines) to `cat`
  - `cat` concatenates the lines and pipes them as one stream to `python some-script.py`
- finally, the example Python script is able to read line by line in exactly the same way as `cat example.txt | python some-script.py` (an end-to-end sketch follows this list)
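To make that concrete, a hedged end-to-end sketch with a hypothetical two-line a.txt; the file contents and expected output are illustrative, not taken from the original setup:

```bash
# Hypothetical input file with two lines
printf 'foo\nbar\n' > a.txt

# With the consumer already running against the pipe, a single producer:
cat a.txt | echo $(base64 -w 0) | cat > pipe

# should make the consumer print:
# >>>START-OF-STDIN<<<
# foo
# bar
#
# >>>END-OF-STDIN<<<
```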
The above was useful to me when a host process did not have Docker permissions but could write to a FIFO (named pipe) file mounted as a volume into a container. Potentially, multiple instances of the consumer could run in parallel. I think the above successfully differentiates the incoming content, so that the isolated process can handle what arrives through the named pipe.
An example of the Docker command involving the pipe symbols, etc.:

```bash
"bash -c 'tail -fn +1 pipe | while read line ; do (echo $line | base64 -d | cat | python some-script.py) ; done'"
```