I have some .txt files in dir1:
file_name_FOO31101.txt
file_name_FOO31102.txt
file_name_FOO31103.txt
file_name_FOO31104.txt
and some related foo.txt files in dir2:
file_name_FOO31101_foo.txt
file_name_FOO31102_foo.txt
file_name_FOO31103_foo.txt
file_name_FOO31104_foo.txt
I ultimately want to be able to call a program for pairs of files such that:
Iteration 1
program_call \
--txt file_name_FOO31101.txt,file_name_FOO31102.txt \
--foo file_name_FOO31101_foo.txt,file_name_FOO31102_foo.txt \
--bar file_name_FOO31101_bar.txt,file_name_FOO31102_bar.txt
Iteration 2
program_call \
--txt file_name_FOO31103.txt,file_name_FOO31104.txt \
--foo file_name_FOO31103_foo.txt,file_name_FOO31104_foo.txt \
--bar file_name_FOO31103_bar.txt,file_name_FOO31104_bar.txt
I.e.
file_name_FOO31101.txt,file_name_FOO31102.txt
file_name_FOO31103.txt,file_name_FOO31104.txt
but not
file_name_FOO31102.txt,file_name_FOO31103.txt
An answer to a question I posted yesterday got me started:
#!/bin/bash
txt_files=/path/to/txt
foo_files=/path/to/foo/files
set -- "$txt_files"/*.txt
[[ -e $1 || -L $1 ]] || { echo "No .txt files found in $txt_files" >&2; exit 1; }
# $# = number of positional parameters (here: the files matched by the glob above)
while (( $# > 1 )); do
    stem=$(basename "$1")
    output_base=$(echo "$stem" | cut -d '_' -f 1,2,3) # split on '_' and save ID
    echo "-> Processing pairs of txt files : $1,$2"
    # Add files to arrays (quoted so paths with spaces stay intact)
    txt1+=("$1")
    txt2+=("$2")
    shift; shift
done
(( $# )) && echo "Left over file $1 still exists"
And then (not knowing a better way of doing this) I repeat the same loop for the foo files in dir2:
set -- "$foo_files"/*_foo.txt
[[ -e $1 || -L $1 ]] || { echo "No foo.txt files found in $foo_files" >&2; exit 1; }
# $# = number of positional parameters (here: the files matched by the glob above)
while (( $# > 1 )); do
    stem=$(basename "$1")
    output_base=$(echo "$stem" | cut -d '_' -f 1,2,3) # split on '_' and save ID
    # Add files to arrays (quoted so paths with spaces stay intact)
    foo1+=("$1")
    foo2+=("$2")
    echo "-> Processing pairs of foo.txt files : $1,$2"
    shift; shift
done
(( $# )) && echo "Left over file $1 still exists"
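The two loops differ only in the glob and in the arrays they fill, so the duplication could probably be folded into one function. This is only a minimal sketch: collect_pairs is a hypothetical helper name, the literal file names stand in for the globs, and the nameref (`local -n`) assumes bash 4.3 or newer.

```bash
#!/bin/bash
# Sketch only: one function instead of one while loop per directory.
# collect_pairs is a hypothetical helper; `local -n` needs bash 4.3+.
collect_pairs() {
    local -n first=$1 second=$2   # names of the two output arrays
    shift 2                       # remaining arguments are the files
    while (( $# > 1 )); do
        first+=("$1")
        second+=("$2")
        shift 2
    done
    if (( $# )); then
        echo "Left over file $1 still exists" >&2
    fi
}

txt1=() txt2=() foo1=() foo2=()

# Demo with literal names; in the real script these would be the globs,
# e.g. collect_pairs txt1 txt2 "$txt_files"/*.txt
collect_pairs txt1 txt2 a.txt b.txt c.txt d.txt
collect_pairs foo1 foo2 a_foo.txt b_foo.txt c_foo.txt d_foo.txt

printf 'txt pairs: %s\n' "${#txt1[@]}"
printf 'foo pairs: %s\n' "${#foo1[@]}"
```

Adding a third file type would then be one more collect_pairs call rather than another copy of the loop.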
And then iterate over one of the arrays (all must be the same length) and call the program:
# Seeing as all arrays must be the same length, loop over one and print out corresponding values for others
for ((i=0;i<${#txt1[@]};++i)); do
printf "program_call --txt %s,%s --foo %s,%s\n" "${txt1[i]}" "${txt2[i]}" "${foo1[i]}" "${foo2[i]}"
done
Which seems to basically work, printing:
program_call --txt /path/to/txt/file_name_FOO31101.txt,/path/to/txt/file_name_FOO31102.txt --foo /path/to/foo/files/file_name_FOO31101_foo.txt,/path/to/foo/files/file_name_FOO31102_foo.txt
program_call --txt /path/to/txt/file_name_FOO31103.txt,/path/to/txt/file_name_FOO31104.txt --foo /path/to/foo/files/file_name_FOO31103_foo.txt,/path/to/foo/files/file_name_FOO31104_foo.txt
However, I suspect that repeating the same while loop for each directory is a poor way of achieving this result, particularly if I want to add more options to my program call (e.g. file_name_FOO31101_bar.txt ...).
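One alternative I can imagine is to loop over the .txt files only and derive the related file names from each stem, so that no directory needs its own loop. The sketch below is untested against my real data and rests on an assumption: every companion file is named <stem>_<suffix>.txt. The temporary directories and empty files exist only to make the example self-contained.

```bash
#!/bin/bash
# Sketch only: pair the txt files once and build the foo names from the stems.
# Assumption: companions are named <stem>_foo.txt; the demo files below are
# created in a temporary directory purely for illustration.
tmp=$(mktemp -d)
txt_files=$tmp/txt
foo_files=$tmp/foo
mkdir -p "$txt_files" "$foo_files"
for id in 31101 31102 31103 31104; do
    touch "$txt_files/file_name_FOO$id.txt" "$foo_files/file_name_FOO${id}_foo.txt"
done

cmds=()
set -- "$txt_files"/*.txt
while (( $# > 1 )); do
    stem1=$(basename "$1" .txt)   # e.g. file_name_FOO31101
    stem2=$(basename "$2" .txt)
    foo1=$foo_files/${stem1}_foo.txt
    foo2=$foo_files/${stem2}_foo.txt
    if [[ -e $foo1 && -e $foo2 ]]; then
        cmds+=("program_call --txt $1,$2 --foo $foo1,$foo2")
    else
        echo "Missing foo file for $1 or $2" >&2
    fi
    shift 2
done
if (( $# )); then
    echo "Left over file $1 still exists" >&2
fi

printf '%s\n' "${cmds[@]}"
rm -rf "$tmp"
```

A --bar option would then need only two more derived names inside the same loop, under the same naming assumption.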
Is this a sensible way of going about this?