cat files with filenames in .tsv

Question

I have file names provided in a tab separated file.

Ex:

file1 file2 file3
file4 file5
file6 file7 file8 file9 file10
file11 file12

......and so on.

I need to be able to do:

cat file1 file2 file3 > newfile1
cat file4 file5 > newfile2
cat file6 file7 file8 file9 file10 > newfile3

.....

The file has 140 lines in total, with multiple file names per row. Within each row I need to concatenate the files. Each file name is unique, so each new file needs a different name.

There are leading characters in each file prefix that I would like to use to rename. For example, (file1) A1-2_B1.txt and (file2) A1-4_B1.txt would be concatenated to file A1_B1.txt

Any suggestion? All help is appreciated.

I know I can use

 (cat inputs.txt | xargs -n 140 cat) >> newfile.txt

to use a file with filenames per individual line to make a single new file. However, I am having trouble with the multiple files per line, to make multiple new files.

I'm wondering if I put all output filenames into a text file, such as:

A1_B2.txt
A2_B3.txt
..etc...

and then used something like:

 (cat inputs.txt | cat) >> (cat outputs.txt)

would that work?

Letting us know what all you have tried so far will be appreciated too. — Technext, Jul 24 '14 at 15:59

glenn jackman · Answer 1 · 2014-07-24T17:44:42.897

Use awk to transform the file into a script, then pipe into a shell to execute:

awk '{print "cat", $0, ">", "newfile" ++c}' inputs.txt | sh

If you have an "output names" file that corresponds line-for-line with the input file, then

awk '{getline out < "outputs.txt"; print "cat", $0, ">", out}' inputs.txt | sh

Another approach, a variation on chepner's bash solution:

paste outputs.txt inputs.txt | while IFS=$'\t' read -a line; do
    cat "${line[@]:1}" > "${line[0]}"
done
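
The question also mentions deriving each output name from the input names (A1-2_B1.txt and A1-4_B1.txt become A1_B1.txt). A minimal sketch of that idea follows, assuming every input name has the shape `<group>-<number>_<rest>` and that the first name in a row decides the output name. The function names `derive_name` and `concat_rows` are made up for illustration and do not come from the thread.

```bash
#!/usr/bin/env bash
# Hypothetical helper: A1-2_B1.txt -> A1_B1.txt
# Assumes the name looks like <group>-<number>_<rest>.
derive_name() {
    local first=$1
    # ${first%%-*} keeps everything before the first "-"  (A1)
    # ${first#*_}  keeps everything after the first "_"   (B1.txt)
    printf '%s_%s\n' "${first%%-*}" "${first#*_}"
}

# Hypothetical driver: concatenate each tab-separated row of the file
# named by $1 into the name derived from that row's first entry.
concat_rows() {
    local names
    while IFS=$'\t' read -r -a names; do
        [ "${#names[@]}" -gt 0 ] || continue   # skip empty rows
        cat "${names[@]}" > "$(derive_name "${names[0]}")"
    done < "$1"
}

derive_name A1-2_B1.txt   # prints A1_B1.txt
```

If the real names follow a different pattern, only the two parameter expansions in `derive_name` would need to change.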

chepner · Answer 2 · 2014-07-24T17:52:50.800


i=0
while IFS=$'\t' read -a names; do
    cat "${names[@]}" > "newfile$((++i))"
done < inputs.txt

should do the trick. Each line is read into an array, and the contents of that array are used as the argument list to cat.

If you have a separate file that contains the output names:

while IFS=$'\t' read -a names;
      read output <&3; do
    cat "${names[@]}" > "$output"
done < inputs.txt 3< outputs.txt
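
Because this loop pairs rows purely by position, a mismatch in line counts between the two files may go unnoticed. One possible pre-flight check is sketched below; the function name `same_line_count` is hypothetical and not part of the answer.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check: succeeds only if both files
# contain the same number of lines.
same_line_count() {
    [ "$(wc -l < "$1")" -eq "$(wc -l < "$2")" ]
}

# Usage sketch (file names as in the answer):
# same_line_count inputs.txt outputs.txt || { echo "line counts differ" >&2; exit 1; }
```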

edited Jul 24 '14 at 17:52

answered Jul 24 '14 at 16:08


Is there a way to give an additional file with names for the newfiles with your script? – st.ph.n Jul 24 '14 at 16:31
I also only get one output file: the one from the last row listed in the inputs. – st.ph.n Jul 24 '14 at 16:48
Strange; what is the name of the single output file? – chepner Jul 24 '14 at 16:59
newfile1 is all. for last line in inputs. – st.ph.n Jul 24 '14 at 17:46
Are you sure you have `i=0` *before* the loop, not inside the loop? – chepner Jul 24 '14 at 17:51

st.ph.n · Accepted Answer · 2014-07-24T18:15:29.343


Using the answers from chepner and glenn jackman, I was able to write a script that does exactly what I needed. With glenn jackman's version of chepner's loop I could name the new files from an outputs.txt, and I then modified chepner's original loop to remove the input files, which are no longer needed once the new files have been made.

Here's the combination of both:

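#!/usr/bin/bash
# Usage: script outputs.txt inputs.txt
# $1: output names, one per line; $2: tab-separated input names, one row per output

echo "Getting inputs and output file names:"

# Each pasted row is: output name, then that row's input names
paste "$1" "$2" | while IFS=$'\t' read -r -a line; do
        cat "${line[@]:1}" > "${line[0]}"
done

wait

echo "Removing old files"

# Delete the input files now that they have been merged
while IFS=$'\t' read -r -a names; do
        rm "${names[@]}"
done < "$2"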


edited Jul 24 '14 at 18:15

answered Jul 24 '14 at 18:08


To remove the old files, just add `rm "${line[@]:1}"` into the first loop. – glenn jackman Jul 24 '14 at 18:14
that makes sense. thanks. can you comment and edit your answer? and I'll remove mine.. – st.ph.n Jul 24 '14 at 18:16
No, leave this one: removing the files was not a requirement in your question. – glenn jackman Jul 24 '14 at 18:17
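
The suggestion in the comments, doing the removal inside the first loop, could look roughly like the sketch below. The function name `merge_and_remove` is made up, and chaining `rm` with `&&` so that inputs are deleted only when `cat` succeeds is an added precaution, not something the comment asks for.

```bash
#!/usr/bin/env bash
# Sketch: merge each row, then delete its inputs only if the merge succeeded.
# $1 = file of output names, $2 = tab-separated file of input names
# (same argument order as the script in this answer).
merge_and_remove() {
    local line
    paste "$1" "$2" | while IFS=$'\t' read -r -a line; do
        # line[0] is the output name; the remaining elements are the inputs
        cat "${line[@]:1}" > "${line[0]}" && rm -- "${line[@]:1}"
    done
}

# Usage sketch: merge_and_remove outputs.txt inputs.txt
```

This assumes no output name is also listed as an input on the same row; otherwise the freshly written file would be removed as well.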

score 0 · Answer 4 · answered Jul 24 '14 at 19:11


After many different solutions, here is another one, based on Perl:

perl -nle 'system qq{cat $_ > out$.}' < input.txt

(This works as long as the filenames don't contain spaces.)

answered Jul 24 '14 at 19:11

clt60

