I have a script "epsmat_hdf5_merge.py" that merges contents of several files. However, those files are in their individual folders, each named with a number (0001 0002 ...). I am using the most primitive method to identify the files in the folders:

epsmat_hdf5_merge.py q0001/eps0mat.h5 q0002/eps0mat.h5 q0003/eps0mat.h5 q0004/eps0mat.h5 q0005/eps0mat.h5 q0006/eps0mat.h5 q0007/eps0mat.h5 q0008/eps0mat.h5 q0009/eps0mat.h5 q0010/eps0mat.h5

Note here that the command is "epsmat_hdf5_merge.py" followed by all the .h5 files in their respective folders.

I cannot use a simple seq loop because

for i in $(seq -f '%04g' 1 999)
do
epsmat_hdf5_merge.py q$i/eps0mat.h5
done

will simply do the following:

    epsmat_hdf5_merge.py q0001/eps0mat.h5
    epsmat_hdf5_merge.py q0002/eps0mat.h5
    epsmat_hdf5_merge.py q0003/eps0mat.h5
...

which is the script followed by only one file per call, so the merge script has nothing to merge at any step.

Any idea how to handle this?

Endnote:

Also, if the total number of folders (999) becomes a variable (var), what would the syntax look like, given that brace expansion has trouble handling $var?

Jacek
  • Could you please post code that works as you describe? The current code does not do what you state it does, so we cannot advise on any improvement because we don't know what you are doing. – Poshi Mar 24 '20 at 09:53
  • Again, the code you posted does not do what you say. Please, post the real code you are using to get your actual results along with the expected results. – Poshi Mar 24 '20 at 10:50
  • the actual code I'm using is the first set, it gets what I want but it is clumsy. The second code field is a general idea of what I want, but it does not work because it is not pooling all the xxx/eps0mat.h5 behind one single epsmat_hdf5_merge.py Can't see which part is unclear.. – Jacek Mar 24 '20 at 10:58

2 Answers


Not sure, but I think you are looking for

epsmat_hdf5_merge.py q{0001..0030}/eps0mat.h5

You should adjust 0001 and 0030 to the actual numbers you are interested in. Brace expansion can also cover missing or extra numbers. All of this only works where zero-padded brace expansion is supported, which is not the case in old bash versions.
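To see why this runs the script only once, here is a minimal sketch, assuming bash 4 or newer. The shell expands the braces before the command starts, so every path ends up as a separate argument of one command. The array name `args` and the range 0001 to 0005 are made up for illustration, and `echo` stands in for the real script.

```shell
#!/usr/bin/env bash
# Zero-padded brace expansion: assumes bash 4 or newer.
# The braces are expanded by the shell before the command runs,
# so one command receives every path as a separate argument.
args=( q{0001..0005}/eps0mat.h5 )

# How many arguments would the script receive?
echo "number of arguments: ${#args[@]}"

# echo stands in for the real script; it only prints the command line.
echo epsmat_hdf5_merge.py "${args[@]}"
```

Brace expansion is purely textual, so it generates the paths whether or not the folders exist.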

Poshi

Brace Expansion.

for i in q{0001..999}; do 
  echo epsmat_hdf5_merge.py $i/eps0mat.h5
done

Using find to process everything in one batch (bash 4+ only, because of the zero-padded brace expansion):

find q{0001..999}/ -type f -name '*.h5' -exec echo epsmat_hdf5_merge.py {} +

If your bash is older than version 4, try:

find q[0-9][0-9][0-9][0-9]/ -type f -name '*.h5' -exec echo epsmat_hdf5_merge.py {} +
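The same glob can also be handed to the script directly, without find. The following is a sketch under assumptions: it builds a throwaway directory tree with three made-up folders so that it is self-contained, and `echo` stands in for the real script. A glob expands only to paths that exist, and the shell returns the matches in sorted order.

```shell
#!/usr/bin/env bash
# Throwaway test tree (hypothetical folders) so the sketch is self-contained.
tmp=$(mktemp -d)
mkdir -p "$tmp"/q0001 "$tmp"/q0002 "$tmp"/q0010
touch "$tmp"/q0001/eps0mat.h5 "$tmp"/q0002/eps0mat.h5 "$tmp"/q0010/eps0mat.h5
cd "$tmp" || exit 1

# The glob matches existing paths only and expands in sorted order;
# all matches become arguments of a single command.
files=( q[0-9][0-9][0-9][0-9]/eps0mat.h5 )

# echo stands in for the real script; it only prints the command line.
echo epsmat_hdf5_merge.py "${files[@]}"
```

This works in older bash versions too, since it does not rely on zero-padded brace expansion.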

Edit: As mentioned in the comments, you may want to replace the fixed upper bound in

{0001..999}  

with a variable:

{0001..$var}

However, this does not work in bash, because bash performs brace expansion before variable expansion. As far as I know, it only works in zsh.
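The order of expansions can be checked with a small experiment. This is a sketch assuming bash 4 or newer; the variable names `literal` and `expanded` are made up for illustration.

```shell
#!/usr/bin/env bash
var=3

# Brace expansion runs first and sees the literal text "$var", which is
# not a valid sequence bound, so the braces are left untouched. Only
# afterwards is $var replaced by its value.
literal=$(echo q{0001..$var})
echo "$literal"

# With constant bounds the sequence expands as expected.
expanded=$(echo q{0001..0003})
echo "$expanded"
```

The first `echo` prints the braces unexpanded with the variable's value inside them, which is why find then looks for a directory with that literal name.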

A workaround is to build the list of folders with a C-style for loop.

var=999
array=()

for ((i = 1; i <= var; i++)); do
  printf -v num '%04d' "$i"
  array+=("q$num/")
done

find "${array[@]}" -type f -name '*.h5' -exec echo epsmat_hdf5_merge.py {} +
  • Remove the echo once you are happy with the output.
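If the file name inside every folder is known in advance, find is not strictly needed: the loop can build the full paths and pass them to the script in a single call. This is a sketch, not a drop-in replacement. The value of `var` and the array name `files` are made up for illustration, `echo` again stands in for the real script, and unlike find this version does not check that the files exist.

```shell
#!/usr/bin/env bash
var=12          # hypothetical number of q-folders
files=()

# Build zero-padded paths q0001/eps0mat.h5 ... in numeric order.
for ((i = 1; i <= var; i++)); do
  printf -v path 'q%04d/eps0mat.h5' "$i"
  files+=("$path")
done

# One call with every path as an argument.
# echo stands in for the real script; drop it to run the merge.
echo epsmat_hdf5_merge.py "${files[@]}"
```

Because the array is filled in loop order, the arguments reach the script in ascending numeric order.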
Jetchisel
  • Thanks Jetchisel, but this command is still doing `epsmat_hdf5_merge.py 0001/eps0mat.h5 ;epsmat_hdf5_merge.py 0002/eps0mat.h5;....` instead of `epsmat_hdf5_merge.py 0001/eps0mat.h5 0002/eps0mat.h5 .....` – Jacek Mar 24 '20 at 10:33
  • Do you want to process everything at once? – Jetchisel Mar 24 '20 at 10:36
  • yes. the script is in essence merging all the files given in the command – Jacek Mar 24 '20 at 10:37
  • Please update your question to clarify what you want to do. – Jetchisel Mar 24 '20 at 10:38
  • updated the question to clarify the issue. Do you mind explaining what each part of the find command does? It is not showing any output and trying to understand which part is not working. – Jacek Mar 24 '20 at 10:54
  • I think the zero-padding on brace expansion requires `bash` v4+ by the way. – Mark Setchell Mar 24 '20 at 10:57
  • I think OP only wants to run his `epsmat_hdf5_merge.py` script ONE time, and that one execution should have all the filenames as parameters to the single execution. – Mark Setchell Mar 24 '20 at 11:00
  • @Jacek do you have bash3 or higher? – Jetchisel Mar 24 '20 at 11:04
  • I'm running things on the school's server, How to check? the usual seq -w works fine though. and Yes, Mark understands it correctly – Jacek Mar 24 '20 at 11:05
  • @MarkSetchell that is what find is for, to process in batch yes? – Jetchisel Mar 24 '20 at 11:07
  • You will run OP's program once for each file, i.e. lots of times. OP wants to run once only `epsmat_hdf5_merge.py q{0001..999}/...` – Mark Setchell Mar 24 '20 at 11:10
  • I don't think it's the padding issue. The command looks for files qxxxx/eps0mat.h5, but does nothing when those files are found. Both the bash 4+ and bash 4- solutions are giving the same output, which is nothing.. – Jacek Mar 24 '20 at 11:12
  • btw, I updated my question such that now the folders are named qxxxx instead of xxxx since this is what I have. It shouldn't affect the syntax much i suppose. – Jacek Mar 24 '20 at 11:14
  • The command with q added now gives the following error: `find: q{0001..0030}/: No such file or directory` – Jacek Mar 24 '20 at 11:25
  • @Jetchisel, I debugged a little and turns out there was a mistake in my files. I was running the command in the wrong folder. Your first code `find q{0001..999}/ -type f -name '*.h5' -exec echo epsmat_hdf5_merge.py {} +` works. Now, if I have a variable instead of 999, like `var=999; find q{0001..$var}/ ........`, the system reflects that it has a problem finding the files q{0001..}. How does the syntax work to include the variable? – Jacek Mar 25 '20 at 06:20
  • Unfortunately that works only in `zsh`. In bash you can use a C-style for loop and save the results in an array, or use eval... – Jetchisel Mar 25 '20 at 06:37
  • Something like: `var=999; mapfile -t array < <(eval echo q{0001..$var}/); find "${array[@]}" -type f ...` – Jetchisel Mar 25 '20 at 06:43
  • Which gives an error, at least on this side. I have updated the answer. – Jetchisel Mar 25 '20 at 07:40
  • The code in the answer works, but the sequence of $var in the array is jumbled for some reason: `epsmat_hdf5_merge.py ./q0011/eps0mat.h5 ./q0019/eps0mat.h5 ./q0028/eps0mat.h5 ./q0002/eps0mat.h5 ./q0030/eps0mat.h5 ./q0008/eps0mat.h5 ./q0015/eps0mat.h5 ./q0013/eps0mat.h5 ./q0003/eps0mat.h5 ./q0001/eps0mat.h5 ./q0009/eps0mat.h5 ./q0027/eps0mat.h5 ./q0029/eps0mat.h5 ./q0016/eps0mat.h5 ./q0018/eps0mat.h5 ./q0022/eps0mat.h5 ./q0014/eps0mat.h5 ./q0006/eps0mat.h5 ./q0004/eps0mat.h5 ./q0020/eps0mat.h5 ./q0007/eps0mat.h5 ./q0024/eps0mat.h5 ./q0025/eps0mat.h5 ......` – Jacek Mar 27 '20 at 04:26
  • This is what I get when running that: http://sprunge.us/ZNkhlw – Jetchisel Mar 27 '20 at 11:11
  • Interesting... But this error is definitely not because of the commands. I'll figure this out on my own then. – Jacek Mar 30 '20 at 08:56