
I am working on a cluster remotely and have submitted a few thousand jobs. Some jobs crash early. I need to move the output files of those jobs (smaller than 1 KB) to another folder and start them again. I guess find can move them with something like:

find . -size -1000c -exec mv {} ../crashed \;
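
A slight variant (a sketch, assuming GNU find and GNU mv, which accepts -t) that only matches regular files and moves them in batches:

find . -type f -size -1000c -exec mv -t ../crashed {} +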

but I also need to restart these crashed jobs. The output files sit in a bunch of subfolders inside the output folder, and I need the folder name and the file name (without extension) separately.

I guess sed and/or awk can do this easily, but I am not sure how. By the way, I am working in the Bash shell.

I am trying to use cut, which seems to be working:

for i in $( find . -size -1000c )
do
    FOLDER=$(echo "${i%.*}" | cut -d'/' -f2)
    FILENAME=$(echo "${i%.*}" | cut -d'/' -f3)
done
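
For context, here is a minimal sketch of the whole move-and-restart loop I have in mind, assuming paths like ./folder/file.ext with no whitespace in the names; resubmit_job is a hypothetical placeholder for whatever command restarts a job from the folder and base name (qsub, sbatch, ...):

for i in $( find . -type f -size -1000c )
do
    FOLDER=$(echo "${i%.*}" | cut -d'/' -f2)    # first path component: the job's folder
    FILENAME=$(echo "${i%.*}" | cut -d'/' -f3)  # file name without its extension
    mv "$i" ../crashed/                         # set the crashed output aside
    resubmit_job "$FOLDER" "$FILENAME"          # hypothetical: replace with your scheduler's submit command
done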

But wouldn't it be better to use sed or awk? And how?

maynak

1 Answer


sed is a stream editor, and since you're not changing anything, I wouldn't use it in this case. You could use awk instead of cut like this:

FOLDER=$(echo "${i%.*}" | awk -v FS="/" '{ print $2 }')

where -v FS="/" sets the awk variable FS (the field separator) to a slash, much like the -d option in cut, and print $2 tells awk to print only the second field.

Same goes for the other instruction you have there. In your case what you have to do is simple enough, so cut actually cuts it :D
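
If you want both pieces from a single awk call, here is a minimal sketch, assuming ./folder/file.ext paths with no whitespace in the names; sub() strips the extension from the third field:

# both fields in one awk call, read into the two shell variables
read FOLDER FILENAME < <(echo "$i" | awk -F'/' '{ sub(/\.[^.]*$/, "", $3); print $2, $3 }')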

I usually use awk for more complicated tasks, involving multiple files and/or mathematical computations.

Edit:

Note that I'm using gawk here (the awk implementation by GNU). I'm not sure you can pass a variable value with the -v option in other implementations; they'll have their own way to do it.

blue