
I have a shell script that reads data from a YAML file and then does some processing. The YAML file looks like this:

view:
    schema1.view1:/some-path/view1.sql
    schema2.view2:/some-path/view2.sql
tables:
    schema1.table1:/some-path/table1.sql
    schema2.table2:/some-path/table2.sql
end

I want the output as -

schema:schema1
object:view1
fileloc:/some-path/view1.sql

schema:schema2
object:view2
fileloc:/some-path/view2.sql

schema:schema1
object:table1
fileloc:/some-path/table1.sql

schema:schema2
object:table2
fileloc:/some-path/table2.sql

This is how I'm reading the YAML file using the shell script -

#!/bin/bash

input=./file.yaml

# Use the variable defined above ($input); the original referenced an undefined $file.
viewData=$(sed '/view/,/tables/!d;/tables/q' "$input" | sed '1d;$d')
tableData=$(sed '/tables/,/end/!d;/end/q' "$input" | sed '1d;$d')

so viewData will have this data -

schema1.view1:/some-path/view1.sql
schema2.view2:/some-path/view2.sql

and tableData will have this data -

schema1.table1:/some-path/table1.sql
schema2.table2:/some-path/table2.sql

And then I'm using a for loop to separate the schema, object and SQL file -

for line in $tableData; do
        field=$(echo "$line" | cut -d: -f1)
        schema=$(echo "$field" | cut -d. -f1)
        object=$(echo "$field" | cut -d. -f2)
        fileLoc=$(echo "$line" | cut -d: -f2)

        echo "schema=$schema"
        echo "object=$object"
        echo "fileloc=$fileLoc"
done

But I'll have to do the same thing again for the views. Is there a way in a shell script, such as an array or something else, to use the same loop for both the view and the table data?

Any help would be appreciated. Thanks!

Chuck

1 Answer

Using (g)awk:

awk -F "[:.]" '/:$/{ s=$1 }{ gsub(" ",""); if($3!=""){ print "schema="$1; print "object="$2; print "fileloc="$3 }}' yaml
  • -F "[:.]" sets the field separator to the regular expression [:.], so each input line is split on : or .
  • /:$/{ s=$1 } stores the group (view or tables) you are currently reading in s. The variable is not used in this command, so this rule can be ignored.
  • gsub(" ",""); deletes all spaces in the input line.
  • if... When the line has three fields (checked by a non-empty third field), print the info.

output:

schema=schema1
object=view1
fileloc=/some-path/view1
schema=schema2
object=view2
fileloc=/some-path/view2
schema=schema1
object=table1
fileloc=/some-path/table1
schema=schema2
object=table2
fileloc=/some-path/table2
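
The output above loses the .sql extension, because . is also a field separator and the extension ends up in a fourth field. A minimal sketch of one way to keep the path whole, assuming every entry is indented and contains a single : (the function name split_entries is hypothetical, not part of the original answer):

```shell
# Hypothetical helper; assumes entries are indented and contain a single ":".
split_entries() {
  awk -F ':' '
    /^[ \t].*:/ {
      gsub(/[ \t]/, "")          # strip indentation; awk re-splits the fields
      split($1, name, /[.]/)     # "schema1.view1" -> name[1], name[2]
      print "schema="  name[1]
      print "object="  name[2]
      print "fileloc=" $2        # path is kept whole, including ".sql"
    }' "$1"
}
```

It would be called as split_entries ./file.yaml. Splitting on : first and only then splitting the name on . leaves any dots in the path untouched.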

EDIT: Adding the objectType to the output:

awk -F "[:.]" '/:$/{ s=$1 }{ gsub(" ",""); if($3!=""){ print "objectType="$s; "schema="$1; print "object="$2; print "fileloc="$3 }}' yaml

But I do see that I made a mistake....

I expected the regular expression /:$/ to match a line that ends with a :, but for some reason it does not. (I will have to do some more research to look into that.)
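
One detail that may be relevant here, offered as an observation rather than a confirmed diagnosis: the command above writes the variable as $s. In awk, $ followed by an expression is a field reference, and a non-numeric string such as view converts to 0, so $s yields $0, the whole line. A minimal sketch (the sample line is taken from the question; demo_field_ref is a hypothetical name):

```shell
# s holds a non-numeric string, so $s is evaluated as $0 (the whole record),
# while plain s gives the stored value.
demo_field_ref() {
  printf 'schema1.view1:/some-path/view1.sql\n' |
    awk -F '[:.]' '{ s = "view"; print "objectType=" $s; print "objectType=" s }'
}
demo_field_ref
```

The first printed line contains the full input line, the second contains view.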

A working workaround is:

awk -F "[:.]" 'NF==2{ s=$1 }NF>2{ gsub(" ",""); if($3!=""){ print "objectType="s; print "schema="$1; print "object="$2; print "fileloc="$3 }}' yaml
  • The line with view: has two fields, which makes NF return the value 2, and view is stored in the variable s.
  • When a line has more than two fields, the contents of the variables are printed.
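
To use a single loop for both views and tables, the parsed values can be emitted one record per line and read back in the shell. The following is a sketch under stated assumptions, not a definitive implementation: it assumes the file layout shown in the question, the function names list_objects and process_objects are hypothetical, and the object type is printed exactly as it appears in the file (view, tables).

```shell
#!/bin/bash

# Print one line per entry: "<type> <schema> <object> <fileloc>".
list_objects() {
  awk '
    /^[^ \t].*:[ \t\r]*$/ {          # section header such as "view:" or "tables:"
      sub(/:[ \t\r]*$/, "")
      type = $0
      next
    }
    /^[ \t]/ {                       # indented entry: schema.object:/path
      gsub(/[ \t\r]/, "")
      colon = index($0, ":")
      if (colon == 0) next           # skip lines without a ":"
      name = substr($0, 1, colon - 1)
      dot  = index(name, ".")
      print type, substr(name, 1, dot - 1), substr(name, dot + 1), substr($0, colon + 1)
    }' "$1"
}

# One loop handles every entry, regardless of its section.
process_objects() {
  list_objects "$1" | while read -r type schema object fileloc; do
    echo "objectType=$type schema=$schema object=$object fileloc=$fileloc"
  done
}
```

Calling process_objects ./file.yaml runs the loop body once per entry. The echo is a placeholder for the real processing. Because the loop sits on the right side of a pipe, variables set inside it are not visible after the loop ends.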
Luuk
  • Actually I need to loop the output values in another for loop. – Chuck Nov 12 '22 at 10:56
  • From your question it is unclear what is stopping you from looping over the output of this. – Luuk Nov 12 '22 at 11:21
  • sorry, please bear with me...this is kind of new for me. can you please tell me how do I loop over the output values? the same loop should work for table and view in separate iteration. – Chuck Nov 12 '22 at 11:24
  • [How can I loop over the output of a shell command?](https://stackoverflow.com/questions/35927760/how-can-i-loop-over-the-output-of-a-shell-command) ? – Luuk Nov 12 '22 at 11:26
  • what if I want the object type as well in the output? Like, `objectType=view schema=schema1 object=view1 fileloc=/some-path/view1.sql` Please help! – Chuck Nov 15 '22 at 12:27