I need to read a .vcf.gz file in Pentaho. I can already read it with the "Text file input" step by setting "compressed" to "GZ" in the "Content" tab.
- First, I need to skip the header lines (basically every row that begins with #).
- Second, I need to add a new column that contains the file name on every row.
E.g.
My file is:
#header
#header
#header
# chr pos ref alt
chr1 3 A A
What I want is:
chr1 3 A A id_001 (the last value is taken from the name of the file being read)
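To make the expected result concrete, here is a minimal Python sketch of the logic I am after, outside of Pentaho. It is only an illustration, not a Pentaho solution. The function name `read_vcf_rows` is made up, and I am assuming that the file is named something like id_001.vcf.gz, that the id is the file name without the .vcf.gz extension, and that the columns are tab-separated.

```python
import gzip
import os


def read_vcf_rows(path, sep="\t"):
    """Return the data rows of a gzipped VCF file with a file id appended.

    Assumptions (not part of any Pentaho API): the id is the file name
    without its ".vcf.gz" suffix, and columns are separated by `sep`.
    """
    name = os.path.basename(path)
    # Strip the extension, e.g. "id_001.vcf.gz" -> "id_001"
    file_id = name[: -len(".vcf.gz")] if name.endswith(".vcf.gz") else name

    rows = []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            # Skip blank lines and every header line starting with '#'
            if not line or line.startswith("#"):
                continue
            # Append the file id as a new last column
            rows.append(line.split(sep) + [file_id])
    return rows
```

Called on the example file above, this should return one row with five values, the last one being the id derived from the file name. What I need is the equivalent of this inside a Pentaho transformation.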
How can I achieve this?