I'm new to AWS Glue and I'm trying to use a crawler to extract database tables from some log files. The problem is that the first row of each file differs from the rest. I have defined a custom Grok classifier that works well as long as I delete the first row, but when I use the original log files the classifier no longer matches and the crawler falls back to the default Glue classifier (which obviously doesn't work for me). I tried adding 'skip.header.line.count'=1 to the table properties (and setting the crawler to not update the schema), but that doesn't work either. Is there a way to express "skip the first line" in the Grok pattern?
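To make the manual workaround concrete, here is a minimal sketch of "delete the first row" as a preprocessing step, assuming each file is small enough to hold in memory. The function name `drop_first_line` and the sample log lines are hypothetical and only for illustration; this is not a Glue feature, just what I would like to avoid doing by hand.

```python
def drop_first_line(text: str) -> str:
    """Return the text without its first line (the row that differs)."""
    # partition splits at the first newline only; everything after it is the body.
    # If there is no newline at all, the body is an empty string.
    _first_row, _newline, body = text.partition("\n")
    return body


# Hypothetical sample: one odd first row followed by regular log lines.
sample = (
    "#Version: 1.0\n"
    "2020-01-01 12:00:00 INFO started\n"
    "2020-01-01 12:00:01 INFO done\n"
)
print(drop_first_line(sample), end="")
```

With the first row removed like this, every remaining line has the same shape, which is the case where my custom Grok classifier matches.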