I have a very large TSV file. The first line holds the headers. Each following line holds data fields separated by tabs, with a double tab wherever a field was blank. Non-blank fields can contain alphanumerics, with or without punctuation marks.
For example:
Field1<tab>Field2<tab>FieldN<newline>
The fields may contain spaces, punctuation or alphanumerics. The only things that always hold are:
- each field is followed by a tab except the last one
- the last field is followed by a newline
- blank fields are empty but, like all other fields, are still followed by a tab. Together with the previous field's tab, this shows up as a double tab.
I've tried many combinations of pattern matching in Lua and never get it quite right. Typically the fields with punctuation (time and date fields) are the ones that trip me up.
I need the blank fields (the ones with double-tab) preserved so that the rest of the fields are always at the same index value.
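To make the requirement concrete, here is a minimal sketch of the desired behaviour. It assumes a blank field is simply an empty string between two consecutive tabs, and that the line has already had its trailing newline removed. The helper name `split_tsv_line` and the sample data are hypothetical, chosen only for illustration.

```lua
-- Hypothetical helper: split one TSV line on tabs, keeping empty fields
-- so that every column stays at the same index.
local function split_tsv_line(line)
  local fields = {}
  -- Appending a tab lets one pattern treat every field, including the
  -- last, as "zero or more non-tab characters followed by a tab".
  -- Using * instead of + is what keeps the blank fields.
  for field in (line .. "\t"):gmatch("([^\t]*)\t") do
    fields[#fields + 1] = field
  end
  return fields
end

-- Made-up sample line: a date, a blank field, then a time with punctuation.
local fields = split_tsv_line("2012-03-04\t\t12:30:05, ok")
for i, field in ipairs(fields) do
  print(i, "[" .. field .. "]")
end
```

With this sample the blank field comes back as an empty string at index 2, so the time field stays at index 3. Because the pattern only looks for tabs, punctuation inside date and time fields needs no special handling. For a very large file, the same function could be applied to each line from `io.lines`, which yields lines without the trailing newline, rather than reading the whole file at once.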
Thanks in advance!