
I have a tab-delimited file with a timestamp in the third field, which I need to convert to epoch seconds in bash.

Sample Input:

xyz@gmail.com   SALE    2017-04-26 12:47:27     30.0    1       201704
xyz@gmail.com   SALE    2017-04-26 12:46:15     20.0    2       201704
xyz@gmail.com   PAYBACK 2017-04-18 08:02:31     95.0    3       201704
xyz@gmail.com   SEND    2017-04-18 08:00:37     4800.0  4       201704
xyz@gmail.com   SEND    2017-04-17 14:59:34     4900.0  5       201704

I tried

awk 'BEGIN {IFS="\t"} {$3=system("date -d \""$3"\" '+%s'");print}' file

which comes closest, but it prints the epoch on one line and then prints the record again on a new line with the timestamp field set to zero. I need everything in a single record, with the third field replaced by the epoch.

Ed Morton
Drunk Knight

1 Answer


With GNU awk:

$ cat tst.awk
BEGIN { FS=OFS="\t" }
{
    $3 = mktime(gensub(/[-:]/," ","g",$3))
    print
}

$ awk -f tst.awk file
xyz@gmail.com   SALE    1493228847      30.0    1       201704
xyz@gmail.com   SALE    1493228775      20.0    2       201704
xyz@gmail.com   PAYBACK 1492520551      95.0    3       201704
xyz@gmail.com   SEND    1492520437      4800.0  4       201704
xyz@gmail.com   SEND    1492459174      4900.0  5       201704
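To see why the substitution is needed: mktime() takes a datespec of the form "YYYY MM DD HH MM SS", so the - and : characters in the timestamp have to become spaces first. The sketch below shows only that intermediate string. It uses the portable gsub() rather than gensub(), which is specific to GNU awk, and the timestamp is copied from the first sample row.

```shell
# Show the datespec string that mktime() would receive.
# gsub() edits the variable s in place; gensub() (gawk only) returns a new string instead.
datespec=$(awk 'BEGIN { s = "2017-04-26 12:47:27"; gsub(/[-:]/, " ", s); print s }')
printf '%s\n' "$datespec"
```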

With other awks:

$ cat tst.awk
BEGIN { FS=OFS="\t" }
{
    cmd = "date -d \"" $3 "\" \047+%s\047"
    if ( (cmd | getline line) > 0 ) {
        $3 = line
    }
    close(cmd)
    print
}

$ awk -f tst.awk file
xyz@gmail.com   SALE    1493228847      30.0    1       201704
xyz@gmail.com   SALE    1493228775      20.0    2       201704
xyz@gmail.com   PAYBACK 1492520551      95.0    3       201704
xyz@gmail.com   SEND    1492520437      4800.0  4       201704
xyz@gmail.com   SEND    1492459174      4900.0  5       201704
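Both versions interpret the timestamp in the local time zone, so the epoch values you get depend on your TZ setting. A minimal sketch of that effect, assuming GNU date (for -d) and using POSIX-style TZ strings chosen here purely for illustration:

```shell
# Same timestamp, two zones: the epoch differs by the UTC offset.
# Assumes GNU date; the second TZ string describes US Central time with DST rules.
ts='2017-04-26 12:47:27'
epoch_utc=$(TZ=UTC0 date -d "$ts" +%s)
epoch_central=$(TZ='CST6CDT,M3.2.0,M11.1.0' date -d "$ts" +%s)
printf 'UTC:        %s\nUS Central: %s\n' "$epoch_utc" "$epoch_central"
```

The output shown above is consistent with a UTC-5 local zone, so expect different numbers if your machine is set to another zone.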

Regarding your script: there is no builtin awk variable named IFS (the input field separator is FS), system() returns the exit status of the command it ran, not its stdout, and you cannot include ' characters in a '-delimited script invoked from the shell (the octal escape \047 is used above instead).
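The behaviour described in the question follows from those points, and a small stand-alone sketch makes it visible. Here echo stands in for the date command, and the variable names are only illustrative.

```shell
# system(): the command writes straight to stdout; awk only gets its exit status.
# This is why the epoch appeared on its own line and $3 became 0.
status_demo=$(awk 'BEGIN { rc = system("echo hello"); print "rc=" rc }')
printf '%s\n' "$status_demo"

# cmd | getline: the command's output is captured into an awk variable instead.
capture_demo=$(awk 'BEGIN { cmd = "echo hello"; cmd | getline out; close(cmd); print "out=" out }')
printf '%s\n' "$capture_demo"

# \047 is the octal escape for a single quote, usable inside a '-delimited script.
quote_demo=$(awk 'BEGIN { print "\047+%s\047" }')
printf '%s\n' "$quote_demo"
```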

Regarding wanting to do it "in-place": no UNIX editor really edits in place, but in GNU awk you can use -i inplace to avoid specifying the temporary file name yourself. With any UNIX command, though, you can just do cmd file > tmp && mv tmp file.
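As a rough sketch of the temporary-file pattern, the snippet below works in a scratch directory so nothing real is overwritten. The awk body is a stand-in (it just uppercases the second field) so the example runs with any awk; in practice you would use awk -f tst.awk in its place. The directory and file names are hypothetical.

```shell
# Scratch directory with a one-line, tab-delimited sample file.
dir=$(mktemp -d)
printf 'xyz@gmail.com\tsale\t30.0\n' > "$dir/file"

# Write to a temp file and replace the original only if awk succeeded (&&).
awk 'BEGIN { FS=OFS="\t" } { $2 = toupper($2); print }' "$dir/file" > "$dir/tmp" &&
    mv "$dir/tmp" "$dir/file"

result=$(cat "$dir/file")
printf '%s\n' "$result"

# With GNU awk 4.1 or later the same effect is available as:
#   gawk -i inplace -f tst.awk file
```

The && matters: if awk fails, the original file is left untouched instead of being replaced by a partial or empty temp file.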

Note that this is one of the very few appropriate uses for getline - see http://awk.freeshell.org/AllAboutGetline for other valid uses and, most importantly, caveats and reasons not to use it unless absolutely necessary.

Ed Morton
  • Thanks for your help, Ed. This solution works. I understood the in-place part and why it could not be done here, but can you please explain, or give me a link about, what the issue with my script was? – Drunk Knight May 08 '17 at 06:13
  • I am somehow not able to edit my comment to remove "here", but I meant that I understood why it could not be done. I was just hoping for a link to understand the issue with my script, and the workaround to it, in greater detail. Nonetheless, thanks for your help, I am good to go. :-) – Drunk Knight May 08 '17 at 06:34
  • You're welcome, I recommend you get the book Effective Awk Programming, 4th Edition, by Arnold Robbins. It will explain everything you need to know to start using awk and more. – Ed Morton May 08 '17 at 06:45