I am given an array of lines from a text file. They look similar to this, and will always be structured like this:

            Full         Tue Aug 27 10:59:43 2019                 1
     Incremental         Tue Aug 27 11:16:41 2019                 1
     Incremental         Tue Aug 27 11:25:28 2019                 1
     Incremental         Tue Aug 27 13:37:29 2019                 1

Based on the above output, I do not believe these three columns qualify as fixed width. The date string can and probably will change in length depending on the date, and column one contains 4 characters in row one but 11 characters in rows 2 through the end.

How can I parse the date from these lines, so my list is this instead:

Tue Aug 27 10:59:43 2019
Tue Aug 27 11:16:41 2019
Tue Aug 27 11:25:28 2019
Tue Aug 27 13:37:29 2019

I am sure grep or sed is probably the answer I need, I just don't know much about either.

Kevin
  • You can use `cut` if fixed width is sound there. – 178024 Aug 27 '19 at 18:15
  • @uprego I don't think it will be fixed length based on the date format in the text it would vary based on the month, and day we're on – Kevin Aug 27 '19 at 18:19
  • I don't quite understand your comment :/ but if you want to explain further, your question is a perfect canvas. :) – 178024 Aug 27 '19 at 18:26
  • question edited with this: Based on the above output, I do not believe these 3 columns qualify as fixed width... as you can see the date format can and will probably change based on the date string, as well, line one contains 4 characters in column one, while the same column contains 11 in rows 2 through end... – Kevin Aug 27 '19 at 18:33
  • I think I mighta got your point now. I hope someone who knows well the answers set can point to a relevant duplicate. If this gets tumbleweed just ping. – 178024 Aug 27 '19 at 18:39
  • I mean not _get tumbleweed_ literally as I've heard it's been killed, but if it _becomes tumbleweed_ figuratively. – 178024 Aug 27 '19 at 18:42
  • :) appreciate it. i think if I split each line based on a space delimiter, I may be able to piece together a "date string" based on the positional idx values in the resultant array... – Kevin Aug 27 '19 at 18:53
  • as a for instance: `_d_string=${_temp_arr[1]}" "${_temp_arr[2]}" "${_temp_arr[3]}" "${_temp_arr[4]}" "${_temp_arr[5]}` displays just that date string. however, it seems hackish... – Kevin Aug 27 '19 at 18:56
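
The split-on-whitespace idea from the last two comments can also be done without an intermediate array, by letting `read` assign each field to a named variable. This is only a sketch: `extract_date` and the variable names are hypothetical, and it assumes every line has exactly seven whitespace-separated fields (type, weekday, month, day, time, year, count).

```shell
# Hypothetical helper, not from the original post.
# Assumes the line is: type weekday month day time year count
extract_date() {
    # read splits on whitespace and ignores leading/trailing blanks;
    # "kind" and "count" receive the first and last columns and are unused.
    read -r kind dow mon day time year count <<EOF
$1
EOF
    printf '%s %s %s %s %s\n' "$dow" "$mon" "$day" "$time" "$year"
}

extract_date '     Incremental         Tue Aug 27 11:16:41 2019                 1'
# prints: Tue Aug 27 11:16:41 2019
```

Because the fields are named, a change in the width of any column does not matter, only the number of fields does.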

2 Answers

You can use sed with a regular expression to extract the date from each line.

Assuming your data is stored in a file named input:

sed -e 's/^\s\+\S\+\s\+\(.*\S\)\s\+\S\+$/\1/g' input 
Tue Aug 27 10:59:43 2019
Tue Aug 27 11:16:41 2019
Tue Aug 27 11:25:28 2019
Tue Aug 27 13:37:29 2019

The first part of the pattern, ^\s\+\S\+\s\+ (right after the s/ that starts the substitution), matches the beginning of a line: one or more whitespace characters, followed by one or more non-whitespace characters, followed again by one or more whitespace characters. E.g.:

'            Full         '
'     Incremental         '

Now let's look at the last part, \s\+\S\+$. This matches one or more non-whitespace characters at the end of the line, preceded by one or more whitespace characters. E.g.:

'                 1'

The middle part, \(.*\S\), is a capture group; its contents can be referenced as \1, which is called a backreference. The group matches everything after the first part, up to and including the last non-whitespace character before the last part. Here that is the date string.
The replacement is just \1, so the whole line is replaced by the captured middle part, and that is what gets printed.
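
Note that \s, \S and \+ are GNU extensions and may not work in other sed implementations. A more portable variant of the same idea, written with POSIX character classes and intervals, might look like the sketch below. The function name `strip_outer_columns` is hypothetical, and unlike the command above it also tolerates lines with no leading whitespace or with trailing whitespace.

```shell
# Hypothetical wrapper, not from the original answer.
# Reads lines on stdin, drops the first and last whitespace-separated
# columns, and prints what lies between them.
strip_outer_columns() {
    sed -e 's/^[[:space:]]*[^[:space:]]\{1,\}[[:space:]]\{1,\}\(.*[^[:space:]]\)[[:space:]]\{1,\}[^[:space:]]\{1,\}[[:space:]]*$/\1/'
}

printf '%s\n' '            Full         Tue Aug 27 10:59:43 2019                 1' |
    strip_outer_columns
# prints: Tue Aug 27 10:59:43 2019
```

With a file, the equivalent call would be `strip_outer_columns < input`. Lines that do not have at least three columns do not match and are printed unchanged.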

Thomas

Check if awk can help.

$ cat abc.txt
            Full         Tue Aug 27 10:59:43 2019                 1
     Incremental         Tue Aug 27 11:16:41 2019                 1
     Incremental         Tue Aug 27 11:25:28 2019                 1
     Incremental         Tue Aug 27 13:37:29 2019                 1
$ cat abc.txt  | awk '{print $2" "$3" "$4" "$5" "$6}'
Tue Aug 27 10:59:43 2019
Tue Aug 27 11:16:41 2019
Tue Aug 27 11:25:28 2019
Tue Aug 27 13:37:29 2019
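
The command above relies on the date always occupying fields 2 through 6. If that might not hold, one possible variation is to print every field except the first and the last, however many there are. This is a sketch under that assumption; `middle_fields` is a hypothetical name.

```shell
# Hypothetical helper, not from the original answer.
# Prints fields 2 .. NF-1 of each input line, joined by single spaces.
# Reads the files given as arguments, or stdin if there are none.
middle_fields() {
    awk '{
        out = ""
        for (i = 2; i < NF; i++)
            out = out (i > 2 ? " " : "") $i
        print out
    }' "$@"
}

printf '%s\n' '     Incremental         Tue Aug 27 13:37:29 2019                 1' |
    middle_fields
# prints: Tue Aug 27 13:37:29 2019
```

With the sample file this would be called as `middle_fields abc.txt`. A line with fewer than three fields produces an empty output line.
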
asktyagi
  • Why print the 3rd field twice? Also, you don't need `cat` and you can separate the fields with comma (output field separator defaults to space), e.g. `awk '{print $2,$3,$4,$5,$6}' abc.txt`. – Freddy Aug 28 '19 at 05:48
  • this consistently did the trick. thank you for the help – Kevin Sep 06 '19 at 17:56