Best way to filter output by date range

Question

I need to filter the rows of a CSV file by date.

The file is structured like this:

test121smith@example.com                                           active      01/24/11 10:04   07/23/23 16:56
test121johnson@example.com                                              active      04/07/14 15:56   04/23/21 04:02
test121doe@example.com                                               active      07/27/12 16:24   11/13/12 01:14
test121fritts@example.com                                             active      11/02/10 14:00   09/05/14 11:34
test121violet@example.com                                              active      05/19/11 18:11   03/25/15 12:22
test121brad@example.com                                              active      06/26/14 12:45   03/05/19 20:27

I was able to sort by date using:

awk '{print $1 "," $3 "," $5}' | sort -t "," -n -k 2.7,5 -k 2.8,5 -k 2.1,5 -k 2.2,5 -k 2.4,5 -k 2.5,5

This gives me the rows sorted by date.

Example:

test121smith@example.com,01/24/11,07/23/23
test121johnson@example.com,04/07/14,04/23/21
test121doe@example.com,07/27/12,11/13/12

Is there a way to filter this output by date? For example, print only the rows after 12/11/22, or only the rows before 12/11/22, for a given field or column?

What I tried:

grep -e '[3-9].$' -e '2[3-9]$' -e '12/[1-3]./22$' myfile.csv

This command filters on the third column ($3), so the output for this sample is:

test121smith@example.com,01/24/11,07/23/23

This worked, but only for one date in the third column, and I didn't really grasp what it does or how to change it for whatever date range is requested.

Thanks!

score 1 · Answer 1 · answered Jul 25 '23 at 07:39

You could use Miller, a nice CSV-aware command-line tool.

For example, you could run

mlr --nidx --repifs filter 'strptime($3,"%m/%d/%y")>strptime("11/13/12","%m/%d/%y")' input.csv

to keep only the records whose third field ($3) is later than 11/13/12, which gives

test121johnson@example.com active 04/07/14 15:56 04/23/21 04:02
test121brad@example.com active 06/26/14 12:45 03/05/19 20:27

Some notes:

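--nidx --repifs, to set the data format: index-numbered fields (toolkit style), with repeated field separators (here spaces) treated as one
filter, the verb that keeps only the records matching a condition
strptime, the function that parses a date string in the given format into seconds since the epoch, so two dates can be compared numerically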
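
If Miller is not installed, the same kind of comparison can be sketched in plain awk. This is not part of the Miller approach above, only a rough equivalent: the helper name date_between and its argument order are made up for illustration, and it assumes every year falls in the same century, so that rewriting MM/DD/YY as YYMMDD gives keys that compare chronologically.

```shell
# Hypothetical helper (not from the answer): print the rows of a
# whitespace-separated FILE whose MM/DD/YY date in column COL lies
# strictly between LO and HI.
# Assumption: all years are in the same century, so YYMMDD keys order correctly.
date_between() {  # usage: date_between COL LO HI FILE
    awk -v col="$1" -v lo="$2" -v hi="$3" '
        # Rewrite MM/DD/YY as YYMMDD so plain string comparison is chronological.
        function key(d,    p) { split(d, p, "/"); return p[3] p[1] p[2] }
        { k = key($col) }
        k > key(lo) && k < key(hi)
    ' "$4"
}

# Rows whose third field is later than 11/13/12 (upper bound left wide open):
# date_between 3 11/13/12 12/31/99 input.csv
```

Passing the column number as an argument makes it possible to filter on the fifth field in the same way, and narrowing LO and HI gives a true date range rather than a single cutoff.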

score 0 · Answer 2 · edited Jul 25 '23 at 07:41

Since you already have a sorted list, try the commands below. They match the date as a literal string, so the date has to occur in the file.

$ cat a.txt
test121smith@example.com,01/24/11,07/23/23
test121johnson@example.com,04/07/14,04/23/21
test121doe@example.com,07/27/12,11/13/12

Print all lines up to and including the first line that matches the string:

$ sed '/04\/23\/21/q' a.txt  # the slashes in the date must be escaped: 04\/23\/21
test121smith@example.com,01/24/11,07/23/23
test121johnson@example.com,04/07/14,04/23/21

Print all lines from the first matching line to the end of the file:

$ sed -n '/04\/23\/21/,$p' a.txt  # the slashes in the date must be escaped: 04\/23\/21
test121johnson@example.com,04/07/14,04/23/21
test121doe@example.com,07/27/12,11/13/12
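
The backslash escapes are easy to get wrong, and an "unknown command" error is likely what sed prints when they are left out. As a sketch, sed also accepts a different address delimiter when the opening one is preceded by a backslash, so the date can be passed unescaped. The helper names up_to_date and from_date are hypothetical, and both still assume that the file is sorted and that the date occurs literally in it.

```shell
# Hypothetical wrappers around the two sed commands above.
# \#...# makes '#' the address delimiter, so the slashes in the
# date need no escaping. DATE must not itself contain '#'.

# Print every line up to and including the first one containing DATE.
up_to_date() {  # usage: up_to_date DATE FILE
    sed "\\#${1}#q" "$2"
}

# Print every line from the first one containing DATE to the end of the file.
from_date() {   # usage: from_date DATE FILE
    sed -n "\\#${1}#,\$p" "$2"
}

# Example:
# up_to_date 04/23/21 a.txt
# from_date  04/23/21 a.txt
```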

edited Jul 25 '23 at 07:41

Esa Jokinen

answered Jul 24 '23 at 15:02

asktyagi

When I try either sed command I get "sed: -e expression #1, char 5: unknown command: `2'" I am using version 4.2.2 of sed. – dj423 Jul 24 '23 at 15:08
Can you share exactly what you ran, along with the output, and add it to the question? – asktyagi Jul 24 '23 at 15:18
Output added. Hope it helps. – dj423 Jul 24 '23 at 15:32
I mean share the complete command and its output that gives you "sed: -e expression #1, char 5: unknown command: `2'". – asktyagi Jul 24 '23 at 15:49
