How to count the number of lines in a text file that start with a date

Question

I have a file with the following content:

2004-10-07     cva        create file ...
2003-11-11     cva        create version ...
2003-11-11     cva        create version ...
2003-11-11     cva        create branch ...

Now I want to count the number of lines in this file that start with a date. How can I do that?

If I use wc -l <file.txt>, it gives me the total number of lines (5 in my case), whereas the count I want is 4.

`grep "[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}" filename | wc -l` should give number of lines for this particular format. However, it can be better with awk. — acsrujan, Feb 08 '17 at 16:30
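One caveat about the pattern in the comment above: it has no `^` anchor, so it also counts lines where a date appears anywhere in the line, not only at the start. A minimal sketch of the difference, using three made-up sample lines (they are an assumption for illustration, not the asker's data):

```shell
# Hypothetical sample: one line starting with a date, one with a date
# in the middle, and one with no date at all.
lines='2004-10-07     cva        create file ...
see ticket from 2003-11-11
no date'

# Without ^ the pattern matches a date anywhere in the line.
unanchored=$(printf '%s\n' "$lines" | grep -c '[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}')

# With ^ only lines that begin with the date are counted.
anchored=$(printf '%s\n' "$lines" | grep -c '^[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}')

printf 'unanchored=%s anchored=%s\n' "$unanchored" "$anchored"
```

Using `grep -c` also removes the need for the extra `wc -l` process.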

Shakiba Moshiri · Answer 1 · 2017-02-08T18:03:01.290

0

A simple way with Perl

Your file:

2004-10-07     cva 
2004-10-04             
anything
2004-10-07     cva 
anything
2004-10-07     cva 
2004-10-07     cva

To print the running count after every input line:
perl -lne '++$n if /^\d+-\d+-\d+/; print $n' your-file

output
1
2
2
3
3
4
5

To count and print only the total:
perl -lne '++$n if /^\d+-\d+-\d+/; END { print $n }' your-file

output
5

With grep -Ec (egrep is the obsolescent spelling of grep -E), count the matching lines directly, without cat:
grep -Ec '^[0-9]+-[0-9]+-[0-9]+' your-file

output
5

edited Feb 08 '17 at 18:03

answered Feb 08 '17 at 17:21

Shakiba Moshiri


How can I enhance this by printing the date with the number in the same line separated by comma "," ? – Khaled Annajar Jul 12 '23 at 15:43

dawg · Accepted Answer · 2023-07-12T16:05:23.013

Given:

$ cat file
2004-10-07     cva        create file ...
no date
2003-11-11     cva        create version ...
no date
2003-11-11     cva        create version ...
no date
2003-11-11     cva        create branch ...

First figure out how to run a regex on each line of the file. sed is a reasonable choice since it is fairly standard and fast; awk, grep, bash, or perl would also work.

Here is a sed solution:

$ sed -nE '/^[12][0-9]{3}-[0-9]{2}-[0-9]{2}/p' file
2004-10-07     cva        create file ...
2003-11-11     cva        create version ...
2003-11-11     cva        create version ...
2003-11-11     cva        create branch ...

Then pipe that to wc:

$ sed -nE '/^[12][0-9]{3}-[0-9]{2}-[0-9]{2}/p' file | wc -l
      4

Or, you can use the same pattern in awk and not need to use wc:

$ awk '/^[12][0-9]{3}-[0-9]{2}-[0-9]{2}/{lc++} END{ print lc }' file
4

Or if you wanted the count of each date:

$ awk '/^[12][0-9]{3}-[0-9]{2}-[0-9]{2}/{cnt[$1]++} END{ for (e in cnt) print e, cnt[e] }' file
2003-11-11 3
2004-10-07 1
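Note that awk's `for (e in cnt)` visits keys in an unspecified order, so the two lines above may come out in either order. If a stable, comma-separated result is wanted (as asked in the comments), a minimal sketch is to emit `date,count` and pipe through `sort`. The temporary file and the variable names `sample` and `per_date` are illustrative assumptions, and the digit classes are spelled out instead of using `{3}` because some older awk implementations do not support interval expressions:

```shell
# Sample data mirroring the example file above, written to a temporary file.
sample=$(mktemp)
printf '%s\n' \
  '2004-10-07     cva        create file ...' \
  'no date' \
  '2003-11-11     cva        create version ...' \
  'no date' \
  '2003-11-11     cva        create version ...' \
  'no date' \
  '2003-11-11     cva        create branch ...' > "$sample"

# Count each leading date, print "date,count", then sort for a stable order.
per_date=$(awk '/^[12][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]/ { cnt[$1]++ }
                END { for (d in cnt) print d "," cnt[d] }' "$sample" | sort)

printf '%s\n' "$per_date"
rm -f "$sample"
```

Redirecting the final `printf` to a file such as `output.csv` would give a two-column CSV.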

Or, same pattern, with grep:

$ grep -cE '^[12][0-9]{3}-[0-9]{2}-[0-9]{2}' file
4

(Note: it is unclear whether your date format is YYYY-MM-DD or YYYY-DD-MM. You can make the pattern more specific if this is known.)
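As a sketch of what "more specific" could look like, assuming the format is YYYY-MM-DD (an assumption, since the question does not say), the month can be restricted to 01-12 and the day to 01-31. This still does not check days per month, so 2004-02-31 would pass. The sample lines and the variable name `strict` are made up for illustration:

```shell
# Assumed format: YYYY-MM-DD. Month limited to 01-12, day to 01-31.
strict='^[12][0-9]{3}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])'

# Hypothetical input: one plausible date, an invalid month, an invalid day,
# and a line without a date.
count=$(printf '%s\n' \
  '2004-10-07     cva' \
  '2004-13-07     cva' \
  '2004-10-32     cva' \
  'no date' | grep -cE "$strict")

printf '%s\n' "$count"
```

Only the first sample line satisfies the stricter pattern, whereas the looser `[0-9]{2}-[0-9]{2}` version would count the first three.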

How can I print the count for each date individually, not the total number of lines that start with a date? — Khaled Annajar, Jul 12 '23 at 15:44
Do you mean that each date is counted? Like so: `2003-11-11 3, 2004-10-07 1`? See edit... — dawg, Jul 12 '23 at 16:03
Yes. I found a command that creates a CSV file with date,count as the result: sed -n 's/\(^[^ ]*\).*/\1/p' history.txt | sort | uniq -c | awk '{print $2","$1}' > output.csv — Khaled Annajar, Jul 16 '23 at 14:11
