2

How can I sort a text file that contains RFC 2822-style dates?

E.g.:

Sat, 1 Aug 2015 01:48:56 +0200
Sat, 1 Aug 2015 01:25:40 +0200
Sun, 19 Jul 2015 14:47:29 -0300
Sat, 13 Sep 2014 12:13:51 -0300

Thanks!

Cyrus
thorax

3 Answers

4

Pass each date through the date command to prefix it with its number of seconds since the epoch, sort numerically on that prefix, and then strip the prefix again:

# Requires GNU date (for --date); reads the dates from standard input.
while IFS= read -r date
do    date --date "$date" +"%s $date"
done |
sort -n -k 1,1 |
sed 's/[^ ]* //'
meuh
  • 2
    +1 With your nice idea and xargs: `xargs -I {} date --date "{}" +"%s {}" < file | sort -n -k 1,1 | cut -d " " -f 2-` – Cyrus Aug 17 '15 at 10:22
  • @Cyrus Wow, a one-liner, and fewer than 80 chars! And faster, of course. Very good; you should post it as an answer. – meuh Aug 17 '15 at 10:26
  • No. It was your nice idea with the second column. Honor to whom honor. – Cyrus Aug 17 '15 at 10:31
  • 1
    A Schwartzian Transform. Nice. – glenn jackman Aug 17 '15 at 12:48
  • I forgot to mention that I would like to run this on an OS X system. I'm trying to adapt the command for OS X: the date command is a bit different there and does not seem to have the --date option. – thorax Aug 17 '15 at 13:07
  • 1
    For Mac OS X, you want `date -jf "%a, %d %b %Y %T %z" +"%s $date" "$date"`. `-j` just outputs the date in the specified format (instead of trying to set the date), and `-f` supplies the format used to parse the given date. – chepner Aug 17 '15 at 13:14
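
For what it's worth, the two date variants discussed in these comments can be folded into one script. This is only a rough sketch: the function names to_epoch and sort_rfc_dates are made up for illustration, and detecting GNU date via `date --version` is an assumption that should hold on typical Linux and OS X systems but is not guaranteed everywhere.

```shell
#!/bin/sh
# Sketch: sort RFC 2822-style dates chronologically with GNU or BSD/OS X date.
# to_epoch and sort_rfc_dates are hypothetical helper names, not from the thread.

to_epoch() {
    if date --version >/dev/null 2>&1; then
        # GNU date: --date parses the string directly.
        date --date "$1" +%s
    else
        # BSD/OS X date: -j avoids setting the clock, -f gives the input format.
        date -jf "%a, %d %b %Y %T %z" +%s "$1"
    fi
}

sort_rfc_dates() {
    while IFS= read -r line; do
        # Skip empty lines instead of passing them to date.
        [ -n "$line" ] || continue
        printf '%s %s\n' "$(to_epoch "$line")" "$line"
    done |
    sort -n -k 1,1 |
    cut -d ' ' -f 2-
}
```

It would be used as `sort_rfc_dates < dates.txt`. Like the loop in the answer, it still starts one date process per line, so it is no faster; it only hides the GNU/BSD difference.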
1

A similar idea to @meuh's, but with a single call to perl instead of one call to date per line:

perl -MTime::Piece -lne '
        push @dates, [Time::Piece->strptime($_, "%a, %e %b %Y %T %z"), $_] 
    } {
        print join "\n", 
              map {$_->[1]} 
              sort {$a->[0] <=> $b->[0]} 
              @dates
' dates.txt
Sat, 13 Sep 2014 12:13:51 -0300
Sun, 19 Jul 2015 14:47:29 -0300
Sat, 1 Aug 2015 01:25:40 +0200
Sat, 1 Aug 2015 01:48:56 +0200
glenn jackman
1

If your version of sort implements the -M option (sort by month name), as OS X sort does, you can sort the strings on the three relevant fields:

# First, sort on the 4th field numerically (year)
# In the same year, sort on the 3rd field by month
# In the same year and month, sort on the 2nd field numerically (day)
sort -k4,4n -k3,3M -k2,2n dates.txt
chepner
  • Thanks! Exactly what I was looking for! :) – thorax Aug 17 '15 at 13:26
  • Beware: the example dates are not all in the same time zone. Even if you are not interested in the time of day, the date can change when an entry is normalised to UTC, for example. – meuh Aug 17 '15 at 13:32
  • Ah, I overlooked that. There's only a small percentage of inputs that would be affected by that, but there's no way to accommodate them with `sort` alone. – chepner Aug 17 '15 at 13:35
  • @thorax I wouldn't accept this answer; I recommend meuh's or glenn jackman's. – chepner Aug 17 '15 at 13:36
  • I agree. Thanks for pointing it out. I've accepted their answer now. – thorax Aug 17 '15 at 14:40
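
To make the time zone caveat from these comments concrete, here is a small sketch that sorts the same two lines both ways. It assumes GNU date (for --date), and the second sample date is invented for illustration; it is not from the question.

```shell
#!/bin/sh
# Sketch: two dates whose written calendar order and chronological order disagree.
# Assumes GNU date (--date). The "Fri, 31 Jul" line is a made-up example.

sample='Sat, 1 Aug 2015 01:25:40 +0200
Fri, 31 Jul 2015 23:30:00 -0300'

# Field-based sort: year, month name, day of month, exactly as written.
by_fields=$(printf '%s\n' "$sample" | LC_ALL=C sort -k4,4n -k3,3M -k2,2n)

# Epoch-based sort: prefix seconds since the epoch, sort, strip the prefix.
by_epoch=$(printf '%s\n' "$sample" |
    while IFS= read -r d; do date --date "$d" +"%s $d"; done |
    sort -n -k 1,1 | cut -d ' ' -f 2-)

printf 'By fields:\n%s\n\nBy epoch:\n%s\n' "$by_fields" "$by_epoch"
```

The field-based sort puts the 31 Jul line first because it only looks at the written date, while the epoch-based sort puts the 1 Aug line first: 01:25:40 +0200 is 23:25:40 UTC on 31 Jul, whereas 23:30:00 -0300 is 02:30:00 UTC on 1 Aug.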