8

We have green-zone logic where a job may run only from the first Sunday of the month through the following Saturday, i.e. the 7 days starting on the first Sunday of every month. I'm using the awk command below to list those days, but it breaks somewhere. For now I'm only trying the first 3 months, i.e. Jan to March:

seq 75  | awk ' BEGIN {ti=" 0 0 0"} 
function dtf(fmt,dy) { return strftime(fmt,mktime("2020 1 " dy ti)) } 
{ day=dtf("%A %F",$0);mm=dtf("%m",$0);if(day~/Sunday/ || a[mm]) a[mm]++ ; if(a[mm]<8) print day }  '

My output is below, which is incorrect:

Wednesday 2020-01-01
Thursday 2020-01-02
Friday 2020-01-03
Saturday 2020-01-04
Sunday 2020-01-05
Monday 2020-01-06
Tuesday 2020-01-07
Wednesday 2020-01-08
Thursday 2020-01-09
Friday 2020-01-10
Saturday 2020-01-11
Saturday 2020-02-01
Sunday 2020-02-02
Monday 2020-02-03
Tuesday 2020-02-04
Wednesday 2020-02-05
Thursday 2020-02-06
Friday 2020-02-07
Saturday 2020-02-08
Sunday 2020-03-01
Monday 2020-03-02
Tuesday 2020-03-03
Wednesday 2020-03-04
Thursday 2020-03-05
Friday 2020-03-06
Saturday 2020-03-07

Expected output:

Sunday 2020-01-05
Monday 2020-01-06
Tuesday 2020-01-07
Wednesday 2020-01-08
Thursday 2020-01-09
Friday 2020-01-10
Saturday 2020-01-11
Sunday 2020-02-02
Monday 2020-02-03
Tuesday 2020-02-04
Wednesday 2020-02-05
Thursday 2020-02-06
Friday 2020-02-07
Saturday 2020-02-08
Sunday 2020-03-01
Monday 2020-03-02
Tuesday 2020-03-03
Wednesday 2020-03-04
Thursday 2020-03-05
Friday 2020-03-06
Saturday 2020-03-07

How can I adjust the awk command to get the expected output? Any other solutions using other bash tools are also welcome.

TylerH
stack0114106

7 Answers

6

I suggest the following alternative to awk:

#! /usr/bin/env bash
for month in {01..03}; do
  for day in {01..13}; do
    date -d "2020-$month-$day" '+%A %F'
  done |
  grep -A6 -m1 -F Sunday
done

The script is not very efficient, but it does the job. For each month, we print the first 13 days of that month. The green zone must fall within that range, because the first Sunday is at most the 7th and the zone ends six days later, so the remaining days of the month are not needed.
The date format is Weekday YYYY-MM-DD. We use grep to find the first Sunday, print it together with the 6 lines that follow it (-A6), and stop reading because the search is limited to one match (-m1).
This procedure is repeated for each of the months 1 to 3.
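
One caveat with this pipeline is that `%A` prints the weekday name in the current locale, so `grep -F Sunday` finds nothing under a non-English locale. A minimal sketch that pins the locale and wraps the loop in a function follows. The name `green_zone` and its year/month parameters are assumptions made for illustration, and `seq -w` stands in for the brace expansion so that the sketch also runs in shells without `{01..13}`. It still assumes GNU `date` and GNU `grep`.

```shell
# Hypothetical helper: print the 7 green-zone days of one month.
# Usage: green_zone YEAR MONTH   (e.g. green_zone 2020 02)
green_zone() {
  local year=$1 month=$2 day
  for day in $(seq -w 1 13); do
    # LC_ALL=C forces English weekday names regardless of the user's locale
    LC_ALL=C date -d "$year-$month-$day" '+%A %F'
  done |
  # first Sunday plus the 6 lines after it, then stop
  grep -A6 -m1 -F Sunday
}

green_zone 2020 02
```

Calling it once per month, e.g. `for m in 01 02 03; do green_zone 2020 "$m"; done`, should reproduce the expected output from the question.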

Socowi
4

Here's a simple way to get GNU awk to create a list of dates and day names for any given year:

$ cat tst.awk
BEGIN {
    year = (year == "" ? 2020 : year)
    beg = mktime(year " 1 1 12 0 0")
    for (i=0; i<=400; i++) {
        dateday = strftime("%F %A", beg+24*60*60*i)
        split(dateday,d,/[ -]/)
        if ( d[1] != year ) {
            break
        }
        print d[1], d[2], d[3], d[4]
    }
}


$ awk -f tst.awk | head -20
2020 01 01 Wednesday
2020 01 02 Thursday
2020 01 03 Friday
2020 01 04 Saturday
2020 01 05 Sunday
2020 01 06 Monday
2020 01 07 Tuesday
2020 01 08 Wednesday
2020 01 09 Thursday
2020 01 10 Friday
2020 01 11 Saturday
2020 01 12 Sunday
2020 01 13 Monday
2020 01 14 Tuesday
2020 01 15 Wednesday
2020 01 16 Thursday
2020 01 17 Friday
2020 01 18 Saturday
2020 01 19 Sunday
2020 01 20 Monday

I start at noon and loop over 0 to 400 days, breaking when the year changes, so that I don't have to account for DST, leap years, or leap seconds when working out how many days the year has.

Just add some code to test for the current month being different from the previous and the current day name being a Sunday and print 7 days starting there, e.g.:

$ cat tst.awk
BEGIN {
    year = (year == "" ? 2020 : year)
    beg = mktime(year " 1 1 12 0 0")
    for (i=0; i<=400; i++) {
        dateday = strftime("%F %A", beg+24*60*60*i)
        split(dateday,d,/[ -]/)
        if ( d[1] != year ) {
            break
        }
        dayName[d[2]+0][d[3]+0] = d[4]
    }
    for (monthNr=1; monthNr<=3; monthNr++) {
        for (dayNr=1; dayNr in dayName[monthNr]; dayNr++) {
            if (dayName[monthNr][dayNr] == "Sunday") {
                for (i=0; i<7; i++) {
                    printf "%s %04d-%02d-%02d\n", dayName[monthNr][dayNr+i], year, monthNr, dayNr+i
                }
                break
            }
        }
    }
}


$ awk -f tst.awk
Sunday 2020-01-05
Monday 2020-01-06
Tuesday 2020-01-07
Wednesday 2020-01-08
Thursday 2020-01-09
Friday 2020-01-10
Saturday 2020-01-11
Sunday 2020-02-02
Monday 2020-02-03
Tuesday 2020-02-04
Wednesday 2020-02-05
Thursday 2020-02-06
Friday 2020-02-07
Saturday 2020-02-08
Sunday 2020-03-01
Monday 2020-03-02
Tuesday 2020-03-03
Wednesday 2020-03-04
Thursday 2020-03-05
Friday 2020-03-06
Saturday 2020-03-07

There are slightly more efficient ways to do it but the above is clear and simple and will run in the blink of an eye.

Ed Morton
3

Try this

for i in $(seq 12); do  cal ${i} 2020 | awk -v month=${i} 'NF==7 && !/^Su/{ for (j=0;j<=6;j++){print "2020-"month"-"$1+j;}exit}'; done

EDIT: Updated the code to print the day name as well:

for i in $(seq 2); do  cal ${i} 2020 | awk -v month=${i} 'NF==7 && !/^Su/{for (j=0;j<=6;j++){print strftime("%A %F", mktime("2020 " month " " $1+j " 0 0 0"))}exit}'; done;

Demo for Jan and Feb

$ for i in $(seq 2); do  cal ${i} 2020 | awk -v month=${i} 'NF==7 && !/^Su/{a[0]="Sunday";a[1]="Monday";a[2]="Tuesday";a[3]="Wednesday";a[4]="Thursday";a[5]="Friday";a[6]="Saturday";for (j=0;j<=6;j++){print a[j]" " "2020-"month"-"$1+j}exit}'; done;
Sunday 2020-1-5
Monday 2020-1-6
Tuesday 2020-1-7
Wednesday 2020-1-8
Thursday 2020-1-9
Friday 2020-1-10
Saturday 2020-1-11
Sunday 2020-2-2
Monday 2020-2-3
Tuesday 2020-2-4
Wednesday 2020-2-5
Thursday 2020-2-6
Friday 2020-2-7
Saturday 2020-2-8
$
Digvijay S
  • this also works.. can you wrap it with mktime() function. – stack0114106 May 05 '20 at 07:51
  • @Socowi Added functionality to print `Day` . Also, Demo was a typo error. Please check now – Digvijay S May 05 '20 at 08:01
  • @stack0114106 Check my edit. What functionality you need with `mktime() ` ? – Digvijay S May 05 '20 at 08:02
  • replace your print statement with this ````print strftime("%A %F", mktime("2020 " month " " $1+j " 0 0 0"))```` no need for assoc array "a" – stack0114106 May 05 '20 at 08:06
  • The benefit of this approach is that you don't need GNU `date` or GNU awk (for strftime() and mktime()). @DigvijayS you should always start awk arrays at `1`, not `0` since that's how all awk-provided strings, arrays, and fields are numbered. Given that you can do `split("Monday Tuesday ... Saturday",a)` instead of manually typing out `a[1]="Monday"; a[2]="Tuesday"; ... a[7]="Saturday"`. If 2-letter abbreviations for the day names were acceptable you could just do `NR==2{split($0,a)}`. – Ed Morton May 05 '20 at 18:30
  • @stack0114106 Thank you. Updated the code. @Ed Thank you. Will initialize arrays from 1 henceforth. – Digvijay S May 06 '20 at 03:10
3

A (rather wordy - I don't have time to make it shorter:-) ) Perl solution:

#!/usr/bin/perl

use strict;
use warnings;
use feature 'say';

use Time::Piece;
use Time::Seconds;

my $year = shift || localtime->year;

first_week($year, $_) for 1 ..12;


sub first_week {
  my ($yr, $mn) = @_;

  $mn = sprintf '%02d', $mn;

  # Use midday to avoid DST issues
  my $start = Time::Piece->strptime(
    "$yr-$mn-01 12:00:00",
    '%Y-%m-%d %H:%M:%S'
  );

  $start += ONE_DAY while $start->day ne 'Sun';

  for (1 .. 7) {
    say $start->strftime('%A %Y-%m-%d');
    $start += ONE_DAY;
  }
}
Dave Cross
  • @stack0114106: It's because the `ONE_DAY` constant is exactly 24 hours and not all days are 24 hours long (because of DST changes). Starting from midday minimises the chances of that issue moving you to the wrong day. – Dave Cross May 05 '20 at 08:02
  • thank you for considering all possible scenarios and making it robust. – stack0114106 May 05 '20 at 08:08
3

With Perl, using DateTime

use warnings;
use strict;
use feature 'say';

use DateTime;

my $dt = DateTime->new(year => 2020, month => 1, day => 1); 

my $first_sunday = 7 - $dt->day_of_week + 1;  # day of month for first Sun

while (1) { 
    my $day = $dt->day; 
    if ($day >= $first_sunday and $day < $first_sunday + 7) { 
        say $dt->ymd, " (", $dt->day_abbr, ")";
    }
} 
continue { 
    $dt->add(days => 1); 
    if ($dt->day == 1) {  # new month
        last if $dt->month > 3;
        $first_sunday = 7 - $dt->day_of_week + 1;
    }   
}

This keeps state (on the first of each month it finds out which day the first Sunday falls on), which is quite suitable if the program is meant to generate and go through all dates in the span of interest.

On the other hand, the program may need to check a single given day; perhaps it runs daily and needs to check that day. Then it is simpler to see whether the day falls between the first and second Sunday of the month:

my $dt = DateTime->today;

while ( $dt->add(days => 1)->month <= 3) {

    if ($dt->day_of_week == 7) {           # it's a Sunday
        if ($dt->weekday_of_month == 1) {  # first Sunday in the month
            say $dt->ymd, " (", $dt->day_abbr, ")";
        }
    } 
    else {
        my $sdt = $dt->clone;                # preserve $dt
        $sdt->subtract( days => $dt->day_of_week );  # drop to previous Sunday
        if ($sdt->weekday_of_month == 1) {   # was first Sunday in the month
            say $dt->ymd, " (", $dt->day_abbr, ")";
        }
    }
}

The while loop around the code is only there to make the check easy to try out over a range of dates.

For days other than Sunday we drop to the past Sunday, to check whether that was the first Sunday in the month. If so, then our day is within the required interval. If the day is a Sunday we only need to check whether it is the first one in the month.

The code can be made a bit more efficient and concise if that matters

if ( (my $dow = $dt->day_of_week) == 7) { 
    if ($dt->weekday_of_month == 1) {
        say $dt->ymd, " (", $dt->day_abbr, ")";
    }
}   
elsif ( $dt->clone->subtract(days => $dow)->weekday_of_month == 1 ) { 
    say $dt->ymd, " (", $dt->day_abbr, ")";
}

... at the expense of readability.

zdim
2
$ printf "%s\n" 2020-{01..03}-01                 \
  | xargs -I{} date -d "{}" "+{} %u"             \
  | join -j3 - <(seq 0 6)                        \
  | xargs -n3 sh -c 'date -d "$1 + 7 days - $2 days + $3 days" "+%A %F"' --

There is some nasty stuff in here, but I'll try to explain. The idea is to compute the day of the week of the first day of the month (call it u, where Monday is 1 and Sunday is 7). Once you know that, you know which day is the first Sunday: it is 7-u days later. From that point on you only need to add the next 6 days.

  1. Use brace expansion to generate the months you are interested in
  2. Use xargs to compute the day of the week and output it as YYYY-MM-DD u
  3. For each of these dates, we want to create a list of 7 strings YYYY-MM-DD u d, where d runs from 0 to 6. For this we use a nasty join hack: by telling join to join the two inputs on a non-existent field, we create an outer product.
  4. Use xargs in combination with sh to create a command that accepts 3 arguments and do the computation.
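
The join hack in step 3 can be tried in isolation. The sketch below is an illustration only: the function name `cross_product` and the temporary files are made up for the demo, and it assumes GNU `join` and inputs whose lines have fewer than three fields, so that field 3 is empty on every line and all lines compare equal.

```shell
# Hypothetical helper: pair every line of file $1 with every line of file $2.
# Joining on field 3, which neither input has, makes every pair of lines "match".
cross_product() {
  # join prints the (empty) join field first, so strip the leading blank
  join -j3 "$1" "$2" | sed 's/^ *//'
}

dates=$(mktemp) offsets=$(mktemp)
printf '%s\n' '2020-01-01 3' '2020-02-01 6' > "$dates"
seq 0 2 > "$offsets"
cross_product "$dates" "$offsets"   # one "date u d" line per combination
rm -f "$dates" "$offsets"
```

Each output line then carries the three arguments that the final `xargs -n3` stage consumes.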

This method is now easily expanded to other months and years:

$ printf "%s\n" 20{20..30}-{01..12}-01  | xargs ...

The above looks a bit messy, and you might be more interested in the loop version:

for yyyymm in {2020..2030}-{01..03}; do 
  u=$(date -d "$yyyymm-01" "+%u");
  for ((dd=7-u;dd<14-u;++dd)); do
     date -d "$yyyymm-01 + $dd days" "+%A %F"
  done
done
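
If the job only needs a yes/no answer for one date, the same weekday arithmetic works without generating any list: a date is in the green zone exactly when the Sunday on or before it falls on day 1 to 7 of the month. A minimal sketch, assuming GNU `date`; the function name `in_green_zone` is hypothetical:

```shell
# Hypothetical guard: succeed if DATE (YYYY-MM-DD) is in the green zone.
in_green_zone() {
  local d w sunday
  # noon avoids surprises from DST changes around midnight
  d=$(date -d "$1 12:00" +%e) || return 2   # day of month, space padded (no octal trap)
  w=$(date -d "$1 12:00" +%w) || return 2   # weekday, 0 = Sunday .. 6 = Saturday
  sunday=$(( $d - $w ))                     # day of month of the Sunday on or before DATE
  [ "$sunday" -ge 1 ] && [ "$sunday" -le 7 ]
}

in_green_zone 2020-02-08 && echo "2020-02-08: run the job"
in_green_zone 2020-02-09 || echo "2020-02-09: skip"
```

A scheduled job could call the guard with `$(date +%F)` and exit early when it fails.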

Previous solution:

This is for the first 3 months of 2020:

$ printf "%s\n" 2020-{01..03}-{01..13}           \
  | xargs -I{} date -d '{}' '+%A %F'             \
  | awk -F"[- ]" '/Sun/{a[$3]++} a[$3]==1'

This is for the years 2020 through 2030:

$ printf "%s\n" 20{20..30}-{01..12}-{01..13}     \
  | xargs -I{} date -d '{}' '+%A %F'             \
  | awk -F"[- ]" '/Sun/{a[$2,$3]++} a[$2,$3]==1'

This is understood in 3 steps:

  1. Use brace expansion to create a list of the first 13 days of the months and years you are interested in. This works nicely because bash expands the braces from left to right, which means that the day is the fastest-running index. We ask for the first 13 days because the first Sunday must fall within the first 7 days, so its week ends on the 13th at the latest.

  2. Convert the days to the expected format using xargs and date

  3. Use awk to do the filtering.

kvantour
0

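By adding one more condition, a[mm]>0, I was able to make it work. Before the first Sunday of a month a[mm] is still unset, which compares as 0 and therefore passes the a[mm]<8 test, so those leading days were printed. The corrected test is a[mm]<8 && a[mm]>0: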

seq 75  | awk ' 
 BEGIN { ti=" 0 0 0" } 
 function dtf(fmt,dy) { 
  return strftime(fmt,mktime("2020 1 " dy ti)) 
 } 
{  day=dtf("%A %F",$0);
   mm=dtf("%m",$0);
   if(day~/Sunday/ || a[mm]) a[mm]++ ; 
   if(a[mm]<8 && a[mm]>0 )  print day 
 }'

Output:

Sunday 2020-01-05
Monday 2020-01-06
Tuesday 2020-01-07
Wednesday 2020-01-08
Thursday 2020-01-09
Friday 2020-01-10
Saturday 2020-01-11
Sunday 2020-02-02
Monday 2020-02-03
Tuesday 2020-02-04
Wednesday 2020-02-05
Thursday 2020-02-06
Friday 2020-02-07
Saturday 2020-02-08
Sunday 2020-03-01
Monday 2020-03-02
Tuesday 2020-03-03
Wednesday 2020-03-04
Thursday 2020-03-05
Friday 2020-03-06
Saturday 2020-03-07

As an additional note: although I hardcoded 1 for the month, when the day parameter is greater than 31, mktime() simply rolls over into the following months. So, in effect, you can pass the day of the year (often called the Julian day) to mktime() with the month set to 1.

echo -e "1\n31\n32\n60\n61\n366" | awk ' 
 BEGIN { ti=" 0 0 0" } 
 function dtf(fmt,dy) { 
  return strftime(fmt,mktime("2020 1 " dy ti)) 
 } 
{  
   day=dtf("%A %F",$0);
   j=dtf("%j",$0); 
   print j,day 
 }'

Output:

001 Wednesday 2020-01-01
031 Friday 2020-01-31
032 Saturday 2020-02-01
060 Saturday 2020-02-29
061 Sunday 2020-03-01
366 Thursday 2020-12-31
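
For comparison, GNU `date` can do the same day-of-year lookup with relative dates. This is a minimal sketch, not part of the awk solution; the function name `yday_to_date` is made up for illustration.

```shell
# Hypothetical helper: map YEAR and day-of-year N (1-based) to "JJJ Weekday YYYY-MM-DD".
yday_to_date() {
  # day N is N-1 days after January 1st; LC_ALL=C keeps weekday names in English
  LC_ALL=C date -d "$1-01-01 + $(( $2 - 1 )) days" '+%j %A %F'
}

for n in 1 31 32 60 61 366; do
  yday_to_date 2020 "$n"
done
```

The loop feeds in the same six day numbers as the awk example above.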
stack0114106