
The goal is to convert a date in the format %Y-%m-%d %H:%M:%S %z to seconds since the Epoch using only POSIX tools.

I currently have a problem with the formula from the POSIX definition of Seconds Since the Epoch: it gives me a different result than GNU/BSD date.

Here's my code, simplified for testing purposes:

#!/bin/bash
  
TZ=UTC date $'+  date: %Y-%m-%d %T %z \nexpect: %s \n %Y %j %H %M %S %z' |
awk '
    NR <= 2 { print; next }
    {
        # $0: "YYYY jjj HH MM SS zzzzz"
        $1 -= 1900
        epoch = $5 + $4*60 + $3*3600 + int($2 + ($1-70)*365 + ($1-69)/4 - ($1-1)/100 + ($1+299)/400)*86400
        print "result:", epoch
    }
'
  date: 2022-10-21 22:02:56 +0000 
expect: 1666389776 
result: 1666476176

What am I doing wrong?

Fravadona
  • I noted that the two epochs differ by exactly one day's worth of seconds (86400), which means you can simply change `$2` to `$2 - 1` and leave everything else the same, and it should return the correct value – RARE Kpop Manifesto Oct 26 '22 at 09:28

2 Answers


By reading the specification thoroughly I found two problems:

  • The tm_yday value should be the output of date +%j minus one, because the specification defines it as:

days since January 1 of the year

  • Integer arithmetic is needed:

The divisions in the formula are integer divisions
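To see why the second point matters, here is a minimal sketch that evaluates only the leap-day part of the formula in two ways: truncating once after summing real-valued quotients (as in the question's code) and truncating each quotient separately (as the specification asks). The helper name leap_terms is hypothetical and only serves this illustration.

```shell
#!/bin/sh
# leap_terms YEAR: hypothetical helper, prints "YEAR loose strict" where
#   loose  = leap-day count truncated once, after adding real-valued quotients
#   strict = leap-day count with every division truncated on its own
leap_terms() {
    awk -v year="$1" 'BEGIN {
        tm_year = year - 1900
        loose  = int((tm_year-69)/4 - (tm_year-1)/100 + (tm_year+299)/400)
        strict = int((tm_year-69)/4) - int((tm_year-1)/100) + int((tm_year+299)/400)
        print year, loose, strict
    }'
}

leap_terms 2022   # prints: 2022 13 13
leap_terms 2025   # prints: 2025 13 14
```

For 2022 both variants agree, so the wrong result in the question comes entirely from tm_yday. For 2025 the single truncation loses one leap day, which would shift the result by 86400 seconds.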

The following code now works; I also added optional support for time zones in %z format:

#!/bin/bash

date $'+  date: %Y-%m-%d %T %z \nexpect: %s \n %Y %j %H %M %S %z' |
awk '
    NR <= 2 { print; next }
    {
        # $0: "YYYY jjj HH MM SS zzzzz"
        tm_year = $1 - 1900
        tm_yday = $2 - 1
        tm_hour = $3
        tm_min  = $4
        tm_sec  = $5
        zoffset = \
            int(substr($6,1,1) "1") * \
            (substr($6,2,2)*3600 + substr($6,4,2)*60)

        epoch = \
            tm_sec + tm_min*60 + tm_hour*3600 + tm_yday*86400 + \
            (tm_year-70)*31536000 + int((tm_year-69)/4)*86400 - \
            int((tm_year-1)/100)*86400 + int((tm_year+299)/400)*86400 - \
            zoffset

        print "result:", epoch
    }
'
  date: 2022-10-22 01:19:23 +0200 
expect: 1666394363 
result: 1666394363

ASIDE

Here's a sample implementation, for POSIX awk, of a function in the spirit of GNU awk's mktime. It converts a date in the format %Y %m %d %H %M %S %z to seconds since the Epoch. When %z isn't provided, it assumes that the time zone is UTC (i.e. +0000):

function posix_mktime(datespec,    a,tm_year,tm_yday,tm_hour,tm_min,tm_sec,tm_zoff) {

    # One-time initialization during the first execution
    if ( !(12 in _MKTIME_monthToDaysSinceJan1st) )
        split("0 31 59 90 120 151 181 212 243 273 304 334", _MKTIME_monthToDaysSinceJan1st, " ")

    # datespec: "YYYY mm dd HH MM SS zzzzz"
    split(datespec, a, " ")

    tm_year = a[1] - 1900
    tm_yday = _MKTIME_monthToDaysSinceJan1st[a[2]+0] + a[3] - 1
    if (a[2] > 2 && ((a[1]%4 == 0 && a[1]%100 != 0) || a[1]%400 == 0))
        tm_yday += 1
    tm_hour = a[4]
    tm_min  = a[5]
    tm_sec  = a[6]
    tm_zoff = (substr(a[7],1,1) "1") * (substr(a[7],2,2)*3600 + substr(a[7],4,2)*60)

    return tm_sec + tm_min*60 + tm_hour*3600 + tm_yday*86400 \
         + (tm_year-70)*31536000 + int((tm_year-69)/4)*86400 \
         - int((tm_year-1)/100)*86400 + int((tm_year+299)/400)*86400 \
         - tm_zoff
}
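A possible way to call the function from a shell script is sketched below. The wrapper name to_epoch is hypothetical, the awk function is a condensed copy of posix_mktime (a local array replaces the global lookup table), and printf "%.0f" is used instead of print only as a precaution against awk implementations that might format large integers in exponent notation.

```shell
#!/bin/sh
# to_epoch "YYYY mm dd HH MM SS [+-]hhmm": hypothetical wrapper that feeds one
# date specification to a condensed copy of posix_mktime and prints the result.
to_epoch() {
    printf '%s\n' "$1" | awk '
    function posix_mktime(datespec,    a,cum,tm_year,tm_yday,tm_zoff) {
        # cum[m]: days elapsed before the first day of month m in a common year
        split("0 31 59 90 120 151 181 212 243 273 304 334", cum, " ")
        split(datespec, a, " ")
        tm_year = a[1] - 1900
        tm_yday = cum[a[2]+0] + a[3] - 1
        # after February of a leap year, one more day has elapsed
        if (a[2] > 2 && ((a[1]%4 == 0 && a[1]%100 != 0) || a[1]%400 == 0))
            tm_yday += 1
        # a missing zone field yields an offset of 0, i.e. UTC
        tm_zoff = (substr(a[7],1,1) "1") * (substr(a[7],2,2)*3600 + substr(a[7],4,2)*60)
        return a[6] + a[5]*60 + a[4]*3600 + tm_yday*86400 \
             + (tm_year-70)*31536000 + int((tm_year-69)/4)*86400 \
             - int((tm_year-1)/100)*86400 + int((tm_year+299)/400)*86400 \
             - tm_zoff
    }
    { printf "%.0f\n", posix_mktime($0) }
    '
}

to_epoch '1970 01 01 00 00 00'         # prints: 0
to_epoch '2022 10 22 01 19 23 +0200'   # prints: 1666394363
to_epoch '2024 03 01 00 00 00'         # prints: 1709251200
```

The second call reproduces the "expect" value from the sample run above; the third exercises the leap-year branch.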
Fravadona

UPDATE 2

It turns out that the formula can be simplified much further, down to just:

For any January, return

    31

otherwise, return

    floor( 365 * month-number / 12 )
       + ( year is leap )
       - ( month is before July )
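As a readable illustration, the simplified formula can be sketched in POSIX awk like this. The wrapper name month_end_yday is hypothetical, and the leap-year test assumes the usual Gregorian rules.

```shell
#!/bin/sh
# month_end_yday YEAR MONTH: hypothetical helper, prints the cumulative number
# of days from January 1 up to and including the last day of MONTH.
month_end_yday() {
    awk -v y="$1" -v m="$2" 'BEGIN {
        y += 0; m += 0                      # force numeric values
        leap = (y % 4 == 0 && y % 100 != 0) || y % 400 == 0
        if (m == 1)
            print 31                        # January is the special case
        else
            print int(365 * m / 12) + leap - (m < 7)
    }'
}

month_end_yday 2023 2    # prints: 59
month_end_yday 2024 2    # prints: 60
month_end_yday 2000 7    # prints: 213
month_end_yday 1900 12   # prints: 365
```

In awk a comparison evaluates to 1 or 0, so leap and (m < 7) can be added and subtracted directly.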

UPDATE 1

@Fravadona : I found a much cleaner formula to get the month-end cumulative julian/ordinal days:

  • for any Jan

    return 31
    
  • otherwise, assign floor( 365 * month-number / 12 ) into => days_j

    • for leap-year months after June,

       return days_j + 1
      
    • for common-year months before July,

       return days_j - 1
      
    • otherwise,

       return days_j
      

NOTE: the 365 in the formula applies to all years; don't change it to 366 for leap years.


@Fravadona : you don't need to hard-code the cumulative julian days at each month-end:

function ____(__,___,_)
{
       #  __|   mm:
       # ___| yyyy:
       #    |--> cumulative julian days by month-end
       return \
       ((__=int(__))<(_+=_^=_<_) ? !_ : -_+(___=="" ||
       (___=int(___)) % (_+_)    ? !_ : \
       ! (___%((_+_)*((_+_*_*_)^_)^!(___%(_+_*_*_)^_)))))\
        \
       + (++_+_^_--)*__\
          \
       + substr((_=((--_+(++_+_*_*_)^_)^_+_)*(_^++_+_)) \
             (_+(((_+=_^=!_)*++_)^++_*++_*(_-+-++_+_*++_)-+--_)),__,_^(_<_))
}

Assuming pre-Gregorian years follow the same leap-year rules, and assuming the month input is in [1, 12] and the year input is in [1, 2^53), this function alone should return the exact cumulative number of julian days up to the end of the month, inclusive, for any year-month combination

(missing year-input defaults to being treated as non-leap year):

1900  1  31  2000  1  31  2002  1  31  2023  1  31  2024  1  31
1900  2  59  2000  2  60  2002  2  59  2023  2  59  2024  2  60
1900  3  90  2000  3  91  2002  3  90  2023  3  90  2024  3  91
1900  4 120  2000  4 121  2002  4 120  2023  4 120  2024  4 121

1900  5 151  2000  5 152  2002  5 151  2023  5 151  2024  5 152
1900  6 181  2000  6 182  2002  6 181  2023  6 181  2024  6 182
1900  7 212  2000  7 213  2002  7 212  2023  7 212  2024  7 213
1900  8 243  2000  8 244  2002  8 243  2023  8 243  2024  8 244

1900  9 273  2000  9 274  2002  9 273  2023  9 273  2024  9 274
1900 10 304  2000 10 305  2002 10 304  2023 10 304  2024 10 305
1900 11 334  2000 11 335  2002 11 334  2023 11 334  2024 11 335
1900 12 365  2000 12 366  2002 12 365  2023 12 365  2024 12 366
RARE Kpop Manifesto
  • Nice, I didn't know that you could calculate it; the method I use is to store the default cumulative julian days by month **start** then check if the year is a leap one or not – Fravadona Oct 22 '22 at 08:11
  • @Fravadona : you could "calculate" it by dynamically generating a reference number/string that records how far the months are in excess of a constant rate of 30 days / month, which is what that final part about `substr( )` is doing, but yes, it's not really math per se. E.g. for a non-leap year, the ***month-end*** numbers are 31, 59, 90, 120, so March and April would have zero excess, while Jan vs. Feb's excess could be done as: `30*(month #) + (-1)^(month==Feb)`, then expand along the same high-level concept – RARE Kpop Manifesto Oct 22 '22 at 11:45
  • @Fravadona : the same first 4 months for a leap year would be: `30*(month #) + (month != Feb)`, since `Jan` `March` `April` all have `+1` excess etc – RARE Kpop Manifesto Oct 22 '22 at 11:48