
How do I parse an RFC 3339 date-time string into any kind of regular DateTime structure? The RFC 3339 date-time format is used in a number of specifications, such as the Atom Syndication Format.

Here is an example of a date-time in Atom (RFC 3339) format:

2005-08-15T15:52:01+04:00
Cœur
Nick Bondarenko

1 Answer


Here is a complete program that parses a string in the format you show into a std::chrono::system_clock::time_point. It is portable across the very latest versions of libc++, libstdc++, and the VS implementation, but it is unfortunately unsatisfactory for the reasons listed after the code.

I could not find the DateTime to which you refer. However, std::chrono::system_clock::time_point is a "DateTime" structure: it is a count of some time duration (seconds, microseconds, nanoseconds, whatever) since some unspecified epoch. You can query std::chrono::system_clock::time_point to find out what its time duration is. As it turns out, every implementation measures time since New Year's 1970, neglecting leap seconds.

#include <chrono>
#include <ctime>     // std::tm
#include <iostream>
#include <iterator>  // std::begin, std::end
#include <limits>
#include <locale>
#include <ratio>
#include <sstream>

template <class Int>
// constexpr
Int
days_from_civil(Int y, unsigned m, unsigned d) noexcept
{
    static_assert(std::numeric_limits<unsigned>::digits >= 18,
             "This algorithm has not been ported to a 16 bit unsigned integer");
    static_assert(std::numeric_limits<Int>::digits >= 20,
             "This algorithm has not been ported to a 16 bit signed integer");
    y -= m <= 2;
    const Int era = (y >= 0 ? y : y-399) / 400;
    const unsigned yoe = static_cast<unsigned>(y - era * 400);      // [0, 399]
    const unsigned doy = (153*(m + (m > 2 ? -3 : 9)) + 2)/5 + d-1;  // [0, 365]
    const unsigned doe = yoe * 365 + yoe/4 - yoe/100 + doy;         // [0, 146096]
    return era * 146097 + static_cast<Int>(doe) - 719468;
}

using days = std::chrono::duration
    <int, std::ratio_multiply<std::ratio<24>, std::chrono::hours::period>>;

namespace std
{

namespace chrono
{

template<class charT, class traits>
std::basic_istream<charT,traits>&
operator >>(std::basic_istream<charT,traits>& is, system_clock::time_point& item)
{
    typename std::basic_istream<charT,traits>::sentry ok(is);
    if (ok)
    {
        std::ios_base::iostate err = std::ios_base::goodbit;
        try
        {
            const std::time_get<charT>& tg = std::use_facet<std::time_get<charT> >
                                                           (is.getloc());
            std::tm t = {};
            const charT pattern[] = "%Y-%m-%dT%H:%M:%S";
            tg.get(is, 0, is, err, &t, begin(pattern), end(pattern)-1);
            if (err == std::ios_base::goodbit)
            {
                charT sign = {};
                is.get(sign);
                err = is.rdstate();
                if (err == std::ios_base::goodbit)
                {
                    if (sign == charT('+') || sign == charT('-'))
                    {
                        std::tm t2 = {};
                        const charT pattern2[] = "%H:%M";
                        tg.get(is, 0, is, err, &t2, begin(pattern2), end(pattern2)-1);
                        if (!(err & std::ios_base::failbit))
                        {
                            auto offset = (sign == charT('+') ? 1 : -1) *
                                          (hours{t2.tm_hour} + minutes{t2.tm_min});
                            item = system_clock::time_point{
                                days{days_from_civil(t.tm_year+1900, t.tm_mon+1,
                                                     t.tm_mday)} +
                                hours{t.tm_hour} + minutes{t.tm_min} + seconds{t.tm_sec} -
                                offset};
                        }
                        else
                        {
                            err |= ios_base::failbit;
                        }
                    }
                    else
                    {
                        err |= ios_base::failbit;
                    }
                }
                else
                {
                    err |= ios_base::failbit;
                }
            }
            else
            {
                err |= ios_base::failbit;
            }
        }
        catch (...)
        {
            err |= std::ios_base::badbit | std::ios_base::failbit;
        }
        is.setstate(err);
    }
    return is;
}

}  // namespace chrono
}  // namespace std

int
main()
{
    std::istringstream infile("2005-08-15T15:52:01+04:00");
    std::chrono::system_clock::time_point tp;
    infile >> tp;
    std::cout << tp.time_since_epoch().count() << '\n';
}

This has been tested against libc++, libstdc++-5.0 and VS-2015 and produces respectively:

1124106721000000
1124106721000000000
11241067210000000

On libc++ this is a count of microseconds since New Year's 1970, neglecting leap seconds. On libstdc++-5.0 it is a count of nanoseconds, and on VS-2015 it is a count of 100-nanosecond units.

The problem with this solution is that it involves inserting a function into the std namespace, which the standard does not, in general, permit user code to do. In the future the C++ committee may decide to insert this same function into the same namespace, which could invalidate your code.

Another problem with this code is that it is horribly complicated. It is a shame that the standard does not provide a simpler solution.

Another problem with this code is that it does not use the simpler "%F", "%T", and "%z" parsing patterns documented in the C standard (though documented as formatting patterns). I experimentally discovered their use was not portable.

Another problem with this code is that it will require gcc-5.0. If you're running gcc-4.9, you're out of luck. You'll have to parse things yourself. I was not able to test VS implementations prior to VS-2015. libc++ should be ok (though even libc++ does not support "%z").

You can convert the std::chrono::system_clock::time_point back into a "broken down" structure if desired via the formulas here. However if that is your ultimate goal, it would be more efficient to modify the code above to parse directly into your "broken down" structure instead of into a std::chrono::system_clock::time_point.

Disclaimer: Only very lightly tested. I'm happy to update this answer with any bug reports.

Update

In the years since I first gave this answer I have written a library which does all the computations above with a far more concise syntax.

#include "date/date.h"
#include <iostream>
#include <sstream>

int
main()
{
    using namespace date;
    std::istringstream infile{"2005-08-15T15:52:01+04:00"};
    sys_seconds tp;  // This is a system_clock time_point with seconds precision
    infile >> parse("%FT%T%Ez", tp);
    std::cout << tp.time_since_epoch() << " is " << tp << '\n';
}

You can find "date.h" here. It is a free, open-source, header-only library. At this link there are also links to full documentation and, for "date.h", even a video tutorial, though the video tutorial was created prior to the implementation of the parse function.

The output of the above program is:

1124106721s is 2005-08-15 11:52:01

which gives both the seconds since epoch (1970-01-01 00:00:00 UTC), and the date/time in UTC (taking the offset into account).

On the off chance that you need to count leap seconds since the epoch, another library is available at this same GitHub link. It is not header-only and requires a small amount of installation, but using it is a simple modification of the program above:

#include "date/tz.h"
#include <iostream>
#include <sstream>

int
main()
{
    using namespace date;
    std::istringstream infile{"2005-08-15T15:52:01+04:00"};
    utc_seconds tp;  // This is a utc_clock time_point with seconds precision
    infile >> parse("%FT%T%Ez", tp);
    std::cout << tp.time_since_epoch() << " is " << tp << '\n';
}

And the output is now:

1124106743s is 2005-08-15 11:52:01

The difference in code is that "tz.h" is now included instead of "date.h", and utc_seconds is parsed instead of sys_seconds. utc_seconds is still a std::chrono::time_point, but now based on a leap-second-aware clock. The program outputs the same date/time, but the number of seconds since the epoch is now 22s greater as this is the number of leap seconds inserted between 1970-01-01 and 2005-08-15.

Howard Hinnant