This question is hinted at in this one, but the answer there doesn't answer this question at all, and I've found conflicting suggestions and hints scattered around.
My problem is relatively simple, but digging into it, I'm getting a bit tripped up.
Suppose I have a string in a format like this: 2023-06-07 03:04:56 -0700
The goal is to normalize this into an epoch timestamp (time_t in C). I assumed this would be simple enough, but it seems not. The gotcha here seems to be the -0700 at the end.
It seems that strptime(3) may ignore the %z modifier (again, I've found conflicting reports about how it is handled in different implementations). FWIW, I'm using Linux/glibc, so I care more about whether it works there than about whether it's in the C standard.
Playing around with it a little, it seemed to me that strptime does ignore the timezone offset. The hour in the struct tm is simply the hour in the string; it isn't adjusted based on the timezone offset at all. Supposedly that's what the non-standard tm_gmtoff member is for, but I just get a gigantic value when reading it, definitely much larger than any UTC offset in seconds, so I'm not sure what to make of that either.
As an example:
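#define _GNU_SOURCE /* glibc: declares strptime, timegm, setenv and tm_gmtoff */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

int main(void)
{
    struct tm tm;
    time_t epoch;
    char buf[40];

    strcpy(buf, "2023-06-07 03:04:56 -0700");
    memset(&tm, 0, sizeof(tm));
    strptime(buf, "%Y-%m-%d %H:%M:%S %z", &tm);
    /* tm_gmtoff is a long; it is printed with %lu here, as in the run shown below */
    printf("Parsed datetime %s (hour %d, offset %lu)\n", buf, tm.tm_hour, (unsigned long)tm.tm_gmtoff);

    tm.tm_isdst = -1;
    setenv("TZ", "US/Eastern", 1); /* third argument: overwrite any existing TZ */

    epoch = mktime(&tm); /* interprets the fields as local (Eastern) time */
    printf("Parsed datetime -> epoch %lu\n", (unsigned long)epoch); // 7:04AM UTC

    epoch = timegm(&tm); /* interprets the fields as UTC */
    printf("Parsed datetime -> epoch %lu\n", (unsigned long)epoch); // 3:04AM UTC
    return 0;
}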
When run on https://www.onlinegdb.com/online_c_compiler, this gives:
Parsed datetime 2023-06-07 03:04:56 -0700 (hour 3, offset 18446744073709526416)
Parsed datetime -> epoch 1686121496
Parsed datetime -> epoch 1686107096
Note that the -0700 offset in the string is arbitrary, and the local time zone on the system is also arbitrary. For example, -0700 is Pacific Time, but the system could be in Eastern Time, which is completely irrelevant to the problem: the conversion should use the offset given in the string, not the local time zone, and importantly, the local time zone should not mess up the answer.
Above, the correct answer is 10:04AM UTC (what the string obviously should convert to). Blindly using mktime gives the wrong answer (7:04AM UTC), and timegm is even further off (3:04AM UTC). The problem seems to be that the offset is not taken into account here. The second answer, using timegm, would be correct if the struct tm had 7 hours added to it for the offset, or if timegm added 7 hours to the answer based on something in the struct tm, such as tm_gmtoff. But neither of those things seems to happen.
Short of writing a manual function to parse the %z part of the time string and manually add this offset to the time_t, is there a better "builtin" way of doing this with standard functions? (Portability isn't super important here, as long as it works in glibc.) Given that this would seem to be a very common type of conversion, I'm thinking there must be a way to do this properly without manually doing calculations, using gmtime. I thought this was what tm_gmtoff was for, but it seems otherwise. Am I missing something here?