I have a SQL DATETIME field DT that I want to feed into Kafka using Avro. DT is in local time but carries no time zone; the zone is stored in a separate column TZ as an offset from UTC in minutes. An Avro timestamp-millis seems appropriate for DT, but I'm confused about the conversion, given the missing time zone. From looking at the connector source code, I think it will simply default to UTC.
So that raises the questions:
Will all visually identical datetimes convert to the same number of milliseconds since the (or an) epoch, regardless of their nominal time zone? For example, does

(2018-01-01T00:00Z).to_timestamp_ms()
    == (2018-01-01T00:00).to_timestamp_ms()
    == (2018-01-01T00:00+05).to_timestamp_ms()
hold?

More importantly, is it possible to convert to true UTC after ingestion by subtracting the time zone offset TZ from the datetime field DT (which by then is in milliseconds since some epoch)? In other words, if the connector incorrectly assumes UTC and interprets the local datetime as UTC, can the true instant be recovered afterwards by subtracting the offset?
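To make the first question concrete, here is what Python's own datetime does (a sketch of the general behavior; the connector's actual parsing may differ): aware datetimes with different nominal offsets do not convert to the same epoch milliseconds, and a naive datetime has no epoch value of its own until some zone is assumed.

```python
from datetime import datetime, timezone, timedelta

def to_timestamp_ms(dt: datetime) -> int:
    return int(dt.timestamp() * 1000)

utc = datetime(2018, 1, 1, tzinfo=timezone.utc)                    # 2018-01-01T00:00Z
plus5 = datetime(2018, 1, 1, tzinfo=timezone(timedelta(hours=5)))  # 2018-01-01T00:00+05

print(to_timestamp_ms(utc))    # 1514764800000
print(to_timestamp_ms(plus5))  # 1514746800000 -- differs by exactly 5 hours

# A naive datetime only gets an epoch value once a zone is imposed on it;
# Python's .timestamp() would assume the machine's local zone, so the result
# depends entirely on that assumption. Forcing UTC reproduces the first value:
naive = datetime(2018, 1, 1)
assert to_timestamp_ms(naive.replace(tzinfo=timezone.utc)) == to_timestamp_ms(utc)
```

So the answer to the comparison above would be "no": the three expressions differ whenever the nominal offsets differ.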
Details on transformation steps
I think the order of operations on the connector side (largely out of my control) is something like this:
tz = read_field_as_int('tz')
dt = read_field_as_string('dt')
parsed_datetime = datetime(dt, timezone=UTC)  # naive local time treated as UTC
message = {
    'dt': parsed_datetime.to_timestamp_ms(),
    'tz': tz,
}
producer.produce(message)
And then later, in the consumer, maybe this would work?
ms_per_min = 60 * 1000
message = consumer.poll()
true_timestamp = message['dt'] - message['tz'] * ms_per_min
true_dt = datetime.from_timestamp(true_timestamp, timezone=UTC)
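The whole roundtrip can be sketched as runnable Python (the names connector_produce and consumer_recover are mine, not the real connector API, and I'm assuming TZ is defined so that local = UTC + TZ minutes):

```python
from datetime import datetime, timezone, timedelta

MS_PER_MIN = 60 * 1000

def connector_produce(dt_string: str, tz_minutes: int) -> dict:
    # Hypothetical connector behavior: parse the naive local string,
    # then (incorrectly) stamp it as UTC before taking epoch millis.
    naive = datetime.strptime(dt_string, "%Y-%m-%d %H:%M")
    as_utc = naive.replace(tzinfo=timezone.utc)
    return {"dt": int(as_utc.timestamp() * 1000), "tz": tz_minutes}

def consumer_recover(message: dict) -> datetime:
    # Undo the wrong assumption: local = UTC + tz, so true UTC = local - tz.
    true_ms = message["dt"] - message["tz"] * MS_PER_MIN
    return datetime.fromtimestamp(true_ms / 1000, tz=timezone.utc)

# Local midnight at UTC+05:00 (tz = +300 minutes) is really 19:00 UTC the day before.
msg = connector_produce("2018-01-01 00:00", 300)
print(consumer_recover(msg))  # 2017-12-31 19:00:00+00:00
```

Because interpreting a naive local time as UTC is a pure wall-clock relabeling, the error it introduces is exactly the offset, so subtracting TZ * 60 * 1000 milliseconds should recover the true instant.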