I have a large database many tables with several million records each and performance is important. For the purpose of this question, let's use a simpler example.
I am having issues performing a JOIN on two tables on timestamps around DST hour. DST happened this year on November 5th at 2:00 AM in timezone America/New_York. Basically, at 2:00 AM, the clocks were set back to 1:00 AM, so 2017-11-05 1:00:00 "happened twice".
Right now, my server is in UTC and I have the following tables:
Table 1:
2017-11-05 04:00:00 1.00
2017-11-05 05:00:00 2.00
2017-11-05 06:00:00 3.00
Table 2:
2017-11-05 04:00:00 4.00
2017-11-05 05:00:00 5.00
2017-11-05 06:00:00 6.00
In these tables, the timestamps are in UTC and of type TIMESTAMP.
I want to create a view as such:
SELECT t1.time, t1.value, t2.value FROM test1 t1 LEFT JOIN test2 t2 ON t1.time = t2.time
Which produces the following output:
2017-11-05 04:00:00 1.00 4.00
2017-11-05 05:00:00 2.00 5.00
2017-11-05 06:00:00 3.00 6.00
This is correct when my server is set to timezone UTC. However, my users are in eastern Canada, in timezone "America/New_York" and because I don't want them to set their user connection timezone everytime they need to access data, I set the server time to "America/New_York". When I perform the same SELECT statement and the server is set to the EST timezone, I get the following results:
2017-11-05 00:00:00 1.00 4.00
2017-11-05 01:00:00 2.00 5.00
2017-11-05 01:00:00 3.00 5.00
2017-11-05 01:00:00 2.00 6.00
2017-11-05 01:00:00 3.00 6.00
Why is the JOIN performed on the result (timezone independant) and not on the actual timestamp with the timezone?
I should see:
2017-11-05 00:00:00 1.00 4.00
2017-11-05 01:00:00 2.00 5.00
2017-11-05 01:00:00 3.00 6.00
With the 3 rows being in EDT, EST and EST respectively, even though I don't need to see that information.
I don't want to use:
SELECT t1.time, t1.value, t2.value FROM test1 t1 LEFT JOIN test2 t2 ON unix_TIMESTAMP(t1.time) = unix_timestamp(t2.time)
This is for performance reasons, as converting to UNIX_TIMESTAMP for millions of records before the JOIN is very expensive and slow.
As I said, I would rather not set the server time to UTC because I don't want to ask my users to set the timezone to America/New_York every time they want to access data in their local time zone.