My problem is in finding the best-practice to resolve the issue of including all requested timezone-aware dates in a selected range. Depending on how I approach it, either the enddate or the startdate fail to return their date from the csv. I cannot get both to work at the same time without tricking the code (which seems like bad coding practice, and might produce issues if the user is entering dates while browsing from a different timezone locale).
In its simplest form, I am trying to return the complete year of 2021 transaction records from a csv file. The URL call to FastAPI to run the code is this:
When the user enters the year parameter in the URL the enddate and startdate get set, then the csv read into a dataframe, the dataframe filtered, and a json returned. In the following code snippet, 'frame' is the dataframe from pd.read_csv code that is not shown:
import datetime as dt
import pandas as pd
from pytz import timezone
import pytz
import json
from fastapi import FastAPI, Response
startdate = year +'-01-01'
enddate = year + '-12-31'
startdate = dt.datetime.strptime(startdate, '%Y-%m-%d') #convert string to a date
startdate = startdate.replace (tzinfo=pytz.utc) #convert date to UTC and make it timezone-aware
enddate = dt.datetime.strptime(enddate, '%Y-%m-%d')
enddate = enddate.replace (tzinfo=pytz.utc)
frame['Date'] = pd.to_datetime(frame['Date'], format='%Y-%m-%d') #turns 'Date' from object into datetime64
frame['Date'] = frame['Date'].dt.tz_convert('UTC') #converts csv 'Date' column to UTC from AEST(+10)
selectdf = frame.loc[frame['Date'].between(startdate, enddate, inclusive=True)] #filters time period
return Response(selectdf.to_json(orient="records"), media_type="application/json")
The csv file containing the data has 'Date' column in AEST timezone (i.e. UTC+10).
In my code I convert everything to UTC and then do the comparison, but with the above code the date for 1st Jan 2021 is not returned, the 2nd Jan is. What is the right way to solve this, I have tried every configuration of timezone changes and so far nothing returned both 1st Jan 2021 and 31st Dec 2021 at the same time.
Sample data:
the csv file from 'Date' Column :
Date 2021-12-31T01:35:59+1000 2021-12-31T01:35:59+1000 2021-12-31T01:09:57+1000 2021-12-31T01:09:57+1000 2021-12-30T03:02:25+1000 2021-12-30T01:52:58+1000 ... 2021-01-02T00:48:29+1000 2021-01-01T02:40:03+1000 2021-01-01T00:30:00+1000 2021-01-01T00:30:00+1000
There is some confusion about the issue:
to clarify the problem, the above code will return the following epoch time (startdate) as the first record: 1609512509000 and the following epoch time (enddate) as the date of the last record: 1640878559000
For me it is translating that as 2nd Jan (I am in AEST timezone in my browser) and 31st Dec respectively, and so the json returned from the above csv data is 2nd Jan to 31st Dec, thus incorrect.
If you run it in your browser it will likely return those epoch dates records relevant to your timezone. This is my problem.