I am working on a CSV file that has a column of dates, but the dtype of this column was just object, so I converted it to datetime. That part went without a flaw: nothing about the data changed except its dtype. But when I write this dataframe to a Parquet file, every single row turns into the same date, and it is not even in the original date format.
The date format in the CSV is "%Y-%m-%d", for example 2011-01-29. These are the last few steps after working on the dataframe:
df_merged_CA["date"] = pd.to_datetime(df_merged_CA["date"], format="%Y-%m-%d")
df_merged_CA.to_parquet("merged1.parquet", compression="gzip", engine="pyarrow")
I checked that the dates were correct after the conversion to datetime, and they were: still in the form 2011-01-29. Then I checked the Parquet file, and every date had turned into something like 43060-07-05.03:00:00.000. I read that the problem might be caused by the timezone, so I changed the datetime conversion to this, but nothing changed:
df_merged_CA["date"] = (
    pd.to_datetime(df_merged_CA["date"], format="%Y-%m-%d")
    .dt.tz_localize("UTC").dt.tz_convert("Europe/Berlin")
)
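One possible explanation for a year around 43000, offered as an assumption and not something confirmed here, is a unit mismatch: pandas stores datetime64[ns] values as nanoseconds since the Unix epoch, and a reader that treats the same integer as microseconds would land tens of thousands of years in the future. The arithmetic can be sketched with the standard library alone; the microsecond misreading is the hypothetical part.

```python
from datetime import datetime, timezone

# The example date from the CSV, as a timezone-aware UTC datetime.
original = datetime(2011, 1, 29, tzinfo=timezone.utc)

# pandas datetime64[ns] stores nanoseconds since the Unix epoch.
epoch_seconds = int(original.timestamp())
epoch_ns = epoch_seconds * 1_000_000_000

# Hypothetical misreading: the same integer is treated as microseconds.
misread_seconds = epoch_ns // 1_000_000

# Python's datetime stops at year 9999, so estimate the year arithmetically.
SECONDS_PER_YEAR = 365.2425 * 86400  # mean Gregorian year
approx_year = 1970 + int(misread_seconds / SECONDS_PER_YEAR)

print(approx_year)
```

The result falls in the 43000s, the same range as the dates seen in the Parquet file, which would also explain why changing the timezone made no difference.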