I am currently exploring PyPolars and am having difficulty getting columns into the Date32 format in its DataFrame. I have tried the following approaches:
- Conversion from Pandas to PyPolars directly
import pandas as pd
import pypolars as pyp
a = pd.read_csv(*CSV File*)
b = pyp.from_pandas(a)
This raises the following error:
Traceback (most recent call last):
  File "<pyshell#29>", line 1, in <module>
    pyp.from_pandas(a)
  File "C:\Users\*Username*\AppData\Local\Programs\Python\Python37\lib\site-packages\pypolars\functions.py", line 235, in from_pandas
    pl_s = Series(k, s, nullable=True).cast(datatypes.Date64)
  File "C:\Users\*Username*\AppData\Local\Programs\Python\Python37\lib\site-packages\pypolars\series.py", line 783, in cast
    return wrap_s(f())
RuntimeError: Any(ArrowError(ComputeError("Casting from Int32 to Date64 not supported")))
- Converting the DateTime columns to String in Pandas, converting the DataFrame to PyPolars, then converting the Strings back to DateTime in PyPolars
def changeDateTime(value):
    return str(value)
a["ACTUAL_DROP_DATE"] = a["ACTUAL_DROP_DATE"].apply(changeDateTime)
a["ACTUAL_END_DATE"] = a["ACTUAL_END_DATE"].apply(changeDateTime)
b = pyp.from_pandas(a)
def changeStrBack(value):
    # str() of a missing Pandas timestamp is "NaT"
    if value == "NaT":
        return ""
    else:
        year = int(value[0:4])
        month = int(value[5:7])
        day = int(value[8:10])
        return pyp.datetime(year, month, day)
b["ACTUAL_DROP_DATE"] = b["ACTUAL_DROP_DATE"].apply(changeStrBack, dtype_out = pyp.Date32)
b["ACTUAL_END_DATE"] = b["ACTUAL_END_DATE"].apply(changeStrBack, dtype_out = pyp.Date32)
This gives me only null values after the conversion (i.e. both columns are completely null).
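For reference, my understanding is that a Date32 value (as defined by Apache Arrow) is the number of days since 1970-01-01, so the function passed to apply probably needs to return an integer or a missing value rather than an empty string. Below is a minimal sketch of that conversion in plain Python. It does not use any PyPolars API, and `to_days_since_epoch` is a hypothetical helper name; whether PyPolars accepts these integers for `dtype_out = pyp.Date32` is an assumption I have not confirmed.

```python
from datetime import date
from typing import Optional

EPOCH = date(1970, 1, 1)


def to_days_since_epoch(value: str) -> Optional[int]:
    """Parse a 'YYYY-MM-DD...' string into days since 1970-01-01.

    Returns None for missing values instead of an empty string.
    """
    if value in ("", "NaT", "nan", "None"):
        return None
    # str() of a Pandas Timestamp looks like '2020-01-31 00:00:00';
    # the first 10 characters hold the date part.
    parsed = date.fromisoformat(value[:10])
    return (parsed - EPOCH).days


samples = ["2020-01-31 00:00:00", "NaT", "1970-01-02"]
days = [to_days_since_epoch(s) for s in samples]
print(days)  # [18292, None, 1]
```

The point of the sketch is only that missing dates map to None and valid dates map to a day count, which is different from what changeStrBack returns above.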
I hope someone has an idea of how I can get these columns into a date type in PyPolars.
Thank you!