I have the following dataset:
import numpy as np
import pandas as pd
import time
the_df = pd.DataFrame(
    {
        "id": [21, 22, 23, 24, 15, 26, 27, 28, 29, 30],
        "amount": [10, 30, 12, 60, 0.5, 0.2, np.nan, 5, np.nan, 3],
        "date": [
            '2022-01-01', '2022-08-01', None, '2022-12-01', '2022-02-01',
            None, '2022-02-01', '2022-01-01', '2022-11-01', '2022-01-01'
        ]
    }
)
the_df
   id  amount        date
0  21    10.0  2022-01-01
1  22    30.0  2022-08-01
2  23    12.0        None
3  24    60.0  2022-12-01
4  15     0.5  2022-02-01
5  26     0.2        None
6  27     NaN  2022-02-01
7  28     5.0  2022-01-01
8  29     NaN  2022-11-01
9  30     3.0  2022-01-01
I want to convert the date column into a float. My approach was to convert the date column into a pandas datetime, and then convert that column into a float using a lambda function. My problem is how to check for NaT so that the conversion is skipped when the value is NaT. This is what I tried:
the_df['date'] = pd.to_datetime(the_df['date'])
the_df['date'] = the_df['date'].\
apply(lambda x: float(time.mktime(x.timetuple())) if x.notnull() else x)
But I get this error:
AttributeError: 'Timestamp' object has no attribute 'notnull'
I tried this second option:
the_df['date'] = pd.to_datetime(the_df['date'])
the_df['date'] = the_df['date'].\
apply(lambda x: float(time.mktime(x.timetuple())) if x is not None else x)
But I get this error:
ValueError: NaTType does not support timetuple
Could you please point out what I am doing wrong? Why, if the value should be skipped when it is None, does timetuple still seem to be called on it?
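For reference, here is a minimal sketch of the kind of check I think I am after. It assumes that after pd.to_datetime the missing entries are pd.NaT rather than None (which would explain why `x is not None` lets them through), and that the scalar function pd.isna can be used to test each element. The small `dates` series is only a stand-in for the real column, and time.mktime interprets the timestamp in the local time zone.

```python
import time

import numpy as np
import pandas as pd

# Stand-in for the_df['date'] after pd.to_datetime: None becomes NaT.
dates = pd.to_datetime(pd.Series(["2022-01-01", None, "2022-11-01"]))

# NaT is its own object, so an identity check against None does not catch it.
nat_is_not_none = pd.NaT is not None  # True

# pd.isna works on scalars and is True for NaT, None and NaN,
# so the conversion is only attempted on real timestamps.
as_float = dates.apply(
    lambda x: np.nan if pd.isna(x) else float(time.mktime(x.timetuple()))
)
```

Returning np.nan for the missing rows keeps the resulting column as float64 instead of mixing floats with NaT.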