I have a DataFrame column of dtype float64 that is full of NaN values. If I cast it to float64 again, the values get replaced with <NA>, which is not the same thing.
I know that <NA> values are pd.NA, while NaN values are np.nan, so they are different things. So why does casting an already-float64 column to float64 change NaN to <NA>?
Here's an example:
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1.0, 2.0]})
print(df.dtypes)
#output is: float64
df['a'] = np.nan
print(df.dtypes)
# output is float64
print(df)
a
0 NaN
1 NaN
# Now, let's cast that float64 to float64
df['a'] = df['a'].astype(pd.Float64Dtype())
print(df.dtypes)
# output is Float64; notice the uppercase F this time, previously it was lowercase
print(df)
a
0 <NA>
1 <NA>
It seems float64 and Float64 are two different things, and NaN (np.nan) is the null value for float64 while <NA> (pd.NA) is the null value for Float64.
Is this correct? And if so, what's going on under the hood?
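
To make the comparison concrete, here is a minimal sketch that puts the two casts side by side. It assumes a pandas version that ships the nullable Float64 extension dtype (1.2 or later, as far as I know) and its default settings; the names s, same and nullable are only for illustration.

```python
import numpy as np
import pandas as pd

# A NumPy-backed float64 Series holding only NaN
s = pd.Series([np.nan, np.nan])

# Lowercase "float64": the NumPy dtype, so the cast is effectively a no-op
same = s.astype("float64")

# Uppercase "Float64": the pandas nullable extension dtype,
# equivalent to passing pd.Float64Dtype()
nullable = s.astype("Float64")

print(same.dtype)      # float64
print(nullable.dtype)  # Float64

# Both report the values as missing, but they print differently (NaN vs <NA>)
print(same.isna().all(), nullable.isna().all())
print(same)
print(nullable)
```

The dtype string is case-sensitive, so "float64" and "Float64" select different dtypes even though they differ by a single letter.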