I have a currency column in a pandas DataFrame with dtype object (text), holding values like this:
34500 USD
34222 USD
How do I convert it to an integer type that still allows NaN or NA to appear in the column?
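One option is str.strip (note that it removes any leading or trailing U, S, or D characters, not the literal suffix 'USD'):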
carDF['carPrice'] = pd.to_numeric(carDF['carPrice'].str.strip('USD'), errors='coerce', downcast='integer')
Solution:
carDF['carPrice'] = carDF['carPrice'].astype(str).str.replace(' USD','')
carDF['carPrice'] = pd.to_numeric(carDF['carPrice'], errors='coerce', downcast='integer').astype('Int64')
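For reference, here is a minimal end-to-end sketch of the two lines above. The sample rows (including the missing and the unparseable entry) are made up for illustration, and regex=False is my addition so that ' USD' is treated as a literal string; it assumes pandas 0.24 or later.

```python
import pandas as pd

# Hypothetical sample data: two prices, one missing value, one unparseable value.
carDF = pd.DataFrame({"carPrice": ["34500 USD", "34222 USD", None, "unknown"]})

# Remove the currency suffix as a literal string (no regex interpretation).
cleaned = carDF["carPrice"].astype(str).str.replace(" USD", "", regex=False)

# Anything that cannot be parsed becomes NaN (errors="coerce");
# the nullable Int64 dtype (capital "I") then stores those entries as <NA>.
carDF["carPrice"] = pd.to_numeric(cleaned, errors="coerce").astype("Int64")

print(carDF["carPrice"].dtype)         # dtype of the converted column
print(carDF["carPrice"].isna().sum())  # number of missing entries
```

The capital "I" matters here: 'Int64' is the pandas nullable integer type, while 'int64' is the plain NumPy integer type, which cannot hold missing values.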
If the text contains a non-ASCII space (here a non-breaking space, \xa0), which you can only see when you print a single value, like:
carDF['carPrice'][0]
'34500 \xa0USD'
then you have to use:
carDF['carPrice'] = carDF['carPrice'].astype(str).str.replace(u'\xa0USD', '')
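If you are not sure which rows contain the non-breaking space, a slightly more defensive variation (my own sketch, not the exact line above) is to normalize \xa0 to an ordinary space first, then remove 'USD' and trim whatever whitespace is left. The sample rows are made up for illustration.

```python
import pandas as pd

# Hypothetical sample: the first value hides a non-breaking space before "USD".
carDF = pd.DataFrame({"carPrice": ["34500 \xa0USD", "34222 USD", "n/a"]})

cleaned = (
    carDF["carPrice"]
    .astype(str)
    .str.replace("\xa0", " ", regex=False)  # non-breaking space -> ordinary space
    .str.replace("USD", "", regex=False)    # drop the currency code
    .str.strip()                            # trim leftover whitespace
)

# Unparseable entries such as "n/a" become <NA> in the nullable Int64 column.
carDF["carPrice"] = pd.to_numeric(cleaned, errors="coerce").astype("Int64")

print(carDF["carPrice"].tolist())
```

This way the same code handles rows with and without the hidden character.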
as explained here
You need pandas 0.24 or later, which is where the nullable Int64 dtype was introduced. If this doesn't work for you, upgrade to the latest pandas with:
pip install pandas --upgrade