1

I have a currency column in a pandas DataFrame with dtype object (text), with values like this:

34500 USD
34222 USD

How do I convert it to an integer type that allows NaN or NA to appear in the column?

Hrvoje

2 Answers

2

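We can use str.strip to remove the USD suffix (note that it strips any of the characters U, S, and D from both ends of the string, not the literal substring):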

carDF['carPrice'] = pd.to_numeric(carDF['carPrice'].str.strip('USD'), errors='coerce', downcast='integer')
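One thing worth checking with this approach is the resulting dtype. The following is a minimal sketch, assuming a hypothetical sample column that contains one missing value: with errors='coerce' the missing entry becomes NaN, so the column stays float64, and an extra cast to the nullable Int64 dtype is needed to get integers alongside missing values.

```python
import pandas as pd

# Hypothetical sample data (object dtype): two prices and one missing value.
carDF = pd.DataFrame(
    {"carPrice": pd.Series(["34500 USD", "34222 USD", None], dtype=object)}
)

# Remove the literal " USD" suffix, then trim any leftover whitespace.
cleaned = carDF["carPrice"].str.replace(" USD", "", regex=False).str.strip()

# errors="coerce" turns missing or unparseable entries into NaN,
# which forces a floating-point result.
as_float = pd.to_numeric(cleaned, errors="coerce")

# The nullable Int64 dtype stores integers and shows missing values as <NA>.
as_int = as_float.astype("Int64")

print(as_float.dtype)  # float64
print(as_int.dtype)    # Int64
```

If the column never contains missing values, the cast to Int64 is optional.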
BENY
1

Solution:

carDF['carPrice'] = carDF['carPrice'].astype(str).str.replace(' USD','')
carDF['carPrice'] = pd.to_numeric(carDF['carPrice'], errors='coerce', downcast='integer').astype('Int64')

If the strings contain a non-ASCII (non-breaking) space, which you can only see when you print a single value, like:

carDF['carPrice'][0]
'34500 \xa0USD'

then you have to use:

carDF['carPrice'] = carDF['carPrice'].astype(str).str.replace(u'\xa0USD', '')

as explained here
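If the column might mix regular and non-breaking spaces, both cases can be handled in one pass with a regular expression. This is only a sketch under assumed sample data; it relies on the fact that \s in Python's Unicode regex mode also matches \xa0.

```python
import pandas as pd

# Hypothetical sample (object dtype): a regular space, a non-breaking space,
# and a missing value.
carDF = pd.DataFrame(
    {"carPrice": pd.Series(["34500 USD", "34222\xa0USD", None], dtype=object)}
)

# \s* matches any run of whitespace, including \xa0, before the trailing "USD".
cleaned = carDF["carPrice"].str.replace(r"\s*USD$", "", regex=True)

# Convert to numbers, then to the nullable integer dtype.
carDF["carPrice"] = pd.to_numeric(cleaned, errors="coerce").astype("Int64")

print(carDF["carPrice"].tolist())
```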

This requires pandas 0.24 or later, which introduced the nullable Int64 dtype. If it doesn't work for you, upgrade to the latest pandas with:

pip install pandas --upgrade

Hrvoje