I found a similar question here but the solution did not work for me. Could someone help me understand what I'm doing wrong?

>>> df.dtypes

Name       object
Country    object
Product    object
Price      object
dtype: object

>>> df['Price'] = df['Price'].astype(str).astype(int)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\smuf2\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\generic.py", line 5815, in astype
    new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
  File "C:\Users\smuf2\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\internals\managers.py", line 418, in astype
    return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
  File "C:\Users\smuf2\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\internals\managers.py", line 327, in apply
    applied = getattr(b, f)(**kwargs)
  File "C:\Users\smuf2\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\internals\blocks.py", line 592, in astype
    new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
  File "C:\Users\smuf2\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\dtypes\cast.py", line 1309, in astype_array_safe
    new_values = astype_array(values, dtype, copy=copy)
  File "C:\Users\smuf2\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\dtypes\cast.py", line 1257, in astype_array
    values = astype_nansafe(values, dtype, copy=copy)
  File "C:\Users\smuf2\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\dtypes\cast.py", line 1174, in astype_nansafe
    return lib.astype_intsafe(arr, dtype)
  File "pandas\_libs\lib.pyx", line 679, in pandas._libs.lib.astype_intsafe
ValueError: invalid literal for int() with base 10: '1,200'

I'm new to Python and have no idea what any of that means. I would really appreciate some help.

Andrej Kesely
SQLrookie

1 Answer

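Your column contains strings that use , as a digit grouping symbol, and int() cannot parse a value such as '1,200'. Before converting, either remove the comma or replace it with an underscore, which Python's int() does accept as a digit separator: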

df['Price'] = df['Price'].str.replace(',', '', regex=False).astype(int)
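
To see the failure and the fix side by side, here is a minimal sketch. The sample rows are made up for illustration, since the question does not show the actual data; only the Price column and the '1,200' value come from the traceback.

```python
import pandas as pd

# Hypothetical sample data; the real DataFrame from the question is not shown.
df = pd.DataFrame({"Product": ["Laptop", "Mouse"], "Price": ["1,200", "25"]})

# Converting directly fails because int() rejects the comma.
try:
    df["Price"].astype(int)
except ValueError as exc:
    print("conversion failed:", exc)

# Remove the comma as a literal string (not a regex), then convert.
df["Price"] = df["Price"].str.replace(",", "", regex=False).astype(int)
print(df["Price"].tolist())  # [1200, 25]
```

Assigning the result back to df["Price"] matters: str.replace and astype both return new objects and leave the original column unchanged.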
Mad Physicist
  • @SQLrookie. Glad it worked out. Now that school has started, it's refreshing to see a question that (a) has enough information for a clear-cut answer, and (b) isn't just "gimme teh codez" – Mad Physicist Sep 10 '21 at 17:54