I am trying to convert all columns of a DataFrame to a single dtype (np.int32).
My data initially looks like this:
userToAppend333.head()
Output:
         UserID  Rating  GoodreadsID
484969  1397324       0        13617
484970  1397342       5       105576
484971  1397342       4      3320520
484972  1397342       4          865
484973  1397342       3       105578
I am trying to run this operation:
userToAppend333 = userToAppend333.astype(np.int32)
But I get this error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-10-c6f5e3c74de7> in <module>()
----> 1 userToAppend333 = userToAppend333.astype(np.int32)
5 frames
/usr/local/lib/python3.6/dist-packages/pandas/core/generic.py in astype(self, dtype, copy, errors, **kwargs)
5689 # else, only a single dtype is given
5690 new_data = self._data.astype(dtype=dtype, copy=copy, errors=errors,
-> 5691 **kwargs)
5692 return self._constructor(new_data).__finalize__(self)
5693
/usr/local/lib/python3.6/dist-packages/pandas/core/internals/managers.py in astype(self, dtype, **kwargs)
529
530 def astype(self, dtype, **kwargs):
--> 531 return self.apply('astype', dtype=dtype, **kwargs)
532
533 def convert(self, **kwargs):
/usr/local/lib/python3.6/dist-packages/pandas/core/internals/managers.py in apply(self, f, axes, filter, do_integrity_check, consolidate, **kwargs)
393 copy=align_copy)
394
--> 395 applied = getattr(b, f)(**kwargs)
396 result_blocks = _extend_blocks(applied, result_blocks)
397
/usr/local/lib/python3.6/dist-packages/pandas/core/internals/blocks.py in astype(self, dtype, copy, errors, values, **kwargs)
532 def astype(self, dtype, copy=False, errors='raise', values=None, **kwargs):
533 return self._astype(dtype, copy=copy, errors=errors, values=values,
--> 534 **kwargs)
535
536 def _astype(self, dtype, copy=False, errors='raise', values=None,
/usr/local/lib/python3.6/dist-packages/pandas/core/internals/blocks.py in _astype(self, dtype, copy, errors, values, **kwargs)
631
632 # _astype_nansafe works fine with 1-d only
--> 633 values = astype_nansafe(values.ravel(), dtype, copy=True)
634
635 # TODO(extension)
/usr/local/lib/python3.6/dist-packages/pandas/core/dtypes/cast.py in astype_nansafe(arr, dtype, copy, skipna)
681 # work around NumPy brokenness, #1987
682 if np.issubdtype(dtype.type, np.integer):
--> 683 return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape)
684
685 # if we have a datetime/timedelta array of objects
pandas/_libs/lib.pyx in pandas._libs.lib.astype_intsafe()
ValueError: invalid literal for int() with base 10: 'UserID'
From my understanding of this error, there is some value in the column 'UserID' which can't be converted to the np.int32 datatype. I am trying to find out what these values are, but the dataframe is thousands of rows long, so it's not easy to locate the rows with the problematic values.
Is there a way to locate exactly which rows cause a data conversion error?
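One possible way to find the offending rows is to coerce each column with pd.to_numeric(errors="coerce") and look at which cells turn into NaN. Note that the error message quotes the value that failed, 'UserID', so the data may contain a stray copy of the header row; that is only a guess from the traceback. The sketch below uses a small hypothetical frame named df in place of userToAppend333, with a made-up header row mixed into the data:

```python
import pandas as pd

# Hypothetical stand-in for userToAppend333: string columns with a
# stray header row mixed into the data (an assumption, not the real data).
df = pd.DataFrame(
    {
        "UserID": ["1397324", "UserID", "1397342"],
        "Rating": ["0", "Rating", "5"],
        "GoodreadsID": ["13617", "GoodreadsID", "105576"],
    },
    index=[484969, 484970, 484971],
)

# Convert every column to numeric; cells that cannot be parsed become NaN.
coerced = df.apply(pd.to_numeric, errors="coerce")

# A cell is "bad" if it had a value originally but failed to parse.
bad_cells = coerced.isna() & df.notna()
bad_mask = bad_cells.any(axis=1)

# Rows with at least one problematic value, shown with their original content.
bad_rows = df[bad_mask]
print(bad_rows)

# Once the bad rows are understood, the remaining rows can be converted.
clean = coerced[~bad_mask].astype("int32")
```

Printing bad_rows shows the original, unconverted values, which makes it easier to decide whether those rows should be dropped or repaired before calling astype.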