I have a string Series that contains some NaN values. I want to strip some characters and then convert the rest to int (float is OK too), while the NaN values remain NaN. For example:

In [1]: df = pd.DataFrame(["type 12", None, "type13"], columns=['A'])

Desired output:

     A
0   12
1  NaN
2   13

Is there any good way to do it?

modkzs
  • Can you provide a code snippet? – zarak Aug 17 '16 at 07:08
  • Please check [How to make good reproducible pandas examples](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and add desired output too. Thanks. – jezrael Aug 17 '16 at 07:20
  • Possible duplicate of [NumPy or Pandas: Keeping array type as integer while having a NaN value](http://stackoverflow.com/questions/11548005/numpy-or-pandas-keeping-array-type-as-integer-while-having-a-nan-value) – Ami Tavory Aug 17 '16 at 08:00
  • I've already updated the question. – modkzs Aug 17 '16 at 10:35

1 Answer

No, unfortunately. You will have to settle for floats.

>>> s = pd.Series(['1', '2', '3', '4', '5'], index=list('abcde'))
>>> s
a    1
b    2
c    3
d    4
e    5
dtype: object
>>> s = s.reindex(['a','b','c','f','u'])
>>> s
a      1
b      2
c      3
f    NaN
u    NaN
dtype: object
>>> s.astype(int)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/juan/anaconda3/lib/python3.5/site-packages/pandas/core/generic.py", line 2947, in astype
    raise_on_error=raise_on_error, **kwargs)
  File "/home/juan/anaconda3/lib/python3.5/site-packages/pandas/core/internals.py", line 2873, in astype
    return self.apply('astype', dtype=dtype, **kwargs)
  File "/home/juan/anaconda3/lib/python3.5/site-packages/pandas/core/internals.py", line 2832, in apply
    applied = getattr(b, f)(**kwargs)
  File "/home/juan/anaconda3/lib/python3.5/site-packages/pandas/core/internals.py", line 422, in astype
    values=values, **kwargs)
  File "/home/juan/anaconda3/lib/python3.5/site-packages/pandas/core/internals.py", line 465, in _astype
    values = com._astype_nansafe(values.ravel(), dtype, copy=True)
  File "/home/juan/anaconda3/lib/python3.5/site-packages/pandas/core/common.py", line 2628, in _astype_nansafe
    return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape)
  File "pandas/lib.pyx", line 937, in pandas.lib.astype_intsafe (pandas/lib.c:16620)
  File "pandas/src/util.pxd", line 60, in util.set_value_at (pandas/lib.c:67979)
ValueError: cannot convert float NaN to integer

From Pandas Caveats and Gotchas:

The special value NaN (Not-A-Number) is used everywhere as the NA value, and there are API functions isnull and notnull which can be used across the dtypes to detect NA values.

However, it comes with it a couple of trade-offs which I most certainly have not ignored... In the absence of high performance NA support being built into NumPy from the ground up, the primary casualty is the ability to represent NAs in integer arrays.
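The trade-off described in that passage can be seen in a small sketch. The variable names `ints` and `with_gap` are illustrative and not from the documentation; the point is only that an integer Series is upcast to float once a missing label introduces NaN.

```python
import numpy as np
import pandas as pd

# An all-integer Series keeps an integer dtype.
ints = pd.Series([1, 2, 3], index=list("abc"))

# Reindexing with a label that does not exist ("z") introduces NaN.
# NaN is a floating-point value, so the whole Series is upcast to float64.
with_gap = ints.reindex(list("abz"))

print(ints.dtype)
print(with_gap.dtype)

# isnull/notnull detect the missing entry regardless of dtype.
print(pd.isnull(with_gap).tolist())
print(pd.notnull(with_gap).tolist())
```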

So work with this:

>>> s.astype(float)
a    1.0
b    2.0
c    3.0
f    NaN
u    NaN
dtype: float64
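
Applied to the DataFrame from the question, one possible approach is to extract the digits first and then convert to float. This is a minimal sketch rather than the only way to do it: the regular expression `(\d+)` and the variable names `digits` and `result` are assumptions, and it presumes each non-missing string contains a single run of digits.

```python
import pandas as pd

df = pd.DataFrame(["type 12", None, "type13"], columns=["A"])

# Pull out the first run of digits from each string.
# Rows that are None/NaN have nothing to match and stay missing.
digits = df["A"].str.extract(r"(\d+)", expand=False)

# Convert the digit strings to numbers; missing entries become NaN,
# which is why the result ends up as floats rather than ints.
result = pd.to_numeric(digits, errors="coerce")

print(result)
```

Here `errors="coerce"` turns anything that cannot be parsed into NaN instead of raising, which fits the requirement that missing values stay missing. Assuming pandas 0.24 or later, `result.astype("Int64")` should additionally give a nullable integer column, but that dtype postdates this answer.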
juanpa.arrivillaga