
I am trying to plot a trendline for my data. However, I am getting the error

ValueError: data type <class 'numpy.object_'> not inexact.  

Can someone explain why?

My dataframe is Us_corr3.

Here is my code:

data5 = Us_corr3[['US GDP', 'US Unemployment']]
x = data5['US GDP']
y = data5['US Unemployment']

plt.scatter(x, y)

z = np.polyfit(x, y, 1)
p = np.poly1d(z)
plt.plot(x, p(x), "r--")

plt.show()

And it says:

ValueError: data type <class 'numpy.object_'> not inexact.
Zephyr
  • Give us more information! `data5.dtypes`, the full error `traceback`, `x` is a `Series`. What about `x.to_numpy()`? `shape`, `dtype`? – hpaulj Jul 19 '20 at 17:42
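
The checks requested in the comment above can be sketched as follows. The small `data5` frame here is a hypothetical stand-in (the real `Us_corr3` data is not shown in the question), built with numbers stored as strings so that the column has object dtype; treat it as an illustration of what to look at, not as a reproduction of the asker's data.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data: numbers stored as strings.
data5 = pd.DataFrame(
    {"US GDP": ["1.5", "2.5"], "US Unemployment": ["5.0", "4.0"]},
    dtype=object,
)

# Column dtypes as pandas sees them.
print(data5.dtypes)

# The array that np.polyfit would actually receive.
x = data5["US GDP"]
arr = x.to_numpy()
print(arr.shape, arr.dtype)

# Python types of the individual elements (here: str).
print(x.map(type).value_counts())

# True only for integer, float, or complex dtypes.
is_numeric = np.issubdtype(arr.dtype, np.number)
print(is_numeric)
```

If `is_numeric` is false, the column needs converting before it is passed to `np.polyfit`.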

2 Answers


If x, the array derived from your Series, has object dtype, np.polyfit raises your error:

In [67]: np.polyfit(np.arange(3).astype(object),np.arange(3),1)                                      
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-67-787351a47e03> in <module>
----> 1 np.polyfit(np.arange(3).astype(object),np.arange(3),1)

<__array_function__ internals> in polyfit(*args, **kwargs)

/usr/local/lib/python3.6/dist-packages/numpy/lib/polynomial.py in polyfit(x, y, deg, rcond, full, w, cov)
    605     # set rcond
    606     if rcond is None:
--> 607         rcond = len(x)*finfo(x.dtype).eps
    608 
    609     # set up least squares equation for powers of x

/usr/local/lib/python3.6/dist-packages/numpy/core/getlimits.py in __new__(cls, dtype)
    380             dtype = newdtype
    381         if not issubclass(dtype, numeric.inexact):
--> 382             raise ValueError("data type %r not inexact" % (dtype))
    383         obj = cls._finfo_cache.get(dtype, None)
    384         if obj is not None:

ValueError: data type <class 'numpy.object_'> not inexact

Functions like this expect arrays with a numeric dtype. Clean up your dataframe first!
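
One possible way to do that cleanup is sketched below. It assumes the columns hold numbers stored as strings; the small `Us_corr3` frame is a made-up stand-in, since the real data is not shown. `pd.to_numeric` with `errors="coerce"` turns anything unparseable into NaN, so those rows are dropped before fitting.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for Us_corr3 (object dtype, values are strings).
Us_corr3 = pd.DataFrame(
    {
        "US GDP": ["1.0", "2.0", "3.0", "4.0"],
        "US Unemployment": ["9.0", "7.0", "5.0", "3.0"],
    },
    dtype=object,
)

# Convert each column to a numeric dtype; unparseable entries become NaN.
x = pd.to_numeric(Us_corr3["US GDP"], errors="coerce")
y = pd.to_numeric(Us_corr3["US Unemployment"], errors="coerce")

# Keep only rows where both values are present, then hand float arrays to NumPy.
mask = x.notna() & y.notna()
x = x[mask].to_numpy(dtype=float)
y = y[mask].to_numpy(dtype=float)

# Degree-1 fit: coefficients come back highest power first.
slope, intercept = np.polyfit(x, y, 1)
trend = np.poly1d([slope, intercept])
print(slope, intercept)
```

After this, `plt.scatter(x, y)` and `plt.plot(x, trend(x), "r--")` can be used exactly as in the question.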

hpaulj
  • After your response, I converted the data from object to float. I used `x = data5['US GDP'].astype(str).astype(float)` for both x and y, and it worked. Thank you so much. – Yunus Emre Kocabey Jul 19 '20 at 18:04
  • My data was formatted as `Int32` instead of `int` - thanks for the tip. – Nate Apr 30 '21 at 18:01

You can use x = list(x). np.polyfit then rebuilds the array from the list and infers a numeric dtype, provided the values are already Python numbers rather than strings.

data5 = Us_corr3[['US GDP', 'US Unemployment']]
x = list(data5['US GDP'])
y = list(data5['US Unemployment'])

plt.scatter(x, y)

z = np.polyfit(x, y, 1)
p = np.poly1d(z)
plt.plot(x, p(x), "r--")

plt.show()
daf utbhg
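
As a rough illustration of when the `list(...)` approach helps, the sketch below compares two hypothetical object arrays: one holding Python floats and one holding strings. The array names are made up for this example. Rebuilding from a list lets NumPy infer the dtype again, which yields floats in the first case but a string dtype in the second, where an explicit conversion is still needed.

```python
import numpy as np

# Object array whose elements are already Python floats.
obj_numbers = np.array([1.0, 2.0, 3.0], dtype=object)
rebuilt = np.asarray(list(obj_numbers))
print(rebuilt.dtype)  # NumPy re-infers a float dtype from the list

# Object array whose elements are strings: list() does not help here.
obj_strings = np.array(["1.0", "2.0", "3.0"], dtype=object)
rebuilt_strings = np.asarray(list(obj_strings))
print(rebuilt_strings.dtype)  # a unicode string dtype, still not numeric

# An explicit cast parses each string into a float.
converted = obj_strings.astype(float)
print(converted.dtype)
```

So `list(x)` is enough when the values are numbers that merely ended up in an object column; for strings, an explicit conversion such as `astype(float)` is the safer route.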