I'm trying to find y = mx + b for a variety of different datasets. I've tried using:

slope_1, intercept_1 = linregress(values_1)

where values_1 is a pandas Series holding the following data:

bin_1 values
5th_per 10
25th_per 24
50th_per 28
75th_per 34
90th_per 50
95th_per 65

However, whenever I run the code, I get the error `IndexError: tuple index out of range`.

I sort of understand the error, but am not sure how to fix this. Any suggestions on how to go about this or any other ways of finding the linear regression?

  • I'm assuming `values` are the y values, and the indices are the x values? – AJH Apr 08 '22 at 18:02
  • Yes, sorry, didn't make that clear initially. Is it possible to find a linear regression with string values? Or would I need to change those to actual number values in order to make it work? – matrix_season Apr 08 '22 at 18:07
  • Now I'm confused -- are you saying when values = 10, x = 5? And values = 24 corresponds to x = 25? Are your x values the row indices or the bin1 column? – AJH Apr 08 '22 at 18:10
  • Also, yes, any sort of regression is going to need numbers, not strings. – AJH Apr 08 '22 at 18:29

1 Answer

Assuming values are the y values and the indices are the x values (e.g. values = 10 has x = 0, values = 24 has x = 1, etc.), then you can do:

import numpy as np

# Pull the "values" column out of the DataFrame as a plain list.
values_1 = df["values"].tolist()

# Convert values_1 to a numpy array and get x values as indices.
y = np.array(values_1, dtype=float)
x = np.arange(0, len(y), 1)

soln = np.polyfit(x, y, 1)
m, b = soln[0], soln[1]
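If you would rather stay with linregress, here is a minimal sketch, assuming SciPy is installed and using the numbers from the question typed in by hand. The original call most likely failed because linregress was given a single 1-D array and had no x values to pair with it; passing x and y explicitly avoids that. Note also that linregress returns more than two values, so reading the named fields is safer than unpacking into two variables.

```python
import numpy as np
from scipy.stats import linregress

# y values copied from the question; x values are assumed to be the row positions.
y = np.array([10, 24, 28, 34, 50, 65], dtype=float)
x = np.arange(len(y), dtype=float)

# Pass both arrays explicitly, then read slope and intercept by name.
result = linregress(x, y)
m, b = result.slope, result.intercept

print(f"y = {m:.4f} * x + {b:.4f}")
```

This should give the same m and b as the np.polyfit version, since both fit an ordinary least-squares line.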

Let me know if you have any questions.

EDIT: If you want to use the bin_1 values for the x values, replace the line x = np.arange in the code above with the following:

# Split each string according to _, creating a list for each string with the
# 1st element containing the number and the 2nd element being "per".
bins = df["bin_1"].str.split("_")

# Get the 1st element in each list using row[0].
# Then access just the number by discarding the ending "th" in each row[0].
bins = [row[0][:-2] for row in bins]
x = np.array(bins, dtype=float)
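
Putting the pieces together, here is a self-contained sketch of the percentile-as-x version. The DataFrame below is an assumption about how your data is laid out (two columns named bin_1 and values); if bin_1 is actually the index of your Series, call reset_index() first or adapt the column names.

```python
import numpy as np
import pandas as pd

# Assumed layout: the question's data as a two-column DataFrame.
df = pd.DataFrame({
    "bin_1": ["5th_per", "25th_per", "50th_per", "75th_per", "90th_per", "95th_per"],
    "values": [10, 24, 28, 34, 50, 65],
})

# y values as floats.
y = df["values"].to_numpy(dtype=float)

# "25th_per" -> ["25th", "per"] -> "25th" -> "25" -> 25.0
bins = df["bin_1"].str.split("_")
x = np.array([row[0][:-2] for row in bins], dtype=float)

# Degree-1 polynomial fit: the coefficients come back as [slope, intercept].
m, b = np.polyfit(x, y, 1)

print(f"y = {m:.4f} * x + {b:.4f}")
```

Stripping the last two characters only works because every label here ends in "th"; labels such as "1st" or "2nd" would also lose two characters correctly, but anything without a two-letter suffix would need different parsing.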
AJH