Are there different ways of calculating y = mx + b in Python?

Question

I'm trying to find y = mx + b for a variety of different datasets. I've tried using:

slope_1, intercept_1 = linregress(values_1)

where values_1 is a pandas Series:

bin_1	values
5th_per	10
25th_per	24
50th_per	28
75th_per	34
90th_per	50
95th_per	65

However, whenever I try to run the code, I get the error, IndexError: tuple index out of range.

I sort of understand the error, but am not sure how to fix this. Any suggestions on how to go about this or any other ways of finding the linear regression?

I'm assuming `values` are the y values, and the indices are the x values? — AJH, Apr 08 '22 at 18:02
Yes, sorry, didn't make that clear initially. Is it possible to find a linear regression with string values? Or would I need to change those to actual number values in order to make it work? — matrix_season, Apr 08 '22 at 18:07
Now I'm confused -- are you saying when values = 10, x = 5? And values = 24 corresponds to x = 25? Are your x values the row indices or the bin_1 column? — AJH, Apr 08 '22 at 18:10
Also, yes, any sort of regression is going to need numbers, not strings. — AJH, Apr 08 '22 at 18:29

AJH · Accepted Answer · 2022-04-08T18:27:44.907

Assuming values are the y values and the indices are the x values (e.g. values = 10 has x = 0, values = 24 has x = 1, etc.), then you can do:

import numpy as np

# Pull the "values" column out of the DataFrame as a plain list.
values_1 = df["values"].tolist()

# Convert values_1 to a numpy array and get x values as indices.
y = np.array(values_1, dtype=float)
x = np.arange(0, len(y), 1)

soln = np.polyfit(x, y, 1)
m, b = soln[0], soln[1]

Let me know if you have any questions.
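
For comparison, a degree-1 least-squares fit can also be written out by hand, which shows what m and b actually are. This is only a minimal sketch: the helper name fit_line is made up for illustration, the y values are copied from the question, and x is assumed to be the row index 0 to 5.

```python
def fit_line(x, y):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Slope: covariance of x and y divided by the variance of x.
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    m = s_xy / s_xx
    # Intercept: the fitted line passes through (x_mean, y_mean).
    b = y_mean - m * x_mean
    return m, b


# y values from the question; x is assumed to be the row index.
y = [10.0, 24.0, 28.0, 34.0, 50.0, 65.0]
x = [float(i) for i in range(len(y))]

m, b = fit_line(x, y)
print(f"y = {m:.4f}x + {b:.4f}")
```

On the same x and y, this should agree with the slope and intercept from np.polyfit up to floating-point rounding.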

EDIT: If you want to use the bin_1 values for the x values, replace the line x = np.arange in the code above with the following:

# Split each string according to _, creating a list for each string with the
# 1st element containing the number and the 2nd element being "per".
bins = df["bin_1"].str.split("_")

# Get the 1st element in each list using row[0].
# Then access just the number by discarding the ending "th" in each row[0].
bins = [row[0][:-2] for row in bins]
x = np.array(bins, dtype=float)
