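import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

data = pd.read_csv('student-mat.csv', sep=';')
predict = 'Markup'
original = 'OriginalPrice'
y = np.array(data[predict])
x = np.array(data[original])

# Fit polynomials of degree 1, 2 and 3 (coefficients, highest power first)
p1 = np.polyfit(x, y, 1)
p2 = np.polyfit(x, y, 2)
p3 = np.polyfit(x, y, 3)
print(p1, p2, p3)

# Evaluate the fits on sorted x values so each curve is drawn left to right
xs = np.sort(x)
plt.plot(x, y, 'o')
plt.plot(xs, np.polyval(p1, xs), 'r-')
plt.plot(xs, np.polyval(p2, xs), 'b-')
plt.plot(xs, np.polyval(p3, xs), 'm-')

plt.show()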

I am trying to represent a set of data with a line of best fit. I originally used a polynomial, but it seems I need a rational function for this data set. I'm not sure which function to use to generate a rational best-fit model. Ideally, I would be able to simply replace my polyfit call with a rational equivalent. Thanks in advance, any help is welcome :).

1 Answer

You could write your own function and minimize its error using least squares. For example:

For arbitrary exponential-looking data in the variables X and Y:

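from scipy import optimize

# X and Y are assumed to be existing sequences holding the data points
def exp(args):
    a, b, c, d, e = args
    curve = [a*b**(c*x-d)+e for x in X]
    # root-mean-square error between the data and the model curve
    rmse = (sum((y-pt)**2 for y, pt in zip(Y, curve)) / len(Y))**0.5
    return rmse

# minimize the error, starting from an initial guess for a, b, c, d, e
fit = optimize.minimize(exp, [2, 2.8, -1, 0, 1]).x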
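
The same idea carries over to a rational model. Below is a minimal sketch, not a drop-in replacement for polyfit: the model form (a*x + b) / (x + c), the function name `rational`, and the synthetic data are all assumptions made for illustration, so swap in whatever rational form and data you actually have. It takes a starting guess from the linearized equation x*y = a*x + b - c*y and then refines it with `scipy.optimize.curve_fit`.

```python
import numpy as np
from scipy.optimize import curve_fit

def rational(x, a, b, c):
    # Hypothetical model chosen for illustration: (a*x + b) / (x + c)
    return (a * x + b) / (x + c)

# Synthetic stand-in data (an assumption); replace with the real x and y arrays
rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 50)
y = rational(x, 2.0, 10.0, 1.5) + rng.normal(scale=0.01, size=x.size)

# Starting guess from the linearized form  x*y = a*x + b - c*y,
# which is an ordinary linear least-squares problem in (a, b, c)
A = np.column_stack([x, np.ones_like(x), -y])
p0, *_ = np.linalg.lstsq(A, x * y, rcond=None)

# Refine the guess by nonlinear least squares on the original model
params, _ = curve_fit(rational, x, y, p0=p0)

# Root-mean-square error of the refined fit
rms_error = np.sqrt(np.mean((rational(x, *params) - y) ** 2))
print(params, rms_error)
```

The linearized step is only there to give `curve_fit` a sensible starting point. If the fitted denominator x + c gets close to zero anywhere in the data range, the fit will probably be unstable, and a different model form may be needed.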

[Figure: random points with the fitted curve]

Derek Eden
Hello, and sorry for the late comment. Thank you for your help, but it appears I have run into another issue. I wrote my own function, which helped a lot, but the function isn't perfect. I'm exploring the option of using least squares, but I'm also seeing if it's possible to simply write the exact function that was used. The main issue is that no matter what values I find, the function I made will never be the exact, or at least near-exact, function used. It's hard to explain in this comment, so if you want to help, here is my post: https://stackoverflow.com/q/63064045/13484672. Thank you for all your help so far. – Andrew Kaplan Jul 25 '20 at 06:10