This is a follow-up to this question. I have several data sets of sample points sharing the same x-coordinates, and I would now like to do a polynomial fit that takes all of these sample points into account. That means I want to end up with one set of parameters that describes the data best.
I figured out how to pass several data sets (in my example below there are only 2) to the fitting function; however, I then obtain one parameter set per data set.
How do I obtain only one set of parameters that describes all my data sets best?
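To make precise what I mean by "one set of parameters": for a degree-4 fit I would like to end up with a single vector of 5 coefficients fitted against all sample points from all data sets at once. Here is a minimal sketch of that idea with plain numpy (simply pooling the points and using np.polyfit, without the ridge penalty; I am not claiming this simple pooling is the statistically correct way to combine the sets):

import numpy as np

x = np.array([0., 4., 9., 12., 16., 20., 24., 27.])
y1 = np.array([2.9, 4.3, 66.7, 91.4, 109.2, 114.8, 135.5, 134.2])
y2 = np.array([0.9, 17.3, 69.7, 81.4, 119.2, 124.8, 155.5, 144.2])

# Pool all sample points into one long sample (x is shared, so it is repeated)
x_all = np.concatenate([x, x])
y_all = np.concatenate([y1, y2])

# A single degree-4 fit over the pooled points -> exactly one parameter set
coeffs = np.polyfit(x_all, y_all, deg=4)
print(coeffs)   # 5 coefficients, highest power first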
Here is my code and the output I am getting:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import Ridge
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
x = np.array([0., 4., 9., 12., 16., 20., 24., 27.])
y = np.array([[2.9, 4.3, 66.7, 91.4, 109.2, 114.8, 135.5, 134.2],
[0.9, 17.3, 69.7, 81.4, 119.2, 124.8, 155.5, 144.2]])
y = y.T
# plt.plot(x,y[:, 0], 'ro', x,y[:,1],'bo')
# plt.show()
x_plot = np.linspace(0, max(x), 100)
X = x[:, np.newaxis]
X_plot = x_plot[:, np.newaxis]
plt.scatter(x, y[:, 0], label="training points 1", c='r')
plt.scatter(x, y[:, 1], label="training points 2", c='b')
for degree in np.arange(4, 5, 1):
    model = make_pipeline(PolynomialFeatures(degree), Ridge(alpha=3, fit_intercept=False))
    model.fit(X, y)
    y_plot = model.predict(X_plot)
    plt.plot(x_plot, y_plot, label="degree %d" % degree)
plt.legend(loc='lower left')
plt.show()
ridge = model.named_steps['ridge']
print(ridge.coef_)
As you can see in the resulting plot, I get one fitted curve per data set, as well as two parameter sets:
[[ -4.09943033e-01  -1.86960613e+00   1.73923722e+00  -1.01704665e-01
    1.73567123e-03]
 [  4.19862603e-01   2.18343362e+00   8.37222298e-01  -4.18711046e-02
    5.69089912e-04]]
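If I understand sklearn correctly, this happens because y has two columns: Ridge then performs multi-output regression and returns one coefficient row per column, so coef_ has shape (n_targets, n_features) = (2, 5). A small self-contained check of that assumption:

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

x = np.array([0., 4., 9., 12., 16., 20., 24., 27.])
y = np.array([[2.9, 4.3, 66.7, 91.4, 109.2, 114.8, 135.5, 134.2],
              [0.9, 17.3, 69.7, 81.4, 119.2, 124.8, 155.5, 144.2]]).T

model = make_pipeline(PolynomialFeatures(4), Ridge(alpha=3, fit_intercept=False))
model.fit(x[:, np.newaxis], y)

ridge = model.named_steps['ridge']
print(y.shape)            # (8, 2)  -> two target columns
print(ridge.coef_.shape)  # (2, 5)  -> one coefficient row per data set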
P.S.: If the tool I am using is not the best suited one, I am also happy to get recommendations for what I should use instead.