
This example from the scikit-learn project, http://scikit-learn.org/stable/auto_examples/linear_model/plot_polynomial_interpolation.html, shows how to use polynomial interpolation to approximate a function.

This example uses 2D points. However:

  • How can it be extended to fit 3D points?

  • Is this even possible with scikit-learn? I couldn't find any hints in the documentation.

Thank you in advance for any information, and best regards.

Dan

Edit 1:

Thank you, Robin, for your answers! Pointing out the rapid growth of complexity was also a valuable hint!

I've stumbled on one problem so far, related to the 2D array X in model.fit(X, z).

The 2D array looks like this:

[[ 0.1010101   0.35353535]
 [ 0.4040404   0.65656566]
 [ 0.80808081  1.11111111]
 [ 1.21212121  1.31313131]]

while the function z is that of a paraboloid:

(((x**2)/(4**2) + ((y**2)/(8**2))) * 2)

Running model.fit(X,z) returns the following error message:

ValueError: Found arrays with inconsistent numbers of samples: [10 20]

Where does this inconsistency come from?

Daniyal
    Yes, just try to do `model = make_pipeline(PolynomialFeatures(degree), Ridge())` then `model.fit(X, y)` with X having more than two columns (and `degree` being the degree you want your interpolation to be). While the visualization might become trickier, predicting the target values should work just fine. Tell us if there are any specific problems. – Robin Spiess Feb 16 '16 at 11:58
  • 1
    To add some background info: The [Polynomial Features](http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PolynomialFeatures.html#sklearn.preprocessing.PolynomialFeatures) preprocessing step just creates all possible combinations of your features which means even for a 2d input and degree 2 the feature space is already 6 dimensional (`1, a, b, a*b, a*a, b*b`). With more features this number just grows even more rapidly. – Robin Spiess Feb 16 '16 at 12:03
  • @RobinSpiess Is there anything to take care of when creating an array X with more than two columns? As can be seen in my edited post, I get an error message with strange dimensions [10, 20]. Thank you in advance for any hints! – Daniyal Feb 17 '16 at 23:07
  • 1
    Hm, could you post the output of `print(X.shape)` and `print(z.shape)`? (Assuming X and z are numpy arrays, otherwise use `print(np.array(X).shape)`) I just tried it in a small example and for me it works even if each row in X has an additional element. – Robin Spiess Feb 18 '16 at 07:06
  • Sure! :) print(X.shape) gives (10, 2). z is in my case the function z = f(x,y), which in turn looks like as described in my original post above; it is the function for a paraboloid. It may possibly be interesting how I created X: first I created a 1D numpy array, x = np.linspace(0, 5, num=100), then I applied some shuffling, and finally I used X = np.reshape(x, (-1, 2)) to make an array with two columns. Applying print(X.shape) to this array, I got the previously mentioned result of (10, 2). Thank you for your efforts! – Daniyal Feb 18 '16 at 08:20
  • 1
    So what happens if you create an array `y = z(X[:,0],X[:,1])` and then use `model.fit(X,y)`? (I'm also not sure how you go from the 100 linspace elements to the 20 elements in X, but that's not important) – Robin Spiess Feb 18 '16 at 08:53
  • Ahhh, of course! It is required to hand over the specific x,y values to the function. Excellent, what you wrote just did the trick! Thank you! – Daniyal Feb 18 '16 at 09:33
  • Glad I could help! I'll compile my comments in an answer. – Robin Spiess Feb 18 '16 at 12:07

1 Answer


Yes, the same approach can be used for higher dimensional data. Just use the same code with an X containing more columns.

from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# For some degree, X and y
model = make_pipeline(PolynomialFeatures(degree), Ridge())
model.fit(X, y)

To add some background info: The Polynomial Features preprocessing step just creates all possible combinations of your features. This means even for a 2d input and degree 2 the feature space is already 6 dimensional (1, a, b, a*b, a*a, b*b). With more features this number grows even more rapidly.
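As a small sketch of this expansion (the input values here are made up for illustration):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One sample with two features a=2, b=3.
# Degree 2 expands it to the six terms 1, a, b, a^2, a*b, b^2.
X = np.array([[2.0, 3.0]])
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)

print(X_poly.shape)  # (1, 6)
print(X_poly)        # [[1. 2. 3. 4. 6. 9.]]
```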

For your second question: the fit function only accepts arrays, not functions. Therefore create a vector y = z(X[:,0], X[:,1]) and pass that to the fit call instead: model.fit(X, y).
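Putting the pieces together, a minimal end-to-end sketch, using the paraboloid from the question (the sample points and the Ridge regularization strength are arbitrary choices for illustration):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

def z(x, y):
    # Paraboloid from the question.
    return ((x**2) / (4**2) + (y**2) / (8**2)) * 2

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(10, 2))   # ten sample points (x, y), shape (10, 2)
y = z(X[:, 0], X[:, 1])               # evaluate z row-wise -> shape (10,)

model = make_pipeline(PolynomialFeatures(degree=2), Ridge(alpha=1e-3))
model.fit(X, y)

# z is itself a degree-2 polynomial, so the fit should be nearly exact:
# z(1, 2) = (1/16 + 4/64) * 2 = 0.25
print(model.predict([[1.0, 2.0]]))
```

Since X and y now have the same number of samples (10), the `inconsistent numbers of samples` error from the question no longer occurs.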

Robin Spiess