Doesn't introduction of polynomial features lead to increased collinearity?

Question

I was going through linear and logistic regression in ISLR, and in both cases one of the approaches used to increase the flexibility of the model is polynomial features: X and X^2 are both included as features, and the regression is then fitted as usual, treating X and X^2 as separate features (in sklearn, not the polynomial fit of statsmodels). Does that not increase the collinearity among the features, though? How does it affect model performance?

To summarize my thoughts on this:

First, X and X^2 have substantial correlation no doubt.

Second, I wrote a blog post demonstrating that, at least in linear regression, collinearity among features does not affect the model fit score, though it makes the model less interpretable by increasing coefficient uncertainty.

So does the second point have anything to do with this, given that model performance is measured by the fit score?
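A minimal sketch of the second point, assuming simulated data and a plain least-squares fit with NumPy (the data, the helper name `r_squared`, and the centering step are illustrative assumptions, not taken from the blog post). It fits the same quadratic twice: once with raw X and X^2, which are strongly correlated, and once with centered versions, which are much less correlated. Both feature sets span the same column space, so R^2 should agree up to rounding.

```python
import numpy as np

rng = np.random.default_rng(0)
# Strictly positive x, so x and x^2 are strongly correlated.
x = rng.uniform(1.0, 10.0, size=200)
y = 2.0 + 0.5 * x - 0.3 * x**2 + rng.normal(scale=1.0, size=x.size)

def r_squared(features, target):
    """R^2 of an ordinary least-squares fit with an intercept (hypothetical helper)."""
    design = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return 1.0 - resid @ resid / np.sum((target - target.mean()) ** 2)

raw = np.column_stack([x, x**2])
xc = x - x.mean()
centered = np.column_stack([xc, xc**2])   # same span as [x, x^2] once an intercept is included

corr_raw = np.corrcoef(raw, rowvar=False)[0, 1]
corr_centered = np.corrcoef(centered, rowvar=False)[0, 1]
r2_raw = r_squared(raw, y)
r2_centered = r_squared(centered, y)

print(f"corr(x, x^2) raw: {corr_raw:.3f}, centered: {corr_centered:.3f}")
print(f"R^2 raw: {r2_raw:.6f}, centered: {r2_centered:.6f}")
```

The individual coefficients and their standard errors differ between the two parameterizations; only the fitted values and the fit score are shared.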

How strongly power polynomials are correlated or cause numerical problems also depends on the scale of the variable. If the underlying variable has large values, then the condition number increases and numerical precision is reduced. — Josef, Jun 10 '21 at 05:43
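The scale effect described in this comment can be checked with a short sketch. It assumes a quadratic design matrix built with `np.vander` and three made-up versions of the same variable: values in [0, 1], the same values shifted to around 1000, and a standardized copy. The helper name `poly_condition_number` is hypothetical.

```python
import numpy as np

def poly_condition_number(x, degree=2):
    """Condition number of the design matrix [1, x, ..., x^degree]."""
    design = np.vander(x, N=degree + 1, increasing=True)
    return np.linalg.cond(design)

base = np.linspace(0.0, 1.0, 50)

cond_small = poly_condition_number(base)            # values in [0, 1]
cond_large = poly_condition_number(1000.0 + base)   # same spread, shifted to about 1000
cond_scaled = poly_condition_number((base - base.mean()) / base.std())  # standardized

print(f"small values:  {cond_small:.3g}")
print(f"large values:  {cond_large:.3g}")
print(f"standardized:  {cond_scaled:.3g}")
```

Centering and scaling the variable before taking powers is one common way to keep the condition number moderate.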

score 2 · Accepted Answer · answered Jun 10 '21 at 04:30

Multicollinearity isn't always a hindrance; it depends on the data. If your model isn't giving you the best results (high accuracy or low loss), you can remove outliers or highly correlated features to improve it, but if everything is working well, you don't need to bother about them.

The same goes for polynomial regression. Yes, it adds multicollinearity to your model by introducing x^2 and x^3 features.

To overcome that, you can use orthogonal polynomial regression which introduces polynomials that are orthogonal to each other.
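One way to build such a basis is a QR decomposition of the raw polynomial columns, which amounts to Gram-Schmidt orthogonalization on the sample points. The sketch below assumes simulated data and uses hypothetical helper names (`orthogonal_poly`, `fitted_values`); it is not a specific library's implementation. It checks that the new columns are orthonormal and that the fitted values match those from the raw powers.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(1.0, 10.0, size=200)
y = 2.0 + 0.5 * x - 0.3 * x**2 + rng.normal(size=x.size)

def orthogonal_poly(x, degree):
    """Orthonormal polynomial basis on the sample points, without the constant column."""
    raw = np.vander(x - x.mean(), N=degree + 1, increasing=True)
    q, _ = np.linalg.qr(raw)   # orthogonalizes the columns 1, x, x^2, ... in order
    return q[:, 1:]            # drop the constant; the intercept is added at fit time

def fitted_values(features, target):
    """Fitted values of a least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(target)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return design @ coef

raw_features = np.column_stack([x, x**2, x**3])
ortho_features = orthogonal_poly(x, degree=3)

gram = ortho_features.T @ ortho_features   # close to the identity matrix
same_fit = np.allclose(fitted_values(raw_features, y),
                       fitted_values(ortho_features, y))

print(np.round(gram, 6))
print("same fitted values:", same_fit)
```

The orthogonal columns are uncorrelated with each other, but they describe the same polynomial space, so predictions are unchanged and only the coefficients are easier to interpret term by term.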

But it will still introduce higher-degree polynomials, which can become unstable at the boundaries of your data space.

To overcome this issue, you can use regression splines, which divide the range of the data into separate portions and fit a linear or low-degree polynomial function on each portion. The points where the divisions occur are called knots. The functions used to model each piece/bin are known as piecewise functions. The pieces are constrained to join smoothly at the knots: for example, if cubic (degree-3) pieces are used, the overall function must be twice continuously differentiable. Such a piecewise polynomial of degree m with m-1 continuous derivatives is called a spline.
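As a rough illustration, a cubic regression spline can be fitted by least squares on a truncated power basis, in which each knot contributes one extra column that is zero to the left of the knot. The data, the quartile knot placement, and the helper name `cubic_spline_basis` below are assumptions made for this sketch.

```python
import numpy as np

def cubic_spline_basis(x, knots):
    """Truncated power basis: 1, x, x^2, x^3, plus max(x - k, 0)^3 for each knot k."""
    columns = [np.ones_like(x), x, x**2, x**3]
    # Each knot term has continuous first and second derivatives at the knot.
    columns += [np.clip(x - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(columns)

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 10.0, size=300))
y = np.sin(x) + rng.normal(scale=0.2, size=x.size)

knots = np.quantile(x, [0.25, 0.5, 0.75])   # hypothetical choice: knots at the quartiles
design = cubic_spline_basis(x, knots)
coef, *_ = np.linalg.lstsq(design, y, rcond=None)

fitted = design @ coef
rmse = np.sqrt(np.mean((y - fitted) ** 2))
print(f"basis columns: {design.shape[1]}, RMSE: {rmse:.3f}")
```

The truncated power basis is easy to read but can itself be poorly conditioned; a B-spline basis, such as the one produced by scikit-learn's `SplineTransformer`, is usually the more stable choice in practice.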
