
I have a regression model that is most suitably solved using elastic net. It has a very large number of predictors, of which I need to select only a subset. Moreover, there could be correlation between the predictors, so elastic net was the choice.

My question is: if I know that a specific subset of the predictors must appear in the final model (i.e., they should not be penalized), how can this information be added to the elastic net? Or even to the regression model more generally, if elastic net is still suitable in this case.

I would also appreciate advice on papers that propose such solutions, if possible.

I'm using Scikit-learn in Python, but I'm more concerned with the algorithm than with how to implement it.
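For context, here is a minimal sketch of the plain elastic-net baseline I have in mind, with toy data standing in for my real predictors (X, y and all settings below are just placeholders). As far as I can tell, ElasticNetCV has no option to exempt chosen coefficients from the penalty, which is exactly what I'm missing.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNetCV

# Toy stand-in for my actual problem: many predictors, only a few informative,
# with some correlation among them induced via a low-rank design.
X, y = make_regression(n_samples=200, n_features=100, n_informative=10,
                       effective_rank=20, noise=5.0, random_state=0)

# Cross-validated elastic net; every coefficient is penalized equally, and
# there is no argument here to protect a chosen subset from shrinkage.
model = ElasticNetCV(l1_ratio=[0.2, 0.5, 0.8], cv=5, random_state=0)
model.fit(X, y)

selected = np.flatnonzero(model.coef_)
print("selected predictors:", selected.size, "out of", X.shape[1])
```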

Doaa

2 Answers


If you're using the glmnet package in R, the penalty.factor argument addresses this.

From ?glmnet:

penalty.factor

Separate penalty factors can be applied to each coefficient. This is a number that multiplies lambda to allow differential shrinkage. Can be 0 for some variables, which implies no shrinkage, and that variable is always included in the model. Default is 1 for all variables (and implicitly infinity for variables listed in exclude). Note: the penalty factors are internally rescaled to sum to nvars, and the lambda sequence will reflect this change.
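As far as I know, scikit-learn's ElasticNet has no equivalent of penalty.factor. For the special case of a penalty factor of 0, though, you can get the same effect by profiling the protected predictors out first (Frisch-Waugh-Lovell style): minimizing over the unpenalized block in closed form leaves a residualized least-squares term plus the unchanged penalty on the remaining coefficients. The sketch below only illustrates that idea; the function name partially_penalized_enet and the X_keep / X_pen split are made up for the example.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

def partially_penalized_enet(X_keep, X_pen, y, alpha=1.0, l1_ratio=0.5):
    """Elastic net in which the columns of X_keep are never penalized.

    Intended to mimic a penalty factor of 0 on X_keep: the protected block
    (plus an intercept) is profiled out of y and X_pen, the elastic net is
    fit on the residuals, and the protected coefficients are then recovered
    by ordinary least squares.
    """
    n = len(y)
    X1 = np.column_stack([np.ones(n), X_keep])  # unpenalized block + intercept

    def residualize(v):
        # Residuals of v after regressing it on X1 (least-squares projection).
        coef, *_ = np.linalg.lstsq(X1, v, rcond=None)
        return v - X1 @ coef

    y_res = residualize(y)
    X_pen_res = residualize(X_pen)

    # Penalized block: ordinary elastic net on the residualized problem.
    enet = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, fit_intercept=False)
    enet.fit(X_pen_res, y_res)
    beta_pen = enet.coef_

    # Protected block (and intercept): plain OLS on whatever the penalized
    # part does not explain.
    beta_keep, *_ = np.linalg.lstsq(X1, y - X_pen @ beta_pen, rcond=None)
    return beta_keep, beta_pen
```

Here beta_keep[0] is the intercept and beta_keep[1:] are the coefficients of the protected predictors. Note this trick only covers the all-or-nothing case (penalty factor 0 versus 1), not arbitrary penalty factors as glmnet allows.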

Hong Ooi
  • I'm not using R, but this is useful to know. So the factor is multiplied by both the L1 and L2 penalties? Do you know what the algorithm is called when it has different factors? – Doaa Jul 23 '15 at 01:40

It depends on the kind of knowledge that you have. Regularization is a way of adding prior knowledge to your model. For example, ridge regression encodes the knowledge that your coefficients should be small, and lasso regression encodes the knowledge that not all predictors are important. Elastic net is a more complicated prior that combines both assumptions. There are other regularizers you may want to check: for example, if you know that your predictors fall into certain groups, look at the group lasso; similarly, if the predictors interact in a certain way (for example, some of them are correlated with each other), there are regularizers that exploit that structure. You may also check Bayesian regression if you need more control over your prior.
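To make the difference between these priors concrete, here is a small toy comparison in scikit-learn (the data, penalty strengths, and other settings are arbitrary choices for illustration): ridge typically keeps every coefficient nonzero but shrunk, while the lasso and the elastic net set many coefficients exactly to zero.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso, ElasticNet

# Toy data: 30 predictors, only 5 of which actually matter.
X, y = make_regression(n_samples=150, n_features=30, n_informative=5,
                       noise=10.0, random_state=0)

models = {
    "ridge": Ridge(alpha=10.0),
    "lasso": Lasso(alpha=1.0),
    "elastic net": ElasticNet(alpha=1.0, l1_ratio=0.5),
}

for name, model in models.items():
    model.fit(X, y)
    coef = model.coef_
    print(f"{name:12s} exactly-zero coefficients: {np.sum(coef == 0):2d}, "
          f"largest |coefficient|: {np.max(np.abs(coef)):6.1f}")
```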

  • The prior knowledge that I have is that a "subset of the predictors should be present" in the output, but their coefficients are unknown. The rest of the predictors may or may not be selected as predictors. – Doaa Jul 23 '15 at 19:26