2

Does anybody know how to implement regression in R where our goal is to minimize sum of squares of residuals subject to all residuals being non-negative and also with constraints on coefficients? Specifically I am asking about univariate regression with quadratic term where b_0 <=0, b_1>=0 and b_2>=0.

I was able to solve similar problem where the goal is to minimize sum of residuals using lpSolve package. Solving for sum of squares seems to be considerably harder in R. Any ideas?

Question was asked on Cross Validated too:

https://stats.stackexchange.com/questions/272234/restrictions-on-residuals-and-coefficients-in-regression-dfa

Community
  • 1
  • 1
Vivaldi
  • 138
  • 5

2 Answers2

1

Step 1. Formulate a model. This is done in this post. Although I think the complete model should look like:

enter image description here

Observe this is a pure quadratic programming problem. The decision variables are the beta0, beta1, beta2 and r(i). We have bounds on the decision variables, and linear equalities (remember y and x is data here). A standard QP model looks like:

enter image description here

Here the new x are the variables beta0,beta1,beta2 and r(i). The matrix A is formed by

A = [1 x(i) x(i)^2 I ]

where x(i) is data again. The last part is an identity submatrix. The Q matrix is a unit matrix (and zero for the variables corresponding to beta,beta1,beta2):

Q = [ 0  0 ]
    [ 0  I ]

Step 2. Solve with a QP solver. A solver like QuadProg, Gurobi or Cplex should have no problem with this.

Community
  • 1
  • 1
Erwin Kalvelagen
  • 15,677
  • 2
  • 14
  • 39
  • I tried to follow your logic but wasn't able to reproduce desired results (please see my edit on the question). – Vivaldi Apr 11 '17 at 15:03
  • The model specification was okay. Quantity predicted given the purchasing value is bigger than the actual quantity since we are constructing the efficiency frontier. Therefore we have to substract inefficiency from the predicted quantity to get to the actual quantity. So in your formulation residuals should be non-positive. – Vivaldi Apr 11 '17 at 15:09
0

Here is a code fragment that tries to implement the solution given above:

library(quadprog)

xvec <- c(8.1,6.4,6.8,13.3,0.7,2.4,3.5,6.5,1.9,2.8,8.0,6.8,4.6,18.6,1.1,9.5,1.4,3.8,0.7,11.5,7.1,8.2,7.0,7.0,2.2,9.8,0.3,2.5,10.6,1.4,31.0)
yvec <- c(15.8,10.6,12.8,26.5,1.3,3.9,6.2,13.1,3.1,4.4,12.6,11.6,9.3,35.3,1.8,16.0,2.7,6.4,1.3,18.9,12.0,14.3,13.5,11.3,3.5,16.0,0.6,4.8,17.7,2.5,71.0)

K <- length(xvec)

# decision vars: c(beta0, beta1, beta2, r1, ..., rK)

Amat <- cbind(rep(1,K), xvec, xvec^2, diag(rep(1, K)))
Amat <- rbind(Amat, c(-1, rep(0,K+2)), c(0,1, rep(0,K+1)), c(0,0,1, rep(0,K)))

bvec <- c(yvec, rep(0,3))

Dmat <- diag(rep(1, K+3))
Dmat[1:3, 1:3] <- diag(rep(0.000001, 3))

s <- solve.QP(Dmat=Dmat, dvec=rep(0, K+3), Amat=t(Amat), bvec=bvec, meq=K)

s$solution
Karsten W.
  • 17,826
  • 11
  • 69
  • 103