4

I am trying to use multivariate regression to play basketball. Specifically, based on X, Y, and distance to the target, I need to predict the pitch, yaw, and cannon strength. I was thinking of using multivariate regression with multiple variables for each of the output parameters. Is there a better way to do this?

Also, should I solve directly for the best fit, or use gradient descent?

technillogue

2 Answers

2

ElKamina's answer is correct, but one thing to note is that it is identical to doing k independent ordinary least squares regressions. That is, it is the same as doing a separate linear regression from X to pitch, from X to yaw, and from X to strength. This means you are not taking advantage of correlations between the output variables. That may be fine for your application, but one alternative that does take advantage of correlations in the output is reduced-rank regression (a MATLAB implementation here), or, somewhat related, you can explicitly decorrelate Y by projecting it onto its principal components (see PCA; this is also called PCA whitening in this case, since you aren't reducing the dimensionality).
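To see the equivalence concretely, here is a minimal NumPy sketch on made-up data (all names and shapes are illustrative): fitting all outputs jointly with one least-squares call gives exactly the same coefficients as fitting each output column separately.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                      # 100 samples, 3 input features
A_true = rng.normal(size=(3, 3))                   # true coefficients
Y = X @ A_true + 0.1 * rng.normal(size=(100, 3))   # e.g. pitch, yaw, strength

# One multivariate least-squares fit for all outputs at once
A_joint, *_ = np.linalg.lstsq(X, Y, rcond=None)

# k separate univariate regressions, one per output column
A_sep = np.column_stack(
    [np.linalg.lstsq(X, Y[:, j], rcond=None)[0] for j in range(Y.shape[1])]
)

print(np.allclose(A_joint, A_sep))  # True: the two fits are identical
```

This is exactly why plain OLS cannot exploit output correlations: each column of A is determined by X and its own output column alone.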

I highly recommend chapter 6 of Izenman's textbook "Modern Multivariate Statistical Techniques: Regression, Classification, and Manifold Learning" for a fairly high-level overview of these techniques. If you're at a university, it may be available online through your library.

If those alternatives don't perform well, there are many sophisticated non-linear regression methods that have multiple-output versions (although most software packages don't include the multivariate modifications), such as support vector regression, Gaussian process regression, decision tree regression, or even neural networks.

Jeshua
  • While this is a fairly interesting and in-depth answer, what I needed it for (FRC Montreal) was some two weeks ago, and we didn't have time to implement the robot code anyway. Thanks for the answer though! – technillogue Apr 01 '12 at 02:00
1

Multivariate regression amounts to inverting the covariance of the input variable set. Since there are many efficient methods for inverting that matrix (as long as the dimensionality is not very high; a thousand variables should be fine), you should solve directly for the best fit instead of using gradient descent.
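As a quick sanity check on the "solve directly" advice, here is a NumPy sketch on synthetic, noise-free data (all names here are illustrative): batch gradient descent on the squared error converges to the same coefficients as the closed-form solution, it just takes many iterations to get there.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
A_true = rng.normal(size=(4, 2))
Y = X @ A_true                          # noise-free targets

# Closed-form least-squares fit
A_direct, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Batch gradient descent on 0.5 * ||XA - Y||^2
A_gd = np.zeros((4, 2))
lr = 0.01
for _ in range(5000):
    grad = X.T @ (X @ A_gd - Y) / len(X)   # gradient of the mean squared error
    A_gd -= lr * grad

print(np.allclose(A_direct, A_gd, atol=1e-6))  # True: same answer, 5000 steps later
```

Gradient descent only starts to pay off when the problem is too large to factor X'X directly, or when the data arrives as a stream.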

Let n be the number of samples, m the number of input variables, and k the number of output variables.

X is the input data (n × m)
Y is the target data (n × k)
A is the coefficients you want to estimate (m × k)

XA = Y
X'XA=X'Y
A = inverse(X'X)X'Y

X' is the transpose of X.

As you can see, once you have computed the inverse of X'X, you can calculate the coefficients for any number of output variables with just a couple of matrix multiplications.
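The derivation above translates directly into a few lines of NumPy (shapes and names are illustrative; the column of 1s is the intercept trick from the comment below, and in production code `np.linalg.solve` or `lstsq` is preferred over forming the inverse explicitly, for numerical stability):

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, k = 50, 3, 3                      # samples, inputs, outputs
X = np.column_stack([np.ones(n), rng.normal(size=(n, m))])  # prepend 1s for the intercept
A_true = rng.normal(size=(m + 1, k))
Y = X @ A_true                          # noise-free targets

XtX_inv = np.linalg.inv(X.T @ X)        # invert X'X once...
A = XtX_inv @ X.T @ Y                   # ...then reuse it for all k output columns

print(np.allclose(A, A_true))  # True: exact recovery, since Y is noise-free
```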

Use any standard math tool to solve this (MATLAB/R/Python, etc.).

ElKamina
  • Can you explain " inverse of the covariance of the input variable set " a bit better? – technillogue Mar 14 '12 at 23:02
  • @GLycan Add an extra column of 1s to the X. That way you can get the constant (b in y=ax+b). – ElKamina Mar 15 '12 at 05:19
  • The solution I currently have is http://pastie.org/3598312 , which loops through degrees including 0, so I don't need to do that. On the downside, I get a bit of redundancy. – technillogue Mar 15 '12 at 11:10