I have some data for time series prediction. Variable 1 is the speed, and variable 2 is the time of day at which the vehicle starts. The output is the time taken for the vehicle to reach its destination. I used both variable 1 and variable 2 as inputs for SVR using LIBSVM, but later found out that the two variables are dependent, since the speed of the vehicle depends on the time of day.

Can we do regression using two dependent variables as inputs? As far as I know, the regression model y = a + b1.x1 + b2.x2 + ... + e is for independent variables.

halfer
ChanChow

1 Answer

The standard regression model does not require independent inputs: it makes no assumption about dependence between the input variables. However, if there is an interaction effect (the effect of one input on the output changes with the value of the other), you might find that simply adding an interaction term to the regression model improves results. With this, your model becomes:

y = a + b1.x1 + b2.x2 + b3.x1.x2

I'm not sure what the state of SVR is, and whether you can specify this option directly; you can certainly fake it by adding the product x1.x2 as an extra input feature, or use a regression method that supports interaction terms directly.
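As a rough illustration of "faking" the interaction term, the sketch below appends the product of the two inputs as a third feature and writes each sample in LIBSVM's sparse text format (`<label> <index>:<value> ...`). The helper names `with_interaction` and `to_libsvm_line` are made up for this example, and the second sample is invented; scaling or normalization is assumed to happen afterwards, as it would for the original two features.

```python
def with_interaction(x1, x2):
    """Return the feature vector [x1, x2, x1 * x2] (hypothetical helper)."""
    return [x1, x2, x1 * x2]


def to_libsvm_line(target, features):
    """Format one sample as '<label> <index>:<value> ...' with 1-based indices."""
    pairs = " ".join(f"{i}:{v:g}" for i, v in enumerate(features, start=1))
    return f"{target:g} {pairs}"


# Illustrative samples as (hour_of_day, speed, travel_time); values are assumptions.
samples = [(8.5, 40.0, 70.0), (9.0, 35.0, 82.0)]

lines = [
    to_libsvm_line(travel_time, with_interaction(hour, speed))
    for hour, speed, travel_time in samples
]
print("\n".join(lines))
```

The resulting lines could be saved to a file and used as training data in place of the two-feature version. Whether the extra feature helps would need to be checked on held-out data.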

Another potential hazard is how you're representing time, as I can easily see this going wrong. What does your time input look like?

Ben Allison
A sample input and output is <8.5,40>,<70>. I normalize this data before training. 8.5 is the hour of the day, 40 is the speed, and 70 is the travel time. Could you please tell me whether I am representing the time incorrectly? The time input column would be <8.5,9,9.5,10,10.5...18.5>, and I have different velocities for each time interval. – ChanChow Nov 04 '13 at 15:12