Since decision trees are nonlinear models, I would expect Random Forest to be a nonlinear method as well. However, some articles I have read say otherwise. Can anyone explain whether it is nonlinear or not?

In other words, is Random Forest meant for linear or nonlinear data?

If I have a dependent variable A and independent variables B, C, and so on, how would RF fit a regression on these variables?

desertnaut
vaibhav
  • I think the question does not make sense - it's a model by itself, and not a linear regression model. So I don't think I understand the question – CutePoison Aug 10 '21 at 07:09
  • I’m voting to close this question because it is not about programming as defined in the [help] but about ML theory and/or methodology - please see the intro and NOTE in https://stackoverflow.com/tags/machine-learning/info – desertnaut Oct 28 '21 at 15:42
  • Claiming that RF is a *linear* model is absurd; care to share the exact sources you say you have read so? – desertnaut Oct 28 '21 at 15:59

1 Answer

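What RF does is divide your data into rectangular boxes: each tree splits the feature space with a sequence of yes/no questions on the features. When you then get a new data point, it follows the yes/no answers and ends up in one of the boxes.

In classification, it counts how many training points of each class are in that box, and the majority class is the prediction.

In regression, it takes the mean of the training target values in that box. The forest then combines its trees: a vote for classification, an average for regression.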
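The box idea can be sketched in a few lines of plain Python. This is only an illustration with a single hand-written tree: the training rows, the split thresholds, and the names `TRAIN`, `box_of`, `predict_regression`, and `predict_class` are all made up for the example, and a real RF would learn the splits and combine many such trees.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical training rows: ((B, C), A, class_label). All values are made up.
TRAIN = [
    ((1.0, 1.0), 10.0, "low"),
    ((2.0, 4.0), 12.0, "low"),
    ((3.0, 1.0), 20.0, "mid"),
    ((4.0, 2.0), 22.0, "mid"),
    ((3.0, 5.0), 40.0, "high"),
    ((5.0, 6.0), 44.0, "high"),
    ((4.0, 4.0), 42.0, "mid"),
]


def box_of(point):
    """Follow hand-picked yes/no splits (an assumption) to a box name."""
    b, c = point
    if b <= 2.5:
        return "B<=2.5"
    if c <= 3.0:
        return "B>2.5, C<=3"
    return "B>2.5, C>3"


# Collect the training targets and labels that fall into each box.
targets = defaultdict(list)
labels = defaultdict(list)
for features, a, label in TRAIN:
    box = box_of(features)
    targets[box].append(a)
    labels[box].append(label)


def predict_regression(point):
    """Regression: mean of the training targets in the point's box."""
    return mean(targets[box_of(point)])


def predict_class(point):
    """Classification: majority class among training points in the box."""
    return Counter(labels[box_of(point)]).most_common(1)[0][0]


print(predict_regression((1.5, 9.0)))
print(predict_regression((4.5, 5.0)))
print(predict_class((4.5, 5.0)))
```

Every point that lands in the same box gets the same prediction, so the fitted function is piecewise constant in B and C rather than a straight line or plane.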

In a linear regression setting, you have the following equation:

y = b0 + x1*b1 + x2*b2 + ... + xn*bn

where xi is feature i and bi is the coefficient of xi.

A linear regression is linear in the coefficients. Now say we have the following regression:

y = x0 + x1*b1 + x2*cos(b2)

that is not a linear regression since it is not linear in the coefficient b2.

To check whether a regression is linear, the derivative of y with respect to each bi must be independent of bi. Take the first example (the linear one):

dy/db1 = x1

which is independent of b1 (the same holds for every dy/dbi). For the second example, however,

# y = x0 + x1*b1 + x2*cos(b2)
dy/db2 = x2*(-sin(b2))

which is not independent of b2, so it is not a linear regression.
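The derivative test above can also be checked numerically. The following is a rough sketch, not a general-purpose tool: it estimates dy/dbi with a central finite difference at a few values of bi and reports whether the estimate changes. The function names, the sample input `x`, the step size, and the tolerance are all assumptions chosen for illustration.

```python
import math


def linear_model(x, b):
    # y = b0 + x1*b1 + x2*b2 (x[0] is unused here)
    return b[0] + x[1] * b[1] + x[2] * b[2]


def cosine_model(x, b):
    # y = x0 + x1*b1 + x2*cos(b2)
    return x[0] + x[1] * b[1] + x[2] * math.cos(b[2])


def partial_derivative(model, x, b, i, h=1e-5):
    """Central finite-difference estimate of dy/db_i."""
    up = list(b)
    down = list(b)
    up[i] += h
    down[i] -= h
    return (model(x, up) - model(x, down)) / (2 * h)


def derivative_depends_on_coefficient(model, x, i, values=(0.0, 1.0, 2.0)):
    """True if dy/db_i changes when b_i itself changes."""
    grads = []
    for v in values:
        b = [0.5, 0.5, 0.5]  # arbitrary baseline coefficients (assumption)
        b[i] = v
        grads.append(partial_derivative(model, x, b, i))
    return not all(math.isclose(g, grads[0], abs_tol=1e-6) for g in grads)


x = [1.0, 2.0, 3.0]  # made-up values for x0, x1, x2

print(derivative_depends_on_coefficient(linear_model, x, 1))  # dy/db1 = x1
print(derivative_depends_on_coefficient(cosine_model, x, 2))  # dy/db2 = -x2*sin(b2)
```

The first check should report no dependence and the second should report dependence, matching the hand-derived results. A numerical check like this only probes a handful of points, so it can suggest nonlinearity but does not prove linearity.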

As you can see, RF and linear regression are two different things, and the linearity of a regression has nothing to do with RF (or the other way round, for that matter).

CutePoison