What an RF does is divide your feature space into rectangular boxes via a series of yes/no splits.
When you then get a new data point, it follows the yes/no answers down each tree and ends up in a box.
In classification, it counts how many points of each class are in that box, and the majority class is the prediction.
In regression, it takes the mean of the target values in the box.
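To make the voting/averaging concrete, here is a minimal sketch assuming scikit-learn is available; the data sets are toy data I made up for illustration, not anything from the question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

# Classification: each tree drops the point into a box and votes a class;
# the forest predicts the majority class across trees.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print(clf.predict(X[:5]))

# Regression: each tree predicts the mean target value of the box the
# point lands in, and the forest averages those means.
rng = np.random.default_rng(0)
Xr = rng.uniform(size=(200, 1))
yr = np.sin(2 * np.pi * Xr[:, 0])
reg = RandomForestRegressor(n_estimators=50, random_state=0).fit(Xr, yr)
print(reg.predict(Xr[:5]))
```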
In a regression setting you have the following equation

y = b0 + x1*b1 + x2*b2 + ... + xn*bn

where xi is your feature "i" and bi is the coefficient of xi.
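As a quick illustration of fitting such a model, here is a sketch using ordinary least squares via NumPy; the coefficients (1, 2, -3) are made up for the example, and the data is noiseless so they are recovered exactly.

```python
import numpy as np

# Hypothetical ground truth: y = 1.0 + 2.0*x1 - 3.0*x2
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]

# Prepend a column of ones so the intercept b0 is estimated too
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print(coef)  # ≈ [1.0, 2.0, -3.0]
```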
A linear regression is linear in the coefficients, but say we have the following regression

y = b0 + x1*b1 + x2*cos(b2)

That is not a linear regression, since it is not linear in the coefficient b2.
To check whether a regression is linear, the derivative of y with respect to bi should be independent of bi, for all bi. Take the first example (the linear one):

dy/db1 = x1

which is independent of b1 (and the same holds for every dy/dbi). But in the second example

y = b0 + x1*b1 + x2*cos(b2)
dy/db2 = -x2*sin(b2)

which is not independent of b2, so it is not a linear regression.
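This derivative check can be run mechanically; here is a sketch assuming SymPy is available, using the two example models from above.

```python
import sympy as sp

x1, x2, b1, b2 = sp.symbols('x1 x2 b1 b2')

# Linear model: dy/db1 contains no b1
y_lin = x1 * b1 + x2 * b2
d1 = sp.diff(y_lin, b1)
print(d1)                      # x1
print(b1 in d1.free_symbols)   # False -> linear in b1

# Non-linear model: dy/db2 still contains b2
y_nl = x1 * b1 + x2 * sp.cos(b2)
d2 = sp.diff(y_nl, b2)
print(d2)                      # -x2*sin(b2)
print(b2 in d2.free_symbols)   # True -> not linear in b2
```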
As you can see, RF and linear regression are two different things, and the linearity of a regression has nothing to do with an RF (or the other way around, for that matter).