X and Y are not correlated, but X is a predictor of Y in a random forest classifier. How to represent this using statistics and machine learning?

Question

X and Y are not correlated (0.3); however, when I use X in a random forest classifier predicting Y, alongside two other related variables (A, B), all three (X, A, and B) are significant predictors of Y. Note that A and B are also not correlated with Y. How can I interpret this in terms of statistics and machine learning?

In other words: how can one or more variables (A, B, or Y) be represented with respect to another variable (X) when the variables don't have a strong correlation?

This may be a better question for https://stats.stackexchange.com/ — Pace, Nov 11 '17 at 18:35

score 0 · Answer 1 · answered Nov 18 '17 at 16:50

Correlation (Pearson's, at least) measures linear association. If there’s a nonlinear relationship, you might see little or no correlation.

Random forests (and decision trees) are nonlinear, so you could find a random forest predictive even if the correlation is zero.

For example, a quadratic relationship such as Y = X² with X symmetric around zero has zero correlation between X and Y, even though Y is completely determined by X.

This image and more on correlations can be found at https://www.statisticalengineering.com/correlation.htm
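
To make the quadratic case concrete, here is a minimal sketch on synthetic data. Everything in it is an assumption for illustration: the data is made up, the class boundary at |x| > 0.5 is arbitrary, and it uses scikit-learn's RandomForestClassifier rather than whatever setup the question used. It should show a Pearson correlation close to zero alongside a high test accuracy.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic data (assumption): X is symmetric around zero and
# Y depends on X only through X**2.
x = rng.uniform(-1.0, 1.0, size=2000)
y = (x ** 2 > 0.25).astype(int)  # class 1 when |x| > 0.5

# Pearson correlation only captures the linear part of the relationship.
pearson_r = np.corrcoef(x, y)[0, 1]

# A random forest can split on X at several thresholds, so it can
# recover the "outer vs. inner" structure that correlation misses.
X_train, X_test, y_train, y_test = train_test_split(
    x.reshape(-1, 1), y, test_size=0.25, random_state=0
)
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
accuracy = forest.score(X_test, y_test)

print(f"Pearson r between X and Y: {pearson_r:.3f}")
print(f"Random forest test accuracy: {accuracy:.3f}")
```

With real data, plotting X against Y (or using a rank-based or nonlinear dependence measure) is a quick way to check whether something similar is going on.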
