R predictive model: reason for predictions and propensity %

Question

I'm very new to R and machine learning, but I have to undertake a project to predict customer churn based on a number of variables, e.g. length of service, number of credit notes issued, number of missed deliveries, number of price increases, etc.

I'm using rpart and randomForest and have a dataset with a churn prediction against each customer. I am able to produce a confusion matrix and to see which variables are the important indicators. However, the aim is to send the output to the Sales team as an 'at risk' list of customers to deal with.

Two things would be really important for this. First, I'd like to append the confidence/propensity/likelihood % of churn so I can rank customers in order of risk. Second, is there a way to append a category/summary/reason for each customer explaining why they were predicted to churn? For example: customer abc - high score on price increases, so we need to be careful with pricing; customer def - high on missed deliveries, so we need to fix our service.

Many thanks for your help.

score 1 · Answer 1 · answered Sep 21 '16 at 08:52

1. If you want to predict the probability of churn, you can train a logistic regression model and use it to predict the churn probability for each customer. You can also find the significant predictor variables that are associated with customer churn (refer to http://www.duplication.net.au/ANZMAC09/papers/ANZMAC2009-678.pdf), and you can use anova along with it to see how much of the deviance each significant predictor explains.
2. If you want to find the reason why a particular customer was predicted to churn, you can fit a decision tree (CART / rpart) model and then follow the path from the root to the leaf node that the customer belongs to in the fitted tree.
3. Finally, a randomForest ensemble classifier can be used to find the most important predictors for churn, measured in terms of the out-of-bag (OOB) error estimates.
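
To make point 1 concrete, here is a minimal sketch of fitting a logistic regression and ranking customers by predicted churn probability. The data is simulated and every column name (customer_id, service_years, credit_notes, missed_deliveries, price_increases, churn) is a made-up placeholder, not something taken from the question, so swap in your own data frame.

```r
# Toy data: all column names and coefficients here are illustrative assumptions.
set.seed(42)
n <- 500
customers <- data.frame(
  customer_id       = sprintf("C%04d", seq_len(n)),
  service_years     = runif(n, 0, 10),
  credit_notes      = rpois(n, 1),
  missed_deliveries = rpois(n, 2),
  price_increases   = rpois(n, 1)
)

# Simulate a 0/1 churn flag so the example is self-contained.
lin <- -1.5 - 0.2 * customers$service_years +
  0.5 * customers$missed_deliveries + 0.6 * customers$price_increases
customers$churn <- rbinom(n, 1, plogis(lin))

# Logistic regression of churn on the predictors.
fit_glm <- glm(
  churn ~ service_years + credit_notes + missed_deliveries + price_increases,
  data = customers, family = binomial
)

# type = "response" returns the predicted probability of churn (between 0 and 1).
customers$churn_prob <- predict(fit_glm, newdata = customers, type = "response")

# Rank customers from highest to lowest predicted risk.
at_risk <- customers[order(-customers$churn_prob), c("customer_id", "churn_prob")]
head(at_risk, 10)

# Coefficient table and sequential chi-squared tests on the deviance.
summary(fit_glm)
anova(fit_glm, test = "Chisq")
```

If you would rather stay with the models you already have, predict() with type = "prob" also returns class probabilities for both rpart classification trees and randomForest classifiers, so the same ranking step applies to them.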

Thanks for that, makes sense. However, point 2 sounds very manual: if I have 100k customers, would I need to trace the variables down the tree for each one? Is there no way to identify the path for each customer automatically? Thanks - user3103335, Sep 21 '16 at 12:45
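
The tracing does not have to be done by hand. A fitted rpart object stores the leaf each training row landed in (fit$where), and path.rpart() returns the split conditions leading to a node, so the path only needs to be looked up once per leaf and then joined back to the customers. The sketch below uses simulated data with made-up column names, and the trick for scoring new data (overwriting yval in a copy of the fit) is a commonly used workaround rather than an official rpart feature, so treat it as an assumption to check on your own model.

```r
library(rpart)

# Toy data: column names and coefficients are illustrative assumptions.
set.seed(1)
n <- 1000
customers <- data.frame(
  customer_id       = sprintf("C%04d", seq_len(n)),
  missed_deliveries = rpois(n, 2),
  price_increases   = rpois(n, 1),
  service_years     = runif(n, 0, 10)
)
p <- plogis(-2 + 0.8 * customers$missed_deliveries +
              0.9 * customers$price_increases - 0.2 * customers$service_years)
customers$churn <- factor(ifelse(runif(n) < p, "yes", "no"))

fit <- rpart(churn ~ missed_deliveries + price_increases + service_years,
             data = customers, method = "class")

# fit$where gives, for each training row, the row of fit$frame holding its leaf;
# the row names of fit$frame are the rpart node numbers.
leaf_node <- as.integer(rownames(fit$frame)[fit$where])

# Split conditions from the root to each distinct leaf (one lookup per leaf,
# not one per customer). The result is a list named by node number.
paths <- path.rpart(fit, nodes = unique(leaf_node), print.it = FALSE)

# Collapse each path into one string, dropping the leading "root" entry.
reason <- vapply(paths, function(x) paste(x[-1], collapse = " & "), character(1))

# Attach the rule and the churn probability to every customer.
customers$reason     <- unname(reason[as.character(leaf_node)])
customers$churn_prob <- predict(fit, type = "prob")[, "yes"]

# For new data: copy the fit and store the node number as the fitted value,
# so that type = "vector" returns the leaf node number (workaround, see above).
node_fit <- fit
node_fit$frame$yval <- as.numeric(rownames(node_fit$frame))
new_nodes <- predict(node_fit, newdata = customers, type = "vector")

head(customers[order(-customers$churn_prob),
               c("customer_id", "churn_prob", "reason")])
```

Note that the reason column is the rule that defines the customer's leaf, so every customer in the same leaf gets the same text. It describes how the tree classified them rather than proving why they churned, so it is probably best presented to Sales as a prompt for investigation.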
