Weka: How can I implement a Surrogate Split in J48 Decision Tree?

Question

Can anybody help me implement an alternative way of handling missing values in the J48 algorithm using the Weka API in Java?

I know that imputing missing values before training J48 is easy.

But what about using a surrogate split attribute when partitioning the training data (as Breiman does in CART), instead of the standard J48 approach (Quinlan's C4.5), which distributes cases with a missing value across the branches according to a probability distribution estimated from the cases whose value is known?

Can anybody give me some information, tips, or help on where in the Weka API and source code I have to make changes to replace the standard approach with surrogate splits?

score 1 · Accepted Answer · answered Jul 08 '14 at 17:52


Look at the Weka source code for weka.classifiers.trees.j48.C45ModelSelection, starting at line 152 (Find "best" attribute to split on). It uses the information gain ratio as its splitting criterion.
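To make the CART idea from the question concrete, here is a minimal, Weka-independent sketch of how a surrogate split could be chosen once the primary split is known. It is not Weka code and does not show where to hook into J48. The class SurrogateSplitSketch, its methods, and the convention of encoding missing values as Double.NaN are all hypothetical names and assumptions made for illustration. It only handles numeric attributes with "value <= threshold" splits and picks the candidate that agrees with the primary partition on the most rows.

```java
import java.util.TreeSet;

// Hypothetical illustration only: none of these names exist in Weka.
class SurrogateSplitSketch {

    /** A surrogate rule: go left when (value <= threshold) == leftIfLessOrEqual. */
    static final class Surrogate {
        final int attribute;
        final double threshold;
        final boolean leftIfLessOrEqual;
        final int agreements; // rows routed the same way as the primary split

        Surrogate(int attribute, double threshold, boolean leftIfLessOrEqual, int agreements) {
            this.attribute = attribute;
            this.threshold = threshold;
            this.leftIfLessOrEqual = leftIfLessOrEqual;
            this.agreements = agreements;
        }
    }

    /** Finds the attribute/threshold that best mimics the primary split (NaN = missing). */
    static Surrogate findSurrogate(double[][] data, int primaryAttr, double primaryThreshold) {
        Surrogate best = null;
        for (int j = 0; j < data[0].length; j++) {
            if (j == primaryAttr) continue;
            // Distinct values of attribute j on rows where both attributes are known.
            TreeSet<Double> values = new TreeSet<>();
            for (double[] row : data) {
                if (!Double.isNaN(row[primaryAttr]) && !Double.isNaN(row[j])) values.add(row[j]);
            }
            Double prev = null;
            for (Double v : values) {
                if (prev != null) {
                    double cut = (prev + v) / 2.0; // midpoint between neighbouring values
                    int same = 0, total = 0;
                    for (double[] row : data) {
                        if (Double.isNaN(row[primaryAttr]) || Double.isNaN(row[j])) continue;
                        total++;
                        boolean primaryLeft = row[primaryAttr] <= primaryThreshold;
                        if (primaryLeft == (row[j] <= cut)) same++;
                    }
                    // The reversed direction agrees on the remaining rows.
                    boolean direct = same >= total - same;
                    int agree = direct ? same : total - same;
                    if (best == null || agree > best.agreements) {
                        best = new Surrogate(j, cut, direct, agree);
                    }
                }
                prev = v;
            }
        }
        return best;
    }

    /** Routes a row: primary split if known, else surrogate, else a default direction. */
    static boolean goesLeft(double[] row, int primaryAttr, double primaryThreshold,
                            Surrogate s, boolean defaultLeft) {
        if (!Double.isNaN(row[primaryAttr])) return row[primaryAttr] <= primaryThreshold;
        if (s != null && !Double.isNaN(row[s.attribute])) {
            return (row[s.attribute] <= s.threshold) == s.leftIfLessOrEqual;
        }
        return defaultLeft;
    }

    public static void main(String[] args) {
        double[][] data = {
            {1, 10, 5}, {2, 20, 1}, {3, 30, 4}, {4, 40, 2}, {Double.NaN, 35, 3}
        };
        Surrogate s = findSurrogate(data, 0, 2.5);
        System.out.println("surrogate attribute " + s.attribute + ", threshold " + s.threshold);
        System.out.println("row with missing primary goes left: "
                + goesLeft(data[4], 0, 2.5, s, true));
    }
}
```

In a real CART implementation the surrogate is also compared against simply sending every missing case to the larger branch, and several ranked surrogates are kept. Both are left out here to keep the sketch short.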


doxav

