
I am using DecisionTreeClassifier for CHAID in sklearn, and all of the trees I get split each parent node into only 2 child nodes (where further splitting is possible).

Do you know whether it is possible to allow more splits (e.g. 3 or more) at each node, as some commercial software packages do (e.g. IBM SPSS)? Would I need to find another module within Python to achieve this?

DecisionTreeClassifier has a number of parameters, but I could not see one that lets you vary the number of child nodes for each parent:

DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=None, max_features=None, max_leaf_nodes=None, min_impurity_decrease=0.0, min_impurity_split=None, min_samples_leaf=1, min_samples_split=20, min_weight_fraction_leaf=0.0, presort=False, random_state=99, splitter='best')
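For anyone who wants to confirm the binary-split behaviour on their own model, here is a minimal sketch. It assumes the iris data as a stand-in (the original data set is not shown) and reuses only `min_samples_split=20` and `random_state=99` from the settings above. The fitted tree stores exactly one left and one right child index per node, so a node can never have three or more children.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): the question's own data set is not available.
X, y = load_iris(return_X_y=True)

# Only parameters from the question that still exist in current releases are
# passed; presort and min_impurity_split were dropped in later scikit-learn versions.
clf = DecisionTreeClassifier(criterion="gini", min_samples_split=20, random_state=99)
clf.fit(X, y)

tree = clf.tree_
# children_left / children_right hold one child index per node; -1 marks a leaf.
is_leaf = tree.children_left == -1
n_internal = int((~is_leaf).sum())
n_leaves = int(is_leaf.sum())

print("internal nodes:", n_internal)
print("leaves:", n_leaves)
# In a tree where every split is binary, leaves = internal nodes + 1.
print("strictly binary:", n_leaves == n_internal + 1)
```

Because the node arrays only have a left and a right slot, a multiway split cannot be switched on through a parameter; it needs a different implementation.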

Any pointers would be most welcome! Thanks

kenlukas
Andrew3193
  • Thank you very much, Clemens - the answer to the previously asked question answers my own exactly! I can appreciate that allowing more than 2 splits, especially on numerical variables, creates too many parameters to optimise. – Andrew3193 Aug 24 '19 at 19:12
  • Hi, as the duplicate question and its answer show, it's not possible in sklearn, but you might find this useful instead: https://github.com/Rambatino/CHAID – tezzaaa Aug 30 '19 at 08:33
  • Thank you very much. I will work through this link – Andrew3193 Aug 31 '19 at 09:33
  • Hi, one problem I'm having with this technique is visualisation. I can't seem to draw proper diagrams from the outputs. See my example here: https://stackoverflow.com/questions/57711320/how-to-draw-tree-diagram-from-chaid-tree-output – tezzaaa Sep 02 '19 at 15:01
  • Hi - Thanks for letting me know. I will need to look into this, but I too had a bit of a struggle getting proper graphics out (i.e. boxes containing text). With "DecisionTreeClassifier", I got an unreadable text-based output, which I could drop into http://webgraphviz.com/ – Andrew3193 Sep 03 '19 at 18:30
  • The example in the paper I was using at the time (I will try to find it!) provided some code to get nice charts out, but this did not work for me, so I used the converter at webgraphviz.com. In any case, "DecisionTreeClassifier" is CART / Gini-based. I have now gone to the link you pointed me to (your comment at 08:33 on 30 Aug), which is proper CHAID ("Tree.from_pandas_df"), and got a different type of unreadable text, which I could not convert using webgraphviz.com. I will investigate further, and many thanks! – Andrew3193 Sep 03 '19 at 18:41
  • Incidentally - does ln13 onwards of this example help you with getting proper graphics out? http://benalexkeen.com/decision-tree-classifier-in-python-using-scikit-learn/ – Andrew3193 Sep 03 '19 at 18:52
  • Hi, thanks for this, but no, it does not work. If you check my example above, it's hard to draw a proper diagram from CHAID trees. It's a shame, as the package works perfectly. – tezzaaa Sep 04 '19 at 10:17
  • Yes - to be honest, I struggled with graphviz. The furthest I got was copying and pasting the textual tree output from DecisionTreeClassifier into "Webgraphviz". The textual CHAID tree from "Tree.from_pandas_df" was slightly different and would not convert in this way. Thanks - I will keep thinking! – Andrew3193 Sep 05 '19 at 11:30
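On the sklearn side of the visualisation discussion above, a minimal sketch of two ways to get a readable tree out of "DecisionTreeClassifier" may help. It assumes the iris data as a stand-in and a reasonably recent scikit-learn (for `export_text`). It does not cover the output of the CHAID package, which uses its own tree format.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_graphviz, export_text

# Stand-in data (assumption): any fitted DecisionTreeClassifier works the same way.
iris = load_iris()
clf = DecisionTreeClassifier(max_depth=2, random_state=99)
clf.fit(iris.data, iris.target)

# Option 1: indented plain-text rules, readable without any graphics tools.
rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)

# Option 2: DOT source returned as a string (out_file=None), which can be
# pasted into a Graphviz renderer such as the webgraphviz.com page mentioned above.
dot_source = export_graphviz(
    clf,
    out_file=None,
    feature_names=list(iris.feature_names),
    class_names=list(iris.target_names),
    filled=True,
    rounded=True,
)
print(dot_source)
```

The DOT string is what produces the boxes-with-text diagram once rendered; the text rules are the quicker option when Graphviz is not installed.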

0 Answers