Questions tagged [decision-tree]

A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm.

In a post, "decision tree" may refer either to the graphical tool or to the learning algorithm.

2545 questions
8
votes
2 answers

Decision tree using rpart to produce a sankey diagram

I can create a tree with rpart using the kyphosis data set, which ships with the rpart package: fit <- rpart(Kyphosis ~ Age + Number + Start, method="class", data=kyphosis); printcp(fit); plot(fit, uniform=TRUE, main="Classification Tree for…
Matt Lourens
  • 171
  • 9
8
votes
4 answers

Search for corresponding node in a regression tree using rpart

I'm pretty new to R and I'm stuck with a pretty dumb problem. I'm calibrating a regression tree using the rpart package in order to do some classification and some forecasting. Thanks to R the calibration part is easy to do and easy to control. #the…
antoine
  • 123
  • 1
  • 5
8
votes
4 answers

installation graphviz, no module named graphviz

I have been trying to install graphviz and connect it with Python to graph some nodes for decision trees. I have read a lot of threads about the same problem and tried many of the solutions, but I still cannot plot my decision trees :( I am not a…
Lucas Dresl
  • 1,150
  • 1
  • 10
  • 19
8
votes
1 answer

is there any way to get samples under each leaf of a decision tree?

I have trained a decision tree using a dataset. Now I want to see which samples fall under which leaf of the tree. From here I want the red circled samples. I am using scikit-learn's implementation of decision trees in Python.
Farshid Rayhan
  • 1,134
  • 4
  • 17
  • 31
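
For the question above about finding the samples under each leaf, here is a minimal sketch of one possible approach. It assumes scikit-learn is installed and uses the bundled iris data purely as a stand-in dataset; the names `leaf_ids` and `samples_by_leaf` are illustrative and not from the original post. It relies on the estimator's `apply` method, which reports the leaf index each sample lands in.

```python
from collections import defaultdict

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Stand-in data (assumption): any feature matrix X and labels y would do.
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# apply() returns, for each row of X, the index of the leaf node it reaches.
leaf_ids = clf.apply(X)

# Group the row indices of X by the leaf they fall into.
samples_by_leaf = defaultdict(list)
for sample_index, leaf_id in enumerate(leaf_ids):
    samples_by_leaf[int(leaf_id)].append(sample_index)

for leaf_id, members in sorted(samples_by_leaf.items()):
    print(f"leaf {leaf_id}: {len(members)} samples")
```

Indexing `X[samples_by_leaf[leaf_id]]` would then give the actual rows for a chosen leaf.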
8
votes
1 answer

Why the decision tree structure is only binary tree for sklearn DecisionTreeClassifier?

As we can see from the sklearn documentation here, or from my experiment, every tree built by DecisionTreeClassifier is a binary tree. Whether the criterion is gini or entropy, each DecisionTreeClassifier node can only have 0, 1, or 2 child nodes. But…
ybdesire
  • 1,593
  • 1
  • 20
  • 35
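
The observation in the question above can be checked directly. The following is a rough sketch, assuming scikit-learn and NumPy are installed and using the iris data only as an example; `child_counts` is an illustrative name. It reads the fitted tree's `children_left` and `children_right` arrays, in which a missing child is stored as -1, and counts the children of every node.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Example data (assumption): swap in your own X and y.
X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)

left = clf.tree_.children_left
right = clf.tree_.children_right

# A value of -1 means "no child on this side".
child_counts = (left != -1).astype(int) + (right != -1).astype(int)

# Distinct child counts that occur anywhere in the fitted tree.
print(sorted(set(child_counts.tolist())))
```

No node can report more than two children here, since the structure only stores a left and a right child per node.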
8
votes
6 answers

Getting the observations in a rpart's node (i.e.: CART)

I would like to inspect all the observations that reached some node in an rpart decision tree. For example, in the following code: fit <- rpart(Kyphosis ~ Age + Start, data = kyphosis) fit n= 81 node), split, n, loss, yval, (yprob) *…
Tal Galili
  • 24,605
  • 44
  • 129
  • 187
8
votes
2 answers

Can we choose what Decision Tree algorithm to use in sklearn?

My question is: can we choose which decision tree algorithm to use in sklearn? The sklearn user guide mentions that an optimised version of the CART algorithm is used. Can we change to another algorithm, such as C4.5?
8
votes
2 answers

building classification tree having categorical variables using rpart

I have a data set with 14 features and few of them are as below, where sex and marital status are categorical variables. height,sex,maritalStatus,age,edu,homeType SEX 1. Male 2. Female MARITAL STATUS 1. Married …
user4251309
  • 113
  • 1
  • 2
  • 6
8
votes
2 answers

mapping scikit-learn DecisionTreeClassifier.tree_.value to predicted class

I am using a scikit-learn DecisionTreeClassifier on a 3-class dataset. After I fit the classifier, I access all leaf nodes on the tree_ attribute in order to get the number of instances that end up in a given node for each class. clf =…
nemi
  • 183
  • 1
  • 6
8
votes
1 answer

Converting ctree output into JSON Format (for D3 tree layout)

I'm working on a project that requires running a ctree and then plotting it in interactive mode, like the 'D3.js' tree layout. My main obstacle is converting the ctree output into JSON format for later use by JavaScript. Following is what I need (with…
Yehoshaphat Schellekens
  • 2,305
  • 2
  • 22
  • 49
8
votes
2 answers

R Error: "In numerical expression has 19 elements: only the first used"

I created a dataframe: totalDeposit <- cumsum(testd$TermDepositAMT[s1$ix]) which is basically calculating cumulative sum of TermDeposit amounts in testd dataframe and storing it in totalDeposit. This works perfectly ok. I then need to calculate the…
Freewill
  • 413
  • 2
  • 6
  • 18
8
votes
1 answer

Why is scikit-learn's random forest using so much memory?

I'm using scikit's Random Forest implementation: sklearn.ensemble.RandomForestClassifier(n_estimators=100, max_features="auto", max_depth=10) After calling…
8
votes
5 answers

Need guidance towards evaluative boolean logic tree

I can't seem to find a pointer in the right direction, I am not even sure what the terms are that I should be researching but countless hours of googling seem to be spinning me in circles, so hopefully the collective hive of intelligence of Stack…
8
votes
2 answers

rapid miner: how to add a 'label' attribute to a dataset?

I want to apply a decision tree learning algorithm to a dataset I have imported from a CSV. The problem is that the "tra" input of the Decision Tree block is still red, stating "Input example set must have special attribute 'label'.". How do I add…
fstab
  • 4,801
  • 8
  • 34
  • 66
8
votes
1 answer

Why is KNN much faster than decision tree?

Once, in an interview, the employer asked me why a KNN classifier is much faster than a decision tree, for example in letter recognition or face recognition. I had absolutely no idea at the time. So I want to know…
zfz
  • 1,597
  • 1
  • 22
  • 45