Questions tagged [rpart]

An R package for fitting classification and regression trees.

rpart is an R package for fitting classification and regression trees (CART).

445 questions
5 votes, 1 answer

Using rpart: How to get more variability on predictions?

I am using the rpart package like so: model <- rpart(totalUSD ~ ., data = df.train). I notice that over 80k rows, rpart is generalizing its predictions into just three distinct groups, as shown in the image below: I see several configuration…
user1477388
5 votes, 1 answer

How do I get rpart to work with an increased number of factors?

I observe that just for the rpart package (for decision tree models), as I increase the number of factor levels in my data, the package slows down drastically. I have compared with other packages, and only for rpart, this seems to be the case. Below…
IAMTubby
5 votes, 3 answers

Getting invalid model formula in ExtractVars when using rpart function in R

The dataset can be downloaded from http://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/. I am getting the following error: formula(formula, data = data) : invalid model formula in ExtractVars. Using the following…
dgene54
5 votes, 3 answers

Can someone explain to me the difference between the ID3 and CART algorithms?

I have to create decision trees in R with the rpart package. In my paper I should first define the ID3 algorithm and then implement various decision trees. I found out that the rpart package does not work with the ID3 algorithm. It…
user2988757
4 votes, 1 answer

Using minsplit and unequal weights in rpart

How do I incorporate weights into the minsplit criterion in rpart when the weights are uneven? I could not find a way for the minsplit threshold to take the weights into account, and when the weights are uneven this becomes an issue, as the following…
Saar
4 votes, 2 answers

Print a decision tree in text nicely / with custom control [r]

I'd like to print a decision tree in text nicely. For example, I can print the tree object itself: library(rpart) f = as.formula('Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + Species') fit = rpart(f, data = iris, control =…
YJZ
4 votes, 3 answers

Error in eval(predvars, data, env) : object 'Rm' not found

dataset = read.csv('dataset/housing.header.binary.txt')
dataset1 = dataset[6]  # highest positive correlation
dataset2 = dataset[13]  # lowest negative correlation
dependentVal = dataset[14]  # dependent value
new_dataset = cbind(dataset1, dataset2,…
4 votes, 1 answer

Optimising caret for sensitivity still seems to optimise for ROC

I'm trying to maximise sensitivity in my model selection in caret using rpart. To this end, I tried to replicate the method given on caret's GitHub page (scroll down to the example with the user-defined function FourStat): # create own function so…
chrisjacques
4 votes, 4 answers

Export rpart rules to a data frame and link rules to training data

I have trained a model on some data with rpart and am interested in labeling each observation with its terminal node and linking it to the rule corresponding to that terminal node. I have used the following code as an example: library(rpart) library(rattle) fit <-…
kamashay
4 votes, 1 answer

Adding information to a tree - rpart

I want to add some information to my tree. Let's say, for instance, I have a data set like this: library(rpart) library(rpart.plot) set.seed(1) mydb<-data.frame(results=rnorm(1000,0,1),expo=runif(1000),var1=sample(LETTERS[1:4],1000,replace=T), …
Rhesous
4 votes, 2 answers

How do duplicated rows affect a decision tree?

I am using rpart to build a decision tree for a categorical variable, and I am wondering whether I should use the full data set or just the set of unique rows.
Mouad_Seridi
4 votes, 1 answer

Rpart - Variable Importance Vector - how?

I've been searching the internet for a while now to understand the numeric 'ranking' statistic that rpart assigns to a variable in the variable importance output. I understand that these numbers add up to 100, but what exactly is it, what is it called…
Mak87
4 votes, 0 answers

How to set threshold of class counter / probability to label "predicted class" in R rpart

I am using the rpart function to get a decision tree to predict Owner / No-owner based on a set of variables. Below is an excerpt of the output: Node number 2: 8 observations, complexity param=0.08333333 predicted class=non-owner expected loss=0.125 …
Sanjeev G
4 votes, 2 answers

Find the data elements in a data frame that pass the rule for a node in a tree model?

I have used the rpart package to create a tree model, found an interesting rule, and wondered if there is an easy way to see which observations in that data frame pass that rule. It seems very tedious to use path.rpart to find the path it…
mandroid
4 votes, 2 answers

What does the rpart "Error in as.character(x) : cannot coerce type 'builtin' to vector of type 'character'" message mean?

I've been banging my head against rpart for a few days now (trying to make classification trees for this dataset that I have), and I think it's time to ask for a lifeline at this point :-) I'm sure it's something silly that I'm not seeing, but here's…
user281537