
When plotting trees built with rpart, a number of functions can alter the final appearance, but nothing built in appears to allow formatting the branch names. Below is an example of (A) the default output and (B) the result of trying to alter the names with split.fun, along with the code that produces the plot.

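library(rpart)
library(rpart.plot)

test <- list()
test$tree <- rpart(Species ~ ., data = iris)
par(mfrow = c(1, 2))
# (A) default labels
rpart.plot(test$tree, type = 5, extra = 2)
# (B) attempt to replace the periods in the labels with spaces
rpart.plot(test$tree, type = 5, extra = 2, split.fun = function(x, labs, digits, varlen, faclen) {
  labs <- gsub(".", " ", labs)
  labs
})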

Two trees, both wrong

What I am after is for the Petal.Length and Petal.Width to instead be displayed as Petal Length and Petal Width. Is there any code that can achieve this seemingly simple task?

Beavis
  • Your gsub regular expression changes all characters to blanks. Instead use gsub("\\.", " ", labs) or gsub(".", " ", labs, fixed=TRUE), not gsub(".", " ", labs). – Stephen Milborrow Sep 24 '21 at 22:12
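The difference between the three calls in the comment above can be checked in base R without plotting anything. The sketch below uses made-up label strings (an assumption for illustration, not actual rpart.plot output). It also shows a caveat: escaping the period still replaces decimal points in numbers. The lookaround pattern in the last call is one possible way around that, not something taken from the comment.

```r
# Hypothetical labels for illustration; not actual rpart.plot output.
labs <- c("Petal.Length < 2.5", "Petal.Width >= 1.8")

# An unescaped "." is a regex wildcard, so every character becomes a space.
all_blank <- gsub(".", " ", labs)

# Escaping the period (or using fixed = TRUE) matches literal periods only,
# but that still includes the decimal point in numbers.
escaped <- gsub("\\.", " ", labs)
fixed   <- gsub(".", " ", labs, fixed = TRUE)

# Assumption: only periods sitting between two letters should become spaces.
letters_only <- gsub("(?<=[A-Za-z])\\.(?=[A-Za-z])", " ", labs, perl = TRUE)

print(letters_only)
```

If the labels passed to split.fun can contain numeric thresholds, the last form may be the safer choice inside the function body.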

1 Answer


To get what you want I am offering a hack. Not pretty, but it does the job.

If you look at the tree structure, those labels come from test$tree$frame$var. So you can simply change those in the tree.

par(mfrow = c(1,2))
rpart.plot(test$tree, type=5, extra=2)
test$tree$frame$var = sub("\\.", " ", test$tree$frame$var)
rpart.plot(test$tree, type=5, extra=2)

Tree plot with and without periods in the variable names
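A variation on the hack above, as a minimal sketch: edit a copy of the tree so the fitted object itself is left unmodified, and use gsub with fixed = TRUE so that variable names containing more than one period are handled too. The name tree_for_plot is hypothetical, and the plotting call is guarded in case rpart.plot is not installed.

```r
library(rpart)

tree <- rpart(Species ~ ., data = iris)

# Work on a copy (hypothetical name) so the original fit keeps its
# variable names exactly as rpart produced them.
tree_for_plot <- tree

# frame$var holds the split variable for each node ("<leaf>" for leaves).
# fixed = TRUE treats "." as a literal period; gsub replaces every one.
tree_for_plot$frame$var <- gsub(".", " ",
                                as.character(tree_for_plot$frame$var),
                                fixed = TRUE)

print(unique(tree_for_plot$frame$var))

# Plot only if rpart.plot is available.
if (requireNamespace("rpart.plot", quietly = TRUE)) {
  rpart.plot::rpart.plot(tree_for_plot, type = 5, extra = 2)
}
```

The copy is meant for plotting only; this sketch makes no claim about how other rpart functions behave on the modified object.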

G5W