
As a beginner in R, I created a classification tree from the provided "car.test.frame" dataset that predicts mileage from the country. The commands I entered were:

> library(rpart)
> z.auto <- rpart(Mileage ~ Country, car.test.frame, method="class")
> plot(z.auto)
> text(z.auto)

This resulted in the following tree (figure: Classification Tree for Mileage).

As you can see, at the top level, Country=cegh provides the first split and Country=egh provides the second split. How do I change those labels to show the actual country names? And how do I interpret the chart?

MichaelChirico
  • We would be happy to answer your last question (about how to understand a classification tree), but the details of the `R` code are best asked about on StackOverflow. – whuber Dec 31 '14 at 18:24
  • I flagged to move this to SO, but you could try changing the factor labels with the confusingly-named `levels` function – shadowtalker Jan 01 '15 at 20:04
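
For the labelling part of the question, one option that stays within rpart is sketched below. It assumes the single letters in the plot index the levels of `Country` in order (a = first level, b = second, and so on), and it relies on the `pretty` argument of `text()` for rpart objects, where `pretty = 0` turns abbreviation off. This is a minimal sketch rather than the only way to do it.

```r
library(rpart)  # provides rpart() and the car.test.frame data

z.auto <- rpart(Mileage ~ Country, data = car.test.frame, method = "class")

# The letters in the default labels follow the order of these levels
country_levels <- levels(car.test.frame$Country)
print(country_levels)

# Redraw the tree; margin leaves room so long labels are not clipped
plot(z.auto, margin = 0.1)
# pretty = 0 prints full level names instead of single letters
text(z.auto, pretty = 0)
```

With many countries on one side of a split the labels can get long, so a larger `margin` or a smaller `cex` in `text()` may be needed.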

1 Answer


Interesting question. Andrie de Vries seems to have faced a similar issue: he developed a package, ggdendro, specifically for drawing this type of plot with ggplot2, and it came out this year. To solve your problem, install ggdendro and relabel the splits as shown here. For more example plots, see the package vignette.

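library(ggplot2)
library(ggdendro)

# Convert the fitted rpart tree into data frames that ggplot2 can draw
fitr <- dendro_data(z.auto)
# Replace the abbreviated split labels (cegh, egh) with full country names
fitr$labels$label <- c("Country= Japan,Korea,Sweden,USA", "Country= Korea,Sweden,USA")
ggplot() +
  geom_segment(data=fitr$segments, aes(x=x, y=y, xend=xend, yend=yend)) +
  geom_text(data=fitr$labels, aes(x=x, y=y, label=label)) +
  geom_text(data=fitr$leaf_labels, aes(x=x, y=y, label=label)) +
  theme_dendro()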
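
The two label strings above were typed by hand. As a rough sketch, they could instead be derived from the factor levels, assuming each letter in a code such as "cegh" indexes the levels of `Country` in order. The helper `decode_levels` is a hypothetical name introduced here for illustration; it is not part of rpart or ggdendro.

```r
library(rpart)  # ships the car.test.frame data

lv <- levels(car.test.frame$Country)

# Hypothetical helper: turn a letter code such as "cegh" into level names
decode_levels <- function(code, lv) {
  idx <- match(strsplit(code, "")[[1]], letters)  # "c" -> 3, "e" -> 5, ...
  paste(lv[idx], collapse = ",")
}

# Build one label per split code taken from the original plot
split_labels <- paste0(
  "Country= ",
  vapply(c("cegh", "egh"), decode_levels, character(1), lv = lv, USE.NAMES = FALSE)
)
print(split_labels)
```

The resulting vector can be assigned to `fitr$labels$label` in place of the hand-typed strings, provided the codes are listed in the same order as the rows of `fitr$labels`.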
polka