2

In principal component analysis, I extract component result in prcomp() for scatter plot. I want to add the labels of the group names, then the center of each groups is calculated in each groups using MASS::cov.trob(). I create new stat and rebuild new geom using ggplot2::ggproto() in order to exhibit the labels of each groups. However, new graph has the unreasonable legend due to it should be a point-legend rather than character-legend. I've tried multiple variations of this, but none of them seem to work. Any ideas? Here is my code:

# data
data(Cars93, package = "MASS")
car_df <- Cars93[, c(3, 5, 13:15, 17, 19:25)]
car_df <- subset(car_df, Type == "Large" | Type == "Midsize" | Type == "Small")
x1 <- mean(car_df$Price) + 2 * sd(car_df$Price)
x2 <- mean(car_df$Price) - 2 * sd(car_df$Price)
car_df <- subset(car_df, Price > x2 | Price < x1)
car_df <- na.omit(car_df)

# Principal Component Analysis
car.pca <- prcomp(car_df[, -1], scale = T)
car.pca_pre <- cbind(as.data.frame(predict(car.pca)[, 1:2]), car_df[, 1])
colnames(car.pca_pre) <- c("PC1", "PC2", "Type")
head(car.pca_pre)

# create a new stat
library(ggplot2)
StatLabel <- ggproto("StatLabel" ,Stat,
               compute_group = function(data, scales) {
                library(MASS)
                df <- data.frame(data$x,data$y)
                center <- cov.trob(df)$center
                names(center)<- NULL 
                center <- t(as.data.frame(center))
                center <- as.data.frame(cbind(center))
                colnames(center) <- c("x","y")
                rownames(center) <- NULL
                return(center)
                },
                required_aes = c("x", "y")
)

stat_label <- function (mapping = NULL, data = NULL, stat = "identity", position = "identity", 
    ..., parse = FALSE, nudge_x = 0, nudge_y = 0, label.padding = unit(0.15, 
        "lines"), label.r = unit(0.15, "lines"), label.size = 0.1, 
    na.rm = FALSE, show.legend = NA, inherit.aes = TRUE) 
{
    if (!missing(nudge_x) || !missing(nudge_y)) {
        if (!missing(position)) {
            stop("Specify either `position` or `nudge_x`/`nudge_y`", 
                call. = FALSE)
        }
        position <- position_nudge(nudge_x, nudge_y)
    }
    layer(data = data, mapping = mapping, stat = StatLabel, geom = GeomLabel, 
        position = position, show.legend = show.legend, inherit.aes = inherit.aes, 
        params = list(parse = parse, label.padding = label.padding, 
            label.r = label.r, label.size = label.size, na.rm = na.rm, 
            ...))
}

# plot
ggplot(car.pca_pre, aes(PC1, PC2, color = Type)) + geom_point() + 
stat_label(aes(label = Type))

enter image description here

Mike Wise
  • 22,131
  • 8
  • 81
  • 104
Dr Duck
  • 207
  • 3
  • 9

1 Answers1

2

I don't think it would be very natural to have your new stat show points in the legend, since it doesn't plot any points. As is stands, ggplot appears to give priority over the text legend when both points and text have a combined legend. The simplest solution would be to not have a legend by default for your label stat.

You can change your function to have show.legend = FALSE as the default, and then your plot will show the point legend instead.

stat_label <- function (mapping = NULL, 
                        data = NULL, 
                        stat = "identity", 
                        position = "identity", 
                        ..., 
                        parse = FALSE, 
                        nudge_x = 0, nudge_y = 0, 
                        label.padding = unit(0.15, "lines"), 
                        label.r = unit(0.15, "lines"), 
                        label.size = 0.1, 
                        na.rm = FALSE, 
                        show.legend = FALSE,       ## <--- change
                        inherit.aes = TRUE) 
{
  if (!missing(nudge_x) || !missing(nudge_y)) {
    if (!missing(position)) {
      stop("Specify either `position` or `nudge_x`/`nudge_y`", 
           call. = FALSE)
    }
    position <- position_nudge(nudge_x, nudge_y)
  }
  layer(data = data, mapping = mapping, stat = StatLabel, geom = GeomLabel, 
        position = position, show.legend = show.legend, inherit.aes = inherit.aes, 
        params = list(parse = parse, label.padding = label.padding, 
                      label.r = label.r, label.size = label.size, na.rm = na.rm, 
                      ...))
}

# plot
ggplot(car.pca_pre, aes(PC1, PC2, color = Type)) + geom_point() + 
  stat_label(aes(label = Type))

enter image description here

Axeman
  • 32,068
  • 8
  • 81
  • 94
  • Thanks! I accomplished my new geom which is familiar to graph showed in `adegraphics::s.class` [link](http://enterotype.embl.de/images/between.png) – Dr Duck Mar 02 '17 at 10:00