
We have this simple data frame:

data <- data.frame(ID = rep(c("a", "b"), each = 500),
                   time = 1:500,
                   val = rnorm(1000, mean = 1, sd = 0.3))

We have data for two individuals (ID "a" and ID "b"). We want to subset the data for individual b and make a scatterplot of val vs. data_point using dplyr and ggplot2:

library(ggplot2)
library(dplyr)
data %>%
  filter(ID == "b") %>%
  mutate(data_point = 1:500) %>%
  ggplot() +
  geom_point(aes(x = data_point, y = val), size = 0.5)


Now say we want to make a single data point (say the very first data point/row) larger than the rest, and a different color. How can we do that from inside this pipe, without having to make an object outside of the pipe?

Ryan

2 Answers


You can create two variables inside the pipe, one for the highlight color and the other for its size.

library(ggplot2)
library(dplyr)

data %>%
  filter(ID == "b") %>%
  mutate(data_point = 1:500) %>%
  mutate(highlight = data_point == 1,
         size = 0.5 + 10*highlight) %>%
  ggplot(aes(x = data_point, y = val)) +
  geom_point(aes(color = highlight, size = size), show.legend = FALSE) +
  scale_color_manual(values = c("black", "red"))

Another way, without creating those two variables, is to apply the same logic directly in the aesthetics call in geom_point.

data %>%
  filter(ID == "b") %>%
  mutate(data_point = 1:500) %>%
  ggplot(aes(x = data_point, y = val)) +
  geom_point(aes(color = data_point == 1, 
                 size = 0.5 + 10*(data_point == 1)), 
             show.legend = FALSE) +
  scale_color_manual(values = c("black", "red"))

In both cases the result is as follows.


Edit

Thanks to @Allan Cameron for having noted in a comment that:

You would only need a single new variable in the first version, then use scale_size

The result is almost the same, with a 0.5 difference in size for the highlighted point.

data %>%
  filter(ID == "b") %>%
  mutate(data_point = 1:500) %>%
  mutate(highlight = data_point == 1) %>%
  ggplot(aes(x = data_point, y = val)) +
  geom_point(aes(color = highlight, size = highlight), show.legend = FALSE) +
  scale_color_manual(values = c("black", "red")) +
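  # sizes for highlight == FALSE and highlight == TRUE; the column `highlight`
  # is not visible outside aes(), so plain numbers are needed here
  scale_size_manual(values = c(0.5, 10))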
Rui Barradas

You could create another column within mutate() to handle point size. In this example, sizing_point is a vector starting with 5, followed by the number 0.5 repeating nrow(.)-1 times, or 499 times in this case.

# Multiple columns can be defined in mutate(), separated by commas
data %>%
  filter(ID == "b") %>%
  mutate(sizing_point = c(5, rep(0.5, nrow(.) - 1)), data_point = 1:500) %>%
  ggplot() +
  geom_point(aes(x = data_point, y = val, size = sizing_point))

You could highlight a different point, of course. Let's say you wanted to highlight the 75th point. Here is one example of how you could specify the point size for that item.

# repeats 0.5 74 times, followed by 5, then repeats 0.5 again to the end of the vector
c(rep(0.5, 74), 5, rep(0.5, nrow(.) - 75))
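
As a hedged alternative to building the size vector by hand, the position test can be written once with dplyr's row_number() and if_else(). This is only a sketch: the seed and the assignment to p are assumptions added so the result can be reproduced and inspected, and 75 is simply the example index from above.

```r
library(ggplot2)
library(dplyr)

set.seed(1)  # assumption: fixed seed, only for reproducibility
data <- data.frame(ID = rep(c("a", "b"), each = 500),
                   time = 1:500,
                   val = rnorm(1000, mean = 1, sd = 0.3))

p <- data %>%
  filter(ID == "b") %>%
  # row_number() counts rows after filtering, so no hard-coded 1:500 is needed
  mutate(data_point = row_number(),
         # 5 for the highlighted row, 0.5 for every other row
         sizing_point = if_else(data_point == 75, 5, 0.5)) %>%
  ggplot() +
  geom_point(aes(x = data_point, y = val, size = sizing_point))

p
```

Changing the 75 inside if_else() moves the highlight to any other row without recomputing the lengths passed to rep().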

And to add color to the highlighted point:

data %>%
  filter(ID == "b") %>%
  mutate(sizing_point = c(5, rep(0.5, nrow(.) - 1)), data_point = 1:500) %>%
  ggplot() +
  geom_point(aes(x = data_point, y = val, size = sizing_point,
                 color = factor(sizing_point))) +
  scale_color_manual(values = c("black", "red"))
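
One caveat worth sketching: when size is mapped inside aes(), ggplot2's default continuous size scale rescales the values, so 5 and 0.5 are not drawn as literal point sizes. If literal sizes are wanted, scale_size_identity() may help. The version below is a minimal sketch under that assumption; the seed, the if_else() form of sizing_point, and the assignment to p are additions for illustration, not part of the answer above.

```r
library(ggplot2)
library(dplyr)

set.seed(1)  # assumption: fixed seed, only for reproducibility
data <- data.frame(ID = rep(c("a", "b"), each = 500),
                   time = 1:500,
                   val = rnorm(1000, mean = 1, sd = 0.3))

p <- data %>%
  filter(ID == "b") %>%
  mutate(data_point = row_number(),
         sizing_point = if_else(data_point == 1, 5, 0.5)) %>%
  ggplot(aes(x = data_point, y = val)) +
  geom_point(aes(size = sizing_point, color = data_point == 1),
             show.legend = FALSE) +
  # use the values in sizing_point as-is instead of rescaling them
  scale_size_identity() +
  # FALSE is drawn in black, TRUE in red
  scale_color_manual(values = c("black", "red"))

p
```

With the identity scale the first point is drawn at size 5 and all others at 0.5, which matches the numbers written in mutate().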

 
SEAnalyst