I want to make a scatter plot with a horizontal and a vertical line. The lines should appear at the means of the x and y variables. What I do is:

# Some data
df <- data.frame(x= rnorm(100), y= rnorm(100))
df$i1 <- mean(df$x)
df$i2 <- mean(df$y)

# Plot
library(ggplot2)    
ggplot(data= df, mapping= aes(x= x, y= y)) +
  geom_point() +
  geom_vline(xintercept= unique(i1)) +
  geom_hline(yintercept= unique(i2))

I expect to see a scatter plot with a horizontal line at the mean of y and a vertical line at the mean of x. Instead I get the error Error: object 'i1' not found.

I want to use the values straight from the columns of the data argument (here the variables df$i1 and df$i2), i.e. I do not want to put in the intercept values manually and I want to avoid creating a new data frame containing only the intercepts. Is this possible?

EDIT

In fact, my situation includes a grouping variable, which I left out of the original question for simplicity. This led to some confusion. The actual data looks something like this:

Group data:
library(tidyverse)
df <- data.frame(x= rnorm(100), y= rnorm(100), group= sample(1:2, 100, TRUE)) %>%
  group_by(group) %>%
  mutate(i1= mean(x),
         i2= mean(y))

As pointed out in the answer, the grouping variable is not relevant to the question and can simply be added later with facet_wrap.

LulY
  • use the tidy evaluation pronoun `.data` - for example `.data$i1` – Paul Stafford Allen Jul 24 '23 at 08:59
  • Hmm. Odd - try wrapping the `xintercept = ` in an `aes()`: `geom_vline(aes(xintercept = unique(.data$i1)))`? If that doesn't work I'll fire up R and try and replicate it. – Paul Stafford Allen Jul 24 '23 at 09:02
  • If you want ggplot2 to take values from the data, you have to map them within `aes`. (If your code didn't require the tidyverse package, which I will never install, I would probably have written an answer.) – Roland Jul 24 '23 at 09:16
  • No, you removed the groups, which changes the question. – Roland Jul 24 '23 at 09:19
  • @Roland The groups do not matter. The error remains the same. I mean, I could've added other fancy stuff like `theme_minimal`, but this does not change the question. – LulY Jul 24 '23 at 09:20
  • They matter. The current answer works without the groups but not with them. – Roland Jul 24 '23 at 09:21
  • @Roland See my answer. – LulY Jul 24 '23 at 09:30
  • The grouping changes how many vlines and hlines you are anticipating, which alters the question. Based on your update, you are also planning to facet wrap, which again is an assumption that is important for how one approaches this. – Paul Stafford Allen Jul 24 '23 at 09:36
  • @PaulStaffordAllen See my answer which works with and without groups. – LulY Jul 24 '23 at 09:38

2 Answers

Based on the original question, the issue was with precomputing the means. They can instead be computed directly inside aes():

ggplot(df, aes(x = x, y= y)) +
  geom_point() +
  geom_vline(aes(xintercept = mean(x))) +
  geom_hline(aes(yintercept = mean(y)))

[Image: plot with hline and vline]
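
As a side note on the error in the question: arguments passed outside aes() are evaluated as ordinary R code in the calling environment, where no object called i1 exists, whereas names inside aes() are looked up in the plot data. A minimal sketch of both working variants follows, assuming the df from the question; the seed and the names p_outside and p_inside are only for illustration.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed, only so the example is reproducible
df <- data.frame(x = rnorm(100), y = rnorm(100))
df$i1 <- mean(df$x)
df$i2 <- mean(df$y)

# Outside aes(): the column has to be reached through the data frame itself
p_outside <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_vline(xintercept = unique(df$i1)) +
  geom_hline(yintercept = unique(df$i2))

# Inside aes(): the bare column name is resolved against the plot data
p_inside <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_vline(aes(xintercept = i1)) +
  geom_hline(aes(yintercept = i2))
```

Both plots should place the lines at the same positions; the second one keeps the intercepts tied to the columns of the data argument, which is what the question asks for.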

Base R solution for grouped data:

# Some data
set.seed(2023)
df <- data.frame(x = rnorm(100), 
                 y = rnorm(100),
                 group = sample(1:2, 100, TRUE))

ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_vline(data = data.frame(xintercept = aggregate(df$x, list(df$group), FUN = mean)$x),
             aes(xintercept = xintercept)) +
  geom_hline(data = data.frame(yintercept = aggregate(df$y, list(df$group), FUN = mean)$x),
             aes(yintercept = yintercept))

This gives a single panel with one vertical and one horizontal line per group.
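
If the lines should appear in separate panels per group, one possible variation is to keep the group column in the aggregated means so that facet_wrap can match each line to its panel. This is only a sketch: the name means is hypothetical, and it does create a small summary data frame.

```r
library(ggplot2)

set.seed(2023)
df <- data.frame(x = rnorm(100),
                 y = rnorm(100),
                 group = sample(1:2, 100, TRUE))

# One row per group with the mean of x and y; columns are group, x, y
means <- aggregate(cbind(x, y) ~ group, data = df, FUN = mean)

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_vline(data = means, aes(xintercept = x)) +
  geom_hline(data = means, aes(yintercept = y)) +
  facet_wrap(~ group)  # the group column in means assigns each line to a panel
```

Because means still carries the group column, each panel only shows the lines for its own group.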

Paul Stafford Allen

The intercept arguments inside geom_hline and geom_vline must be wrapped in aes(), i.e.:

# For data without groups:
df <- data.frame(x= rnorm(100), y= rnorm(100))
df$i1 <- mean(df$x)
df$i2 <- mean(df$y)

# Plot
library(ggplot2)    
ggplot(data= df, mapping= aes(x= x, y= y)) +
  geom_point() +
  geom_vline(aes(xintercept= i1)) +
  geom_hline(aes(yintercept= i2))

The plotting code for data with groups is the same, so the groups do not matter for the question and can be added later with facet_wrap:

# Group data:
library(tidyverse)
df <- data.frame(x= rnorm(100), y= rnorm(100), group= sample(1:2, 100, TRUE)) %>%
  group_by(group) %>%
  mutate(i1= mean(x),
         i2= mean(y))

# Same plotting code as above
library(ggplot2)
ggplot(data= df, mapping= aes(x= x, y= y)) +
  geom_point() +
  geom_vline(aes(xintercept= i1)) +
  geom_hline(aes(yintercept= i2)) +
  facet_wrap(. ~ group) # Just add the grouping variable
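
One caveat with mapping a full-length column: the line layers then get one row per observation, so many identical lines are drawn on top of each other. A possible way around this, sketched here under the assumption that passing a function as the layer's data argument is acceptable, is to reduce the plot data to one row per group at draw time. The helper one_row_per_group is hypothetical, and the per-group means are computed with base R's ave() instead of dplyr.

```r
library(ggplot2)

set.seed(2023)  # assumption: fixed seed, only for reproducibility
df <- data.frame(x = rnorm(100), y = rnorm(100),
                 group = sample(1:2, 100, TRUE))
# Per-group means stored as columns (ave() uses mean by default)
df$i1 <- ave(df$x, df$group)
df$i2 <- ave(df$y, df$group)

# Hypothetical helper: keep only the first row of each group
one_row_per_group <- function(d) d[!duplicated(d$group), ]

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  # A function given as `data` is called with the plot data for this layer
  geom_vline(data = one_row_per_group, aes(xintercept = i1)) +
  geom_hline(data = one_row_per_group, aes(yintercept = i2)) +
  facet_wrap(~ group)
```

The intercepts still come straight from the i1 and i2 columns, and no separate intercept data frame has to be kept around.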
LulY