I am using the Australian AIDS Survival Data, this time to create scatterplots.

To show survival status by sex across the reported transmission categories (T.categ), I plot the chart this way:

library(ggplot2)
library(dplyr)  # provides the %>% pipe

data <- read.csv("https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/MASS/Aids2.csv")

data %>%
  ggplot() +
  geom_jitter(aes(T.categ, sex, colour = status))

It shows a chart, but each time I run the code it seems to produce a different one. Here are two of them put together.

[Image: two charts produced by separate runs of the same code]

Is anything wrong with the code? Is it normal for each run to produce a different chart?

zx8754
Mark K

3 Answers

If you use geom_point instead of geom_jitter, you can pass position = position_jitter(), which accepts a seed argument:

library(ggplot2)
p <- ggplot(mtcars, aes(as.factor(cyl), disp)) 

p + geom_point(position = position_jitter(seed = 42))


p + geom_point(position = position_jitter(seed = 1))

And back to "42"


p + geom_point(position = position_jitter(seed = 42))

Created on 2020-07-02 by the reprex package (v0.3.0)
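
As a rough sketch, the same approach applied to the question's data might look like the following. It assumes the Aids2 data frame shipped with the MASS package matches the CSV used in the question; the seed value 42 is arbitrary.

```r
library(ggplot2)

# Assumption: MASS::Aids2 is the same data as the CSV in the question.
data(Aids2, package = "MASS")

# A fixed seed in position_jitter() pins the random offsets for this layer.
p <- ggplot(Aids2) +
  geom_point(aes(T.categ, sex, colour = status),
             position = position_jitter(seed = 42))

# layer_data() returns the computed (jittered) coordinates of a layer,
# so two builds of the same plot can be compared directly.
first_build  <- layer_data(p)[, c("x", "y")]
second_build <- layer_data(p)[, c("x", "y")]
```

Because the seed belongs to the layer rather than to the session, the rest of the script's random number stream should not need to be managed just to keep the plot stable.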

tjebo

Try setting the seed when plotting:

set.seed(1); ggplot(data, aes(T.categ, sex, colour = status)) +
  geom_jitter()

From the manual ?geom_jitter:

It adds a small amount of random variation to the location of each point, and is a useful way of handling overplotting caused by discreteness in smaller datasets.

To make that "random variation" reproducible, we need to call set.seed() immediately before creating the plot.
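
A minimal sketch of how one might check this, using mtcars as a stand-in dataset. The helper name jittered_xy is hypothetical; it calls set.seed() right before building the plot and reads back the jittered coordinates with layer_data().

```r
library(ggplot2)

# Hypothetical helper: seed the RNG, build a jittered plot,
# and return the coordinates ggplot2 actually computed.
jittered_xy <- function(seed) {
  set.seed(seed)
  p <- ggplot(mtcars, aes(factor(cyl), disp)) +
    geom_jitter()
  layer_data(p)[, c("x", "y")]
}

run_a <- jittered_xy(1)
run_b <- jittered_xy(1)  # same seed as run_a
run_d <- jittered_xy(2)  # different seed

same_seed_matches <- isTRUE(all.equal(run_a, run_b))
diff_seed_matches <- isTRUE(all.equal(run_a, run_d))
```

Here same_seed_matches is expected to be TRUE and diff_seed_matches FALSE, which is the behaviour the answer relies on.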

zx8754

If we want something random yet reproducible (for permutations etc.), we can use sample to draw a seed, set it, and keep a record of the value:

my.seed = sample(1:10000,1)
set.seed(my.seed)

Then we can use it to write a filename such as:

save(my_plot, file = paste0('plot', my.seed, '.rda'))
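
Put together, the workflow could be sketched as below. This is only an illustration under assumptions: mtcars stands in for the real data, the plot is written to tempdir(), and the names my_seed and out_file are hypothetical.

```r
library(ggplot2)

# Draw a seed at random, then fix the RNG to it.
my_seed <- sample.int(10000, 1)
set.seed(my_seed)

# Build the jittered plot right after seeding.
my_plot <- ggplot(mtcars, aes(factor(cyl), disp)) +
  geom_jitter()

# Record the seed in the file name so the run can be repeated later.
out_file <- file.path(tempdir(), paste0("plot", my_seed, ".rda"))
save(my_plot, file = out_file)
```

Calling set.seed() with the number from the file name before rebuilding the plot should then reproduce the same jitter.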
zx8754
Joyvalley
    Is there a theoretical/citable reason for why this would give more/better "randomness" than just using 1:10000 directly as the seeds? – Magnus Jan 29 '22 at 16:13