I would like to color my scatterplot so that the points are colored by intervals on the y-axis: <10 = blue, 10-20 = orange, 20-30 = green, and >30 = purple. Currently I have color set to factor(n_ct) and I'm getting a rainbow. I also keep getting a warning that says "Width not defined. Set with `position_dodge(width = ?)`" even though I have position = "dodge" in there, so I don't know whether that is my mistake or RStudio's. I would also like to add a regression line.

library(tidyverse)

performance <- read_csv("performance.csv")
cumulative <- read_csv("cumulative.csv")

performance_2_n_ct <- performance %>% select(Sample, AvgCov)
cumulative_2_n_ct <- cumulative %>% select(Sample, n_ct)

cumulative_3_n_ct <- cumulative_2_n_ct %>%
  filter(!is.na(n_ct))

all_data_n_ct_ <- left_join(cumulative_3_n_ct, performance_2_n_ct, by = "Sample")

n_ct_finale <- ggplot(data = all_data_n_ct_, aes(y=n_ct, x = AvgCov, color = factor(n_ct))) +
 geom_point(stat = "identity", position = "dodge")

Image of what I have is below.

Picture

Phil
m.rodwell

1 Answer

First, group the values into discrete ranges (intervals) instead of treating them as continuous. Converting the column with factor() alone won't do this, because it creates one level, and therefore one color, per distinct value. Then pass the new interval column to the color aesthetic.

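library(dplyr)
library(ggplot2)

# Bin mpg into four labelled intervals; conditions are checked in order
df <- mtcars %>%
  mutate(interval = case_when(
    mpg < 15 ~ 'interval_1',
    mpg >= 15 & mpg < 20 ~ 'interval_2',
    mpg >= 20 & mpg < 30 ~ 'interval_3',
    TRUE ~ 'interval_4'  # everything else (here, mpg >= 30)
  ))

# Map the discrete interval column to the color aesthetic
ggplot(data = df, aes(x = drat,
                      y = hp,
                      color = interval)) +
  geom_point()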
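
For the thresholds in the question, the same binning can probably be written more compactly with base R's cut(). This is only a sketch: the data frame below is made-up stand-in data (the real CSV files are not available here), the column names AvgCov and n_ct are taken from the question, and the treatment of values that fall exactly on a boundary (10 goes into "10-20", 30 into "30+") is an assumption.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the joined data from the question
set.seed(1)
all_data <- data.frame(
  AvgCov = runif(50, min = 0, max = 100),
  n_ct   = runif(50, min = 0, max = 40)
)

# cut() turns a numeric column into a factor with one level per interval
binned <- all_data %>%
  mutate(interval = cut(
    n_ct,
    breaks = c(-Inf, 10, 20, 30, Inf),
    labels = c("<10", "10-20", "20-30", "30+"),
    right  = FALSE  # intervals are [lower, upper), so exactly 10 lands in "10-20"
  ))

# Color by the binned column rather than by factor(n_ct)
p_binned <- ggplot(binned, aes(x = AvgCov, y = n_ct, color = interval)) +
  geom_point()
```

Because cut() returns a factor whose levels follow the order of the breaks, the legend should list the intervals from lowest to highest rather than alphabetically.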

If you have specific colors that you want to use, you can use scale_color_manual(). The values are assigned to the levels of interval in order (alphabetical for a character column):

... +
  scale_color_manual(values = c("red",
                                "cornflowerblue",
                                "green",
                                "purple"))

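To avoid depending on the order of the levels, the colors can also be given as a named vector, so that each interval is tied to its color explicitly. The sketch below reuses the mtcars binning from this answer with the four colors requested in the question. The names interval_colors and p_colors are made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Same binning as in the answer above
df <- mtcars %>%
  mutate(interval = case_when(
    mpg < 15 ~ "interval_1",
    mpg < 20 ~ "interval_2",
    mpg < 30 ~ "interval_3",
    TRUE     ~ "interval_4"
  ))

# Named vector: names must match the values in df$interval
interval_colors <- c(
  interval_1 = "blue",
  interval_2 = "orange",
  interval_3 = "green",
  interval_4 = "purple"
)

p_colors <- ggplot(df, aes(x = drat, y = hp, color = interval)) +
  geom_point() +
  scale_color_manual(values = interval_colors)
```

With a named vector, adding or reordering intervals later should not silently shift which color belongs to which range.
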
cazman
  • Out of curiosity, do you know how I would change the color scheme of the intervals? – m.rodwell Jan 11 '22 at 18:35
  • @m.rodwell see edit for specifying colors manually. – cazman Jan 11 '22 at 18:41
  • @m.rodwell That's not the only way, though. You could use `scale_color_viridis_d()`, for example, to get the viridis color scale. – cazman Jan 11 '22 at 18:45
  • One last note: I'm trying to add a regression line using `stat_smooth(method = "lm")` and I'm getting four lines, one for each interval. Is there a way to get just a single one for the entire graph? – m.rodwell Jan 11 '22 at 19:04
  • @m.rodwell 2 options: The easiest (given what I know about your problem) is to move the color argument out of the global `aes()` and put it into `geom_point()`, so it would be `geom_point(aes(color = interval))`. The other option is to set `inherit.aes = FALSE` inside `stat_smooth()`. If you do this, you will need to define the x and y `aes` in `stat_smooth()`, because it won't inherit them. – cazman Jan 11 '22 at 20:06
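
The first option from the last comment can be sketched as follows, again on the mtcars example rather than the asker's data. Mapping color only inside geom_point() means the smoothing layer sees no grouping and fits one line through all points. The choices formula = y ~ x and se = FALSE are assumptions made to keep the plot quiet and uncluttered, not something the thread specifies.

```r
library(dplyr)
library(ggplot2)

# Same binning as in the answer above
df <- mtcars %>%
  mutate(interval = case_when(
    mpg < 15 ~ "interval_1",
    mpg < 20 ~ "interval_2",
    mpg < 30 ~ "interval_3",
    TRUE     ~ "interval_4"
  ))

# x and y are global; color is local to the point layer only
p_lm <- ggplot(df, aes(x = drat, y = hp)) +
  geom_point(aes(color = interval)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
```

If color stayed in the global aes(), the smoothing layer would inherit it and fit a separate line per interval, which is the four-line result described in the comments.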