1

I did an analysis on my fantasy baseball league, where I had every team play every other team's schedules, to determine their schedule variance and who maybe is luckier than other's based on the schedule. I am plotting the number of wins vs each schedule with a ridgeline plot. I would like to add a vertical line, that only is within a team's specific ridge (impacting no other ridge visually), that shows what their actual number of wins is. I have been able to add a point that shows their actual win total, or a vertical line that goes through the entire visual, or a line at the mean or other quartiles, but not what I am looking for.

dput(BoxPlotData)
structure(list(Team = c("Alex", "Alex", "Alex", "Alex", "Alex", 
"Alex", "Alex", "Alex", "Alex", "Alex", "Alex", "Alex", "Allan", 
"Allan", "Allan", "Allan", "Allan", "Allan", "Allan", "Allan", 
"Allan", "Allan", "Allan", "Allan", "Carter", "Carter", "Carter", 
"Carter", "Carter", "Carter", "Carter", "Carter", "Carter", "Carter", 
"Carter", "Carter", "Derek", "Derek", "Derek", "Derek", "Derek", 
"Derek", "Derek", "Derek", "Derek", "Derek", "Derek", "Derek", 
"Eddie", "Eddie", "Eddie", "Eddie", "Eddie", "Eddie", "Eddie", 
"Eddie", "Eddie", "Eddie", "Eddie", "Eddie", "Hammy", "Hammy", 
"Hammy", "Hammy", "Hammy", "Hammy", "Hammy", "Hammy", "Hammy", 
"Hammy", "Hammy", "Hammy", "Jared", "Jared", "Jared", "Jared", 
"Jared", "Jared", "Jared", "Jared", "Jared", "Jared", "Jared", 
"Jared", "Josh", "Josh", "Josh", "Josh", "Josh", "Josh", "Josh", 
"Josh", "Josh", "Josh", "Josh", "Josh", "Niska", "Niska", "Niska", 
"Niska", "Niska", "Niska", "Niska", "Niska", "Niska", "Niska", 
"Niska", "Niska", "RyanC", "RyanC", "RyanC", "RyanC", "RyanC", 
"RyanC", "RyanC", "RyanC", "RyanC", "RyanC", "RyanC", "RyanC", 
"RyanD", "RyanD", "RyanD", "RyanD", "RyanD", "RyanD", "RyanD", 
"RyanD", "RyanD", "RyanD", "RyanD", "RyanD", "Urby", "Urby", 
"Urby", "Urby", "Urby", "Urby", "Urby", "Urby", "Urby", "Urby", 
"Urby", "Urby"), WinTotal = c(9, 10, 8, 6, 5, 9, 11, 8, 11, 9, 
8, 8, 14, 13, 14, 12, 12, 15, 15, 13, 14, 12, 12, 16, 1, 1, 1, 
1, 2, 2, 4, 3, 0, 2, 5, 0, 8, 9, 7, 6, 6, 8, 11, 5, 10, 7, 4, 
5, 8, 5, 6, 9, 7, 6, 11, 8, 9, 4, 7, 7, 12, 8, 9, 8, 6, 8, 11, 
9, 8, 8, 7, 9, 10, 7, 7, 10, 6, 9, 9, 8, 9, 6, 5, 10, 8, 12, 
7, 11, 7, 10, 14, 9, 13, 9, 9, 12, 11, 14, 14, 11, 11, 11, 14, 
9, 15, 12, 10, 13, 13, 12, 10, 16, 13, 16, 14, 12, 12, 14, 12, 
15, 6, 4, 4, 3, 5, 5, 8, 3, 3, 2, 4, 7, 13, 13, 13, 13, 13, 14, 
16, 13, 14, 14, 11, 13), W.x = c(9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 
13L, 13L, 13L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 15L, 15L, 15L, 
15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 14L, 14L, 14L, 14L, 
14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 
13L, 13L, 13L, 13L), W.y = c(9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 
13L, 13L, 13L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 15L, 15L, 15L, 
15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 15L, 14L, 14L, 14L, 14L, 
14L, 14L, 14L, 14L, 14L, 14L, 14L, 14L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 13L, 
13L, 13L, 13L, 13L)), row.names = c(NA, -144L), class = "data.frame")


ggplot(BoxPlotData, aes(x = WinTotal,y = Team, fill = Team))+
  geom_density_ridges(scale=1)+
  #facet_wrap(~Team)+
  theme_ridges()+
  theme(legend.position = "none")+
  #geom_vline(aes(xintercept=W,linetype = Team), data = ActualWins)
  geom_point(shape=18,size = 4,data = BoxPlotData, aes(x=W,y=Team,fill="black"))

I have commented out a couple attempts to do display it while doing facet_wrap - unhelpful because the ridges stay where they were on the large visual; geom_vline - lines go through the entire set of data.

  • @AllanCameron thanks for the input. It was supposed to all just be BoxPlotData, I must have cut part of it off when pulling it in without realizing. I have included dput(BoxPlotData), which is the full dataset for the entire plot. Thank you! – Clifford Applegarth Aug 23 '22 at 00:28

2 Answers2

1

It is a bit tricky, but what you can do is collect the data behind your first plot without vertical lines using ggplot_build which gives you information about the density per team. The position per line is "W.x" per team and the height of the line can determined to get the max density per team of your geom_density_ridges call. Instead of using geom_vline, you could use geom_segment to create a line per team. Here is a reproducible example:

library(ggplot2)
library(ggridges)
library(dplyr)
# Create dataframe with only one value per team and give each team a number
ActualWins <- BoxPlotData %>% distinct(Team, W.x) %>% mutate(number = row_number())

# Create first the plot to save the data
p <- ggplot(BoxPlotData, aes(x = WinTotal,y = Team, fill = Team))+
  geom_density_ridges(scale=1)+
  theme_ridges()+
  theme(legend.position = "none")+
  geom_point(shape=18,size = 4,data = BoxPlotData, aes(x=W.x,y=Team,fill="black")) 

# Collect data
q <- ggplot_build(p)$data[[1]]
#> Picking joint bandwidth of 0.803

# Select the highest points which are the most wins
density_lines <- q %>%
  group_by(group) %>% 
  filter(density == max(density)) %>% 
  ungroup()

# Join data with Actualwins
density_lines_complete <- left_join(density_lines, ActualWins, by = c("group" = "number"))

# Create Plot with point
ggplot(BoxPlotData, aes(x = WinTotal,y = Team, fill = Team))+
  geom_density_ridges(scale=1)+
  theme_ridges()+
  theme(legend.position = "none")+
  geom_segment(data = density_lines_complete, aes(x = W.x, xend = W.x, y = ymin, yend = ymin+density*scale*iscale), color = "red") +
  geom_point(shape=18,size = 4,data = BoxPlotData, aes(x=W.x,y=Team,fill="black"))
#> Picking joint bandwidth of 0.803


# Create plot without points
ggplot(BoxPlotData, aes(x = WinTotal,y = Team, fill = Team))+
  geom_density_ridges(scale=1)+
  theme_ridges()+
  theme(legend.position = "none")+
  geom_segment(data = density_lines_complete, aes(x = W.x, xend = W.x, y = ymin, yend = ymin+density*scale*iscale), color = "red") 
#> Picking joint bandwidth of 0.803

Created on 2022-08-23 with reprex v2.0.2

Quinten
  • 35,235
  • 5
  • 20
  • 53
1

It is actually possible to do this all with some data manipulation. Pre-calculate the densities and use geom_ridgeline:

BoxPlotData %>%
  group_by(Team) %>%
  summarize(dens = density(WinTotal, from = -2, to = 20, bw = 1, n = 441)$y,
            WinTotal = density(WinTotal, from = -2, to = 20, n = 441)$x,
            is.actual = WinTotal %in% W.x, .groups = "drop") %>%
  mutate(yval = as.numeric(as.factor(Team))) %>%
  ggplot(aes(x = WinTotal, y = Team, fill = Team)) +
  geom_ridgeline(aes(height = dens), scale = 3) +
  theme_ridges() +
  theme(legend.position = "none") +
  geom_segment(data = . %>% filter(is.actual),
               aes(y = yval, yend = yval + 3 * dens, xend = WinTotal))

enter image description here

Allan Cameron
  • 147,086
  • 7
  • 49
  • 87
  • Way easier than mine! Good to know thanks! +1 – Quinten Aug 23 '22 at 13:17
  • So I need to spend some time looking at why, but I noticed the line is showing as non whole numbers. The actual win totals would not be fractions. – Clifford Applegarth Aug 23 '22 at 14:00
  • Hi Clifford. My mistake - I thought you wanted the peaks of the density curves. Are you just wanting points where the actual scores were? This is possible too. – Allan Cameron Aug 23 '22 at 14:04
  • @CliffordApplegarth see my update. Is that what you were looking for? – Allan Cameron Aug 23 '22 at 14:07
  • So where I had that diamond point on the chart, I was hoping for that to be a vertical line instead of a point. I think the other example is showing what I am looking for essentially, but preferably not for the line to exceed the curve (like how yours doesn't). I had tried yours first though after they even mentioned yours was easier. – Clifford Applegarth Aug 23 '22 at 14:21
  • @CliffordApplegarth so like my first example above, except the lines rise to meet the density curves instead of a fixed height? – Allan Cameron Aug 23 '22 at 14:38
  • @CliffordApplegarth I have added another version at the bottom of my answer. Is that what you mean? – Allan Cameron Aug 23 '22 at 14:52
  • So that is very nearly there. I only want 1 vertical line, and it should actually be corresponding to the W column, instead of coming from WinTotal. That's probably confusing verbiage on my end. I had put it into 1 table, but there's really only 1 W value for each team, and multiple WinTotal values. – Clifford Applegarth Aug 23 '22 at 14:56
  • @CliffordApplegarth I have added another edit. Is _that_ it? – Allan Cameron Aug 23 '22 at 15:10
  • Not sure all the rules here, can I ask how the n=441 was determined? I have some different datasets of different sizes that I was considering creating this chart for and so far I can't figure out how you got that. – Clifford Applegarth Aug 23 '22 at 17:12