1

I want to find out how long the median time from last vaccine till admission to ICU is. I have a dataset looking like this:

ID <- c(1:20)
Group <- c("1. vacc + unvacc", "2. vacc", "3. vacc", "1. vacc + unvacc", "2. vacc", "3. vacc", "1. vacc + unvacc", "2. vacc", "3. vacc",
           "1. vacc + unvacc", "2. vacc", "3. vacc", "1. vacc + unvacc", "2. vacc", "3. vacc", "1. vacc + unvacc", "2. vacc", "3. vacc",
           "1. vacc + unvacc", "2. vacc")
Status <- c("Dead", "Alive", "Dead", "Alive", "Dead", "Alive", "Dead", "Alive", "Dead", "Alive", "Dead", "Alive", 
            "Dead", "Alive", "Dead", "Alive", "Dead", "Alive", "Dead", "Alive")

Months <- c(5, 6, 1, 2, 3, 5, 2, 8, 2, 5, 5, 6, 1, 2, 3, 5, 2, 8, 2, 5)

I have tried:

df %>%
  group_by(Group, Status) %>%
  summarise(ICU_median = median(Months)) %>%
  pivot_wider(
    names_from = Status,
    values_from = ICU_median
  ) %>%
  adorn_totals("col", name = "All") %>% 
  adorn_totals("row", name = "Overall") %>% 
  pivot_longer(
    cols= -Group,
    names_to = "Status", 
    values_to = "ICU median"
  ) 

Which gives the following output:

`summarise()` has grouped output by 'Group'. You can override using the `.groups` argument.
# A tibble: 12 x 3
   Group            Status `ICU median`
   <chr>            <chr>         <dbl>
 1 1. vacc + unvacc Alive           5  
 2 1. vacc + unvacc Dead            2  
 3 1. vacc + unvacc All             7  
 4 2. vacc          Alive           5.5
 5 2. vacc          Dead            3  
 6 2. vacc          All             8.5
 7 3. vacc          Alive           6  
 8 3. vacc          Dead            2  
 9 3. vacc          All             8  
10 Overall          Alive          16.5
11 Overall          Dead            7  
12 Overall          All            23.5

Which is almost what I want, but stil far from - since I want to know what the median months from last vaccine till ICU admission for every subgroup. My problem is, that the "All" group is being added together, probably from the adorn_totals but i can't figure out how add and divide by 2 since I dont want the summed value but the median for all within the 1. vacc + unvacc. Furthermore, would it be possible to insert two more columns with 25. and 75. quantiles for each row/subgroup? (that has not worked for me at all). Desired output would be:

`summarise()` has grouped output by 'Group'. You can override using the `.groups` argument.
# A tibble: 12 x 3
   Group            Status `ICU median`  25. quartile  75. quartile
   <chr>            <chr>         <dbl>         <dbl>         <dbl>
 1 1. vacc + unvacc Alive           5               x             y
 2 1. vacc + unvacc Dead            2               z             d
 3 1. vacc + unvacc All             **3.5**  
 4 2. vacc          Alive           5.5
 5 2. vacc          Dead            3  
 6 2. vacc          All             **4.25**
 7 3. vacc          Alive           6  
 8 3. vacc          Dead            2  
 9 3. vacc          All             **4**  
10 Overall          Alive          Average median for all alive
11 Overall          Dead           Average median for all dead  
12 Overall          All            Average median for all
Nick Meier
  • 33
  • 4
  • I'm confused by the calculation for "All" for a `Group` - wouldn't you want the median for all cases in the `Group` (both alive and dead), instead of taking the median of two medians? – Ben Dec 25 '21 at 15:42
  • Hey @Ben you are completely correct, I must have had my thoughts somewhere else at a bad time. Thank you :-) – Nick Meier Dec 25 '21 at 17:10

1 Answers1

0

Something like this? Please clarify:

library(dplyr)
library(tidyr)
library(janitor)

df %>%
  group_by(Group, Status) %>%
  summarise(ICU_median = median(Months)) %>%
  pivot_wider(
    names_from = Status,
    values_from = ICU_median
  ) %>%
  adorn_totals("col", name = "All") %>% 
  rowwise() %>% 
  mutate(median_x = median(c(Alive, Dead))) %>% 
  pivot_longer(
    cols= -Group,
    names_to = "Status", 
    values_to = "ICU median"
  ) %>% 
  mutate(Group= ifelse(Status == "median_x", "Overall", Group)) %>% 
  arrange(Group) %>% 
  rename(ICU_median = `ICU median`) %>% 
  mutate(ICU_median = ifelse(Status == "All", ICU_median/2, ICU_median)) 
# A tibble: 12 x 3
   Group            Status   ICU_median
   <chr>            <chr>         <dbl>
 1 1. vacc + unvacc Alive          5   
 2 1. vacc + unvacc Dead           2   
 3 1. vacc + unvacc All            3.5 
 4 2. vacc          Alive          5.5 
 5 2. vacc          Dead           3   
 6 2. vacc          All            4.25
 7 3. vacc          Alive          6   
 8 3. vacc          Dead           2   
 9 3. vacc          All            4   
10 Overall          median_x       3.5 
11 Overall          median_x       4.25
12 Overall          median_x       4   
TarJae
  • 72,363
  • 6
  • 19
  • 66
  • Hey again TarJae. I see, that I need to be more clear in what I wished for: I would like the "All" rows to contain the median value for both dead and alive in the ICU_median column, as @Ben also stated in an above comment. Furthermore (since my skills are very limited) would be it possible to add two columns more of percentiles (e.g. 25. and 75.?) – Nick Meier Dec 25 '21 at 17:09
  • have you seen my comment? – Nick Meier Dec 30 '21 at 10:09