I have a list of employees' actual capacity (which changes each month) and their scheduled capacity (which is constant every month). I want to use summarize_at to find what percentage they are over (or under) their allocation. However, I can't figure out how to pass my custom function through my summarize call. I tried looking at this answer, but my function is different in that it requires input from multiple columns.

Here's a sample dataset:

library(dplyr)
question <- tibble(name = c("justin", "justin", "corey", "corey"),
                   allocation_1 = c(1, 2, 4, 8),
                   allocation_2 = c(2, 4, 11, 9),
                   scheduled_allocation = c(3, 3, 4, 4))

Here's what I want:

library(dplyr)
answer <- tibble(name = c("justin", "corey"),
                 allocation_1 = c(100, 300),
                 allocation_2 = c(200, 500))

And here's what I've got so far. I know the custom function works; I just can't get it to pipe through. x corresponds to their summed allocation (e.g., for justin for allocation_1, 1 + 2 = 3), and y is their scheduled allocation (e.g., 3, not 6). Thus, (3 / 3) * 100 = 100 percent allocated.

#custom function that works

get_cap_percent <- function (x, y) {
  100*(x/y)
}
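
As a quick sanity check, here is a minimal sketch that calls the function by hand with numbers taken from the sample data above (the summed allocation as x, the scheduled allocation as y). It only illustrates the arithmetic; it is not the grouped solution.

```r
# Same custom function as above, repeated so this snippet runs on its own
get_cap_percent <- function(x, y) {
  100 * (x / y)
}

# justin, allocation_1: actual 1 + 2 = 3, scheduled 3
get_cap_percent(1 + 2, 3)
# → 100

# corey, allocation_2: actual 11 + 9 = 20, scheduled 4
get_cap_percent(11 + 9, 4)
# → 500
```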

#Code that doesn't work
question %>%
  dplyr::group_by(name) %>%
  summarise_at(vars(contains("allocation_")), sum, na.rm = TRUE) %>%
  summarise_at(vars(contains("allocation_")), get_cap_percent, x = ., y = scheduled_allocation)

J.Sabree
  • You should consider using `across()` over `summarise_at` – mhovd Aug 02 '21 at 19:09
  • @mhovd, where should I try it? I tried it in multiple places in the second summarise_at, but I keep getting this error: Error: `across()` must only be used inside dplyr verbs. – J.Sabree Aug 02 '21 at 19:15

1 Answer

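One reason the second summarise_at cannot work is that after the first summarise step only the grouping column and the summed allocation_ columns remain, so scheduled_allocation is no longer available. We can instead do both steps in a single summarise with across(), taking first(scheduled_allocation) because it is constant within each name: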

library(dplyr)

question %>%
  group_by(name) %>%
  # sum each monthly column, then divide by the group's constant scheduled value
  summarise(across(contains("allocation_"),
                   ~ get_cap_percent(sum(.x, na.rm = TRUE),
                                     first(scheduled_allocation))))

-output

# A tibble: 2 x 3
  name   allocation_1 allocation_2
  <chr>         <dbl>        <dbl>
1 corey           300          500
2 justin          100          200
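
To see why the two-step attempt from the question loses the column it needs, and to check the single-summarise result against the expected numbers, a minimal sketch follows. It reuses the sample data and function from the question; `step1` and `result` are names introduced here purely for illustration.

```r
library(dplyr)

question <- tibble(name = c("justin", "justin", "corey", "corey"),
                   allocation_1 = c(1, 2, 4, 8),
                   allocation_2 = c(2, 4, 11, 9),
                   scheduled_allocation = c(3, 3, 4, 4))

get_cap_percent <- function(x, y) {
  100 * (x / y)
}

# First step of the original attempt: only the grouping column and the
# selected allocation_ columns survive, so scheduled_allocation is gone.
step1 <- question %>%
  group_by(name) %>%
  summarise_at(vars(contains("allocation_")), sum, na.rm = TRUE)

names(step1)
# → "name" "allocation_1" "allocation_2"

# Single summarise: scheduled_allocation is still visible inside across()
result <- question %>%
  group_by(name) %>%
  summarise(across(contains("allocation_"),
                   ~ get_cap_percent(sum(.x, na.rm = TRUE),
                                     first(scheduled_allocation))))

# Rows come back sorted by name (corey, then justin)
stopifnot(all.equal(result$allocation_1, c(300, 100)),
          all.equal(result$allocation_2, c(500, 200)))
```

Note that contains("allocation_") does not pick up scheduled_allocation, because that name has no underscore after "allocation".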
akrun