2

The dataset below has columns with very similar names and some values which are NA.

library(tidyverse)

dat <- data.frame(
v1_min = c(1,2,4,1,NA,4,2,2),
v1_max = c(1,NA,5,4,5,4,6,NA),
other_v1_min = c(1,1,NA,3,4,4,3,2),
other_v1_max = c(1,5,5,6,6,4,3,NA),
y1_min = c(3,NA,2,1,2,NA,1,2),
y1_max = c(6,2,5,6,2,5,3,3),
other_y1_min = c(2,3,NA,1,1,1,NA,2),
other_y1_max = c(5,6,4,2,NA,2,NA,NA)
)

head(dat)

In this example, x1 and y1 would be what I would consider the common "categories" among the columns. In order to get something similar with my current dataset, I had to use grepl to tease these out

cats<-dat %>%
  names() %>%
  gsub("^(.*)_(min|max)", "\\1",.) %>%
  gsub("^(.*)_(.*)", "\\2",.) %>%
  unique()

Now, my goal is to mutate a new min and a new max column for each of those categories. So far the code below works just fine.

dat %>%
  rowwise() %>%
  mutate(min_v1 = min(c_across(contains(cats[1])), na.rm=T)) %>%
  mutate(max_v1 = max(c_across(contains(cats[1])), na.rm=T)) %>%
  mutate(min_y1 = min(c_across(contains(cats[2])), na.rm=T)) %>%
  mutate(max_y1 = max(c_across(contains(cats[2])), na.rm=T))

However, the number of categories in my current dataset is quite a bit bigger than 2.. Is there a way to implement this but quicker?

I've tried a few of the suggestions on this post but haven't quite been able to extend them to this problem.

Kim
  • 21
  • 1

1 Answers1

1

You can use one of the map function here for each common categories.

library(dplyr)
library(purrr)

result <- bind_cols(dat, map_dfc(cats,
            ~dat %>%
              rowwise() %>%
              transmute(!!paste('min', .x, sep = '_') := min(c_across(matches(.x)), na.rm = TRUE), 
                        !!paste('max', .x, sep = '_') := max(c_across(matches(.x)), na.rm = TRUE))))

result
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
  • Ahhh ok, yes!! I looked into purrr earlier but didn't know what to look for, lol. Thanks so much! – Kim Feb 17 '21 at 05:37