I have a dataset that includes a bunch of variables with various suffixes that I want to make into prefixes. The dataset also includes some variables without any suffixes. Something like:

df <- data.frame(
  home_loc   = rnorm(5),
  work_loc   = rnorm(5),
  x1         = rnorm(5),
  walk_act   = rnorm(5),
  bike_act   = rnorm(5),
  x2         = rnorm(5),
  happy_yest = rnorm(5),
  sad_yest   = rnorm(5)
)

I was able to come up with the following solution:

suff_to_pre <- function(x, suffix, prefix) {
  # Anchor the suffix so it only matches at the end of a name
  pat <- paste0(suffix, "$")
  for (i in seq_along(names(x))) {
    if (grepl(pat, names(x)[i])) {
      names(x)[i] <- sub(pat, "", names(x)[i])
      names(x)[i] <- paste0(prefix, names(x)[i])
    }
  }
  names(x)
}

names(df) <- suff_to_pre(df, suffix = "_loc", prefix = "loc_")
names(df) <- suff_to_pre(df, suffix = "_act", prefix = "act_")
names(df) <- suff_to_pre(df, suffix = "_yest", prefix = "yest_")

names(df)
[1] "loc_home" "loc_work" "x1" "act_walk" "act_bike" "x2" "yest_happy"
[8] "yest_sad"

But I'm not very satisfied with it. Specifically, I would really like a way to get the same result using dplyr. I found this and this, which got me to:

library(dplyr)

a <- df %>%
  select(ends_with("_loc")) %>%
  setNames(sub("_loc", "", names(.))) %>%
  setNames(paste0("loc_", names(.)))

b <- df %>%
  select(ends_with("_act")) %>%
  setNames(sub("_act", "", names(.))) %>%
  setNames(paste0("act_", names(.)))

c <- df %>%
  select(ends_with("_yest")) %>%
  setNames(sub("_yest", "", names(.))) %>%
  setNames(paste0("yest_", names(.)))

df <- cbind(
  select(df, x1, x2), a, b, c
)

This is obviously not ideal. I was hoping someone could suggest a more elegant solution using dplyr.

Edit
@docendo discimus and @zx8754 gave really helpful answers, but I should have been more explicit. I also have variables whose names include underscores but do not end in a suffix that I want to change into a prefix.

For example (see free_time):

df <- data.frame(
      home_loc   = rnorm(5),
      work_loc   = rnorm(5),
      x_1        = rnorm(5),
      walk_act   = rnorm(5),
      bike_act   = rnorm(5),
      x_2        = rnorm(5),
      happy_yest = rnorm(5),
      sad_yest   = rnorm(5),
      free_time  = rnorm(5)
)
Brad Cannell

2 Answers

A single sub call should be enough:

sub("^(.*)_(.*)$", "\\2_\\1", names(df))
#[1] "loc_home"   "loc_work"   "x1"         "act_walk"   "act_bike"   "x2"         "yest_happy" "yest_sad" 

And of course to change the names, assign it back:

names(df) <- sub("^(.*)_(.*)$", "\\2_\\1", names(df))

And in a dplyr-pipe you could use setNames:

df %>% setNames(sub("^(.*)_(.*)$", "\\2_\\1", names(.)))

The pattern "^(.*)_(.*)$" creates two capturing groups, one before the underscore and one after it. In the replacement "\\2_\\1" we tell R to emit the second group first, then an underscore, and finally the first group, which turns suffixes into prefixes. If a name contains no underscore, the pattern does not match and the name is left unchanged.
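To see the two capture groups at work, here is a minimal sketch on a made-up character vector (the names in `x` are illustrative and not part of the question's data):

```r
# Illustrative names only; `x` is a hypothetical vector, not the question's df
x <- c("home_loc", "x1", "a_b_c")

# Swap the text before the underscore with the text after it
res <- sub("^(.*)_(.*)$", "\\2_\\1", x)
print(res)
```

Because the first `(.*)` is greedy, a name with several underscores is split at its last underscore, so "a_b_c" becomes "c_a_b", while "x1" has no underscore and passes through untouched.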

Update after Question update:

For the slightly more complicated case, you can do the following:

1) store all suffixes that need to be changed to prefixes:

suf <- c("act", "loc", "yest")

2) create a regular expression pattern based on the suffixes:

pat <- paste0("^(.*)_(", paste(suf, collapse = "|"), ")$")
pat
#[1] "^(.*)_(act|loc|yest)$"

3) proceed as before:

sub(pat, "\\2_\\1", names(df))
# [1] "loc_home"   "loc_work"   "x_1"        "act_walk"   "act_bike"   "x_2"        "yest_happy" "yest_sad"   "free_time" 

or

df %>% setNames(sub(pat, "\\2_\\1", names(.)))
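Since the question asks for a dplyr approach, the same pattern could also be passed to `rename_with()`. This is only a sketch: it assumes dplyr 1.0.0 or later (where `rename_with()` is available), and the small data frame below is a stand-in with arbitrary values, not the question's data.

```r
library(dplyr)

# Stand-in data with arbitrary values; only the column names matter here
df_demo <- data.frame(home_loc = 1, x_1 = 2, walk_act = 3,
                      sad_yest = 4, free_time = 5)

# Suffixes to move, and the pattern built from them as in the steps above
suf <- c("act", "loc", "yest")
pat <- paste0("^(.*)_(", paste(suf, collapse = "|"), ")$")

# rename_with() applies a function to the column names and returns the data frame
out <- df_demo %>% rename_with(~ sub(pat, "\\2_\\1", .x))
new_names <- names(out)
print(new_names)
```

Columns such as x_1 and free_time keep their names because their endings are not in `suf`.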
talat
  • Fantastic answer, but I should have been more explicit. I also have variables that include underscores, but are not suffixes that I want to change into prefixes (e.g., free_time). – Brad Cannell Jul 13 '16 at 07:01
  • This answer is very clear and helpful. Thank you. Clearly I need to learn regular expressions. – Brad Cannell Jul 13 '16 at 07:59

We can use str_replace from stringr. The idea is to capture the two parts of each name as groups, i.e. within (...). The first capture group ([^_]*) matches zero or more characters that are not _, followed by a literal _ and then a second capture group ([^_]*). In the replacement we just switch the backreferences. Names with no underscore, or with more than one, do not match and are left unchanged.

 library(stringr)
 names(df) <- str_replace(names(df), "^([^_]*)_([^_]*)$", "\\2_\\1")
 names(df)
 #[1] "loc_home"   "loc_work"   "x1"         "act_walk" 
 #[5] "act_bike"   "x2"         "yest_happy" "yest_sad"  

If we need to use this with pipes:

library(magrittr)
df %<>%
    setNames(str_replace(names(.), "^([^_]*)_([^_]*)$", "\\2_\\1"))

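Or without using any regex:

# fixed = TRUE splits on a literal "_"; lapply() keeps a list even when
# every name splits into the same number of pieces
sapply(lapply(strsplit(names(df), "_", fixed = TRUE), rev), paste, collapse = "_")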
akrun
  • Many variable names have multiple underscores. But the first underscore always signifies the prefix. How to reverse prefixes to suffixes (and vice versa) based upon reversing them on the first and only the first underscore? – Brad Jun 22 '20 at 05:28
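One possible way to handle the multiple-underscore case raised in this comment is to make the first group unable to cross an underscore, so the split happens at the first underscore only. This is a sketch on hypothetical names, not a tested answer from the thread:

```r
# Hypothetical names with several underscores
x <- c("loc_home_addr", "act_walk_to_work", "x1")

# [^_]* cannot cross an underscore, so the split is at the FIRST underscore:
# the leading prefix moves to the end
pre_to_suf <- sub("^([^_]*)_(.*)$", "\\2_\\1", x)
print(pre_to_suf)

# To undo it, split at the LAST underscore instead:
# the trailing suffix moves back to the front
suf_to_pre <- sub("^(.*)_([^_]*)$", "\\2_\\1", pre_to_suf)
print(suf_to_pre)
```

The first call turns "loc_home_addr" into "home_addr_loc", and the second call restores the original names.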