18

I have a data frame with a number of columns named in the form var1.mean, var2.mean. I would like to strip the suffix ".mean" from every column name that has it. I tried using rename_all together with a regex in a pipe but could not come up with the correct syntax. Any suggestions?

linda

8 Answers

29

If you want to use the dplyr package, I'd recommend using the rename_at function.

Dframe <- data.frame(var1.mean = rnorm(10),
                     var2.mean = rnorm(10),
                     var1.sd = runif(10))

library(dplyr)

# funs() is deprecated since dplyr 0.8.0; a formula lambda does the same job
Dframe %>% 
  rename_at(.vars = vars(ends_with(".mean")),
            .funs = ~ sub("[.]mean$", "", .))
Benjamin
28

Using new dplyr (rename_with), together with str_remove from stringr:

df %>% rename_with(~ str_remove(., "\\.mean$"))
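To restrict the renaming to the affected columns, rename_with also takes a .cols argument. The following is only a sketch under assumptions: the data frame df and its column names are made up for illustration, and base sub() stands in for str_remove so that dplyr is the only package needed.

```r
library(dplyr)

# Hypothetical example data: two ".mean" columns and one ".sd" column.
df <- data.frame(var1.mean = 1:3, var2.mean = 4:6, var1.sd = 7:9)

# Rename only the columns ending in ".mean"; the escaped, anchored pattern
# removes just that trailing suffix.
renamed <- df %>%
  rename_with(~ sub("\\.mean$", "", .x), .cols = ends_with(".mean"))

names(renamed)
```

Columns that do not end in ".mean" (here var1.sd) are passed through unchanged.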
Reza
10

We can use rename_all

df1 %>%
   rename_all(.funs = funs(sub("\\..*", "", names(df1)))) %>%
   head(2)
#        var1        var2       var3       var1       var2       var3
#1 -0.5458808 -0.09411013  0.5266526 -1.3546636 0.08314367  0.5916817
#2  0.5365853 -0.08554095 -1.0736261 -0.9608088 2.78494703 -0.2883407

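NOTE: This pattern removes everything from the first dot onward, so both .mean and .sd are stripped and the column names end up duplicated (as in the output above). Duplicated names need to be made unique with make.unique.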
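As a small illustration of that last step, here is what make.unique does to such names. The vector below is typed in by hand to mirror the duplicated names in the output above; it is not taken from df1 itself.

```r
# Names as they look after stripping both ".mean" and ".sd".
stripped <- c("var1", "var2", "var3", "var1", "var2", "var3")

# make.unique() appends a numeric suffix to the second and later duplicates.
unique_names <- make.unique(stripped)

print(unique_names)
# "var1" "var2" "var3" "var1.1" "var2.1" "var3.1"
```

The result can then be assigned back with names(df1) <- unique_names.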

data

set.seed(24)
df1 <- as.data.frame(matrix(rnorm(25*6), 25, 6, dimnames = list(NULL,
             paste0(paste0("var", 1:3), rep(c(".mean", ".sd"), each = 3)))))
akrun
9

You may use gsub.

# escape the dot and anchor with $ so only a trailing ".mean" is removed
colnames(df) <- gsub('\\.mean$', '', colnames(df))
M--
SigneMaten
1

The following works for me:

dat <- data.frame(var1.mean = 1, var2.mean = 2)
col_old <- colnames(dat)
col_new <- gsub(pattern = "\\.mean$", replacement = "", x = col_old)
colnames(dat) <- col_new
shaojl7
1

You can replace these names using the stri_replace_last_regex function from the stringi package, like this:

require(stringi)
df <- data.frame(1,2,3,4,5,6)
names(df) <- stri_paste("var",1:6,c(".mean",".sd"))
df
##  var1.mean var2.sd var3.mean var4.sd var5.mean var6.sd
##1         1       2         3       4         5       6
names(df) <- stri_replace_last_regex(names(df),"\\.mean$","")
df
##  var1 var2.sd var3 var4.sd var5 var6.sd
##1    1       2    3       4    5       6

The regex is \\.mean$ because you need to escape the dot character (it has a special meaning in regex). The $ sign at the end ensures that only names that END with this pattern are changed (if the .mean text is in the middle of the string, it won't be replaced).
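To make the effect of the escape and the anchor concrete, here is a small base R comparison of the three pattern variants. The column names are hypothetical and chosen only to expose the differences.

```r
# Hypothetical column names chosen to expose the differences.
nms <- c("var1.mean", "demeaned", "var.mean.sd")

# Unescaped dot: "." matches any character, so "emean" inside "demeaned" is hit.
loose <- sub(".mean", "", nms)

# Escaped dot: only a literal ".mean" matches, but anywhere in the name.
escaped <- sub("\\.mean", "", nms)

# Escaped and anchored: only a trailing ".mean" is removed.
anchored <- sub("\\.mean$", "", nms)

print(loose)     # "var1" "ded" "var.sd"
print(escaped)   # "var1" "demeaned" "var.sd"
print(anchored)  # "var1" "demeaned" "var.mean.sd"
```

Only the anchored version leaves both "demeaned" and "var.mean.sd" untouched.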

bartektartanus
0

I would use strsplit:

x <- as.data.frame(matrix(runif(16), ncol = 4))
colnames(x) <- c("var1.mean", "var2.mean", "var3.mean", "something.else")

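# strsplit() returns a list, so flatten it; fixed = TRUE treats "." literally
colnames(x) <- unlist(strsplit(colnames(x), split = ".mean", fixed = TRUE))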
colnames(x)
Rieneke
0

Lots of quick answers have been given; the most intuitive, to me, would be:

Dframe <- data.frame(var1.mean = rnorm(10),        #Create Example
                     var2.mean = rnorm(10),
                     var1.sd = runif(10))
names(Dframe) <- gsub("[.]mean","",names(Dframe))  #remove ".mean"
Jan Felix