I have a data frame with a number of columns named in the form var1.mean, var2.mean. I would like to strip the suffix ".mean" from all columns that contain it. I tried using rename_all in conjunction with a regex in a pipe but could not come up with the correct syntax. Any suggestions?
8 Answers
29
If you want to use the dplyr package, I'd recommend using the rename_at function.
Dframe <- data.frame(var1.mean = rnorm(10),
                     var2.mean = rnorm(10),
                     var1.sd = runif(10))
library(dplyr)
Dframe %>%
  rename_at(.vars = vars(ends_with(".mean")),
            .funs = funs(sub("[.]mean$", "", .)))

Benjamin
- Inside rename_at(), why do you include .vars and .funs? – axme100 Dec 12 '18 at 17:00
- Those are argument names to `rename_at`. – Benjamin Dec 12 '18 at 18:04
- `funs` and `rename_at` have been deprecated/superseded. You should now use `rename_with(~ gsub("[.]mean$", "", .x))` – Brian D Aug 06 '21 at 15:14
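To illustrate the `rename_with` suggestion from the comment above, here is a minimal sketch, assuming dplyr 1.0.0 or later is installed. `Dframe` mirrors the example data from this answer; passing `ends_with(".mean")` to `.cols` is one possible way to restrict the renaming to the affected columns, not the only one.

```r
library(dplyr)

Dframe <- data.frame(var1.mean = rnorm(10),
                     var2.mean = rnorm(10),
                     var1.sd   = runif(10))

# Rename only the columns that end in ".mean".
# "[.]" matches a literal dot and "$" anchors the match to the end of the name.
renamed <- Dframe %>%
  rename_with(~ sub("[.]mean$", "", .x), .cols = ends_with(".mean"))

names(renamed)
```

The `var1.sd` column is left untouched because it is not selected by `.cols`.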
28
Using newer dplyr (str_remove comes from the stringr package):
df %>% rename_with(~str_remove(., '.mean'))

Reza
- This also needs `$` at the end (e.g. `".mean$"`) to specify that the pattern must match at the end of the column name (i.e. as a suffix). – Dr Bala Soundararaj May 22 '23 at 18:00
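The point raised in the comment can be checked with a small sketch. `str_remove` and base `sub` both remove the first regex match, so for these simple patterns base R alone should be enough to show the effect. The column names below are hypothetical and were chosen only to expose the two pitfalls (an unescaped dot and a missing `$` anchor).

```r
# Hypothetical column names chosen to expose both pitfalls.
nms <- c("var1.mean", "xmean.sd", "a.mean.b")

# Unescaped and unanchored: "." matches any character, anywhere in the name.
unanchored <- sub(".mean", "", nms)
unanchored
# "var1" ".sd" "a.b"

# Escaped and anchored: only a literal ".mean" suffix is removed.
anchored <- sub("\\.mean$", "", nms)
anchored
# "var1" "xmean.sd" "a.mean.b"
```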
10
We can use rename_all:
df1 %>%
  rename_all(.funs = funs(sub("\\..*", "", .))) %>%
  head(2)
# var1 var2 var3 var1 var2 var3
#1 -0.5458808 -0.09411013 0.5266526 -1.3546636 0.08314367 0.5916817
#2 0.5365853 -0.08554095 -1.0736261 -0.9608088 2.78494703 -0.2883407
NOTE: If the resulting column names are duplicated, they need to be made unique with make.unique.
data
set.seed(24)
df1 <- as.data.frame(matrix(rnorm(25*6), 25, 6, dimnames = list(NULL,
paste0(paste0("var", 1:3), rep(c(".mean", ".sd"), each = 3)))))
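The note about duplicated names can be sketched in base R as follows. This uses the same `df1` as above and is only one way to apply `make.unique`; the intermediate variable `stripped` is introduced here for illustration.

```r
set.seed(24)
df1 <- as.data.frame(matrix(rnorm(25 * 6), 25, 6,
  dimnames = list(NULL, paste0(paste0("var", 1:3),
                               rep(c(".mean", ".sd"), each = 3)))))

# Stripping everything from the first dot onward leaves var1, var2, var3 twice.
stripped <- sub("\\..*", "", names(df1))

# make.unique() appends .1, .2, ... to repeated names.
names(df1) <- make.unique(stripped)
names(df1)
# "var1" "var2" "var3" "var1.1" "var2.1" "var3.1"
```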
9
You may use gsub. Escape the dot and anchor the pattern with $ so that only a literal ".mean" suffix is removed:
colnames(df) <- gsub('[.]mean$', '', colnames(df))

M--

SigneMaten
1
The following works for me:
dat <- data.frame(var1.mean = 1, var2.mean = 2)
col_old <- colnames(dat)
col_new <- gsub(pattern = ".mean", replacement = "", x = col_old)
colnames(dat) <- col_new

shaojl7
- Using `pattern = "[.]mean$"` will ensure you only change variable names that end in `.mean`. – Benjamin Aug 30 '17 at 12:30
1
You can replace these names using the stri_replace_last_regex function from the stringi package, like this:
require(stringi)
df <- data.frame(1,2,3,4,5,6)
names(df) <- stri_paste("var",1:6,c(".mean",".sd"))
df
## var1.mean var2.sd var3.mean var4.sd var5.mean var6.sd
##1 1 2 3 4 5 6
names(df) <- stri_replace_last_regex(names(df),"\\.mean$","")
df
## var1 var2.sd var3 var4.sd var5 var6.sd
##1 1 2 3 4 5 6
The regex is \\.mean$ because you need to escape the dot character (it has a special meaning in regex), and the $ sign at the end ensures that you replace only names that END with this pattern (if the .mean text is in the middle of the string, it won't be replaced).
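If escaping is a concern, a regex-free alternative is possible in base R. The sketch below assumes R 3.3.0 or later (for `endsWith`), and `strip_suffix` is a hypothetical helper name introduced here, not part of stringi or base R.

```r
# Hypothetical helper: drop a literal suffix without using a regex.
strip_suffix <- function(x, suffix) {
  hit <- endsWith(x, suffix)  # TRUE where the name ends with the suffix
  x[hit] <- substr(x[hit], 1, nchar(x[hit]) - nchar(suffix))
  x
}

df <- data.frame(1, 2, 3, 4, 5, 6)
names(df) <- paste0("var", 1:6, c(".mean", ".sd"))
names(df) <- strip_suffix(names(df), ".mean")
names(df)
# "var1" "var2.sd" "var3" "var4.sd" "var5" "var6.sd"
```

Because the suffix is compared as a plain string, the dot needs no escaping and a ".mean" in the middle of a name is never touched.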

bartektartanus
0
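I would use strsplit:
x <- as.data.frame(matrix(runif(16), ncol = 4))
colnames(x) <- c("var1.mean", "var2.mean", "var3.mean", "something.else")
# strsplit returns a list; keep the first piece of each name, matching ".mean" literally
colnames(x) <- sapply(strsplit(colnames(x), split = ".mean", fixed = TRUE), `[`, 1)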
colnames(x)

Rieneke
0
Lots of quick answers have been given; the most intuitive to me would be:
Dframe <- data.frame(var1.mean = rnorm(10), # create example data
                     var2.mean = rnorm(10),
                     var1.sd = runif(10))
names(Dframe) <- gsub("[.]mean$", "", names(Dframe)) # remove the ".mean" suffix

Jan Felix