
I'm trying to use lapply on a list of data frames, but I'm failing to pass the parameters correctly (I think).

List of data frames:

df1 <- data.frame(A = 1:10, B= 11:20)
df2 <- data.frame(A = 21:30, B = 31:40) 

listDF <- list(df1, df2)    # in practice, many more data frames, each with far fewer columns than length(todos)

Vector with columns names:

todos <-c('col1','col2', ......'colN')

I'd like to change the column names using lapply:

lapply (listDF, function(x) { colnames(x)[2:length(x)] <-todos[1:length(x)-1] }  )

But this doesn't change the names at all. Am I passing something other than the data frames themselves? I just want to change the names in place, not return the result as a new object.

Thanks in advance, p.

user3310782
  • would not work because of *R's calling by value* – jogo Nov 06 '15 at 12:46
  • Just add an `x` to the end: `lapply (listDF, function(x) { colnames(x)[2:length(x)] <-todos[1:length(x)-1];x } )`. Your function as written has no return value. – Pierre L Nov 06 '15 at 12:53
  • Not related to the question, but I guess that `1:length(x)-1` is a common error (sometimes not harmful). The right line is `1:(length(x)-1)` (beware precedence!) – nicola Nov 06 '15 at 12:55
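
The two comments above can be checked directly. This is only a minimal sketch with a made-up data frame `df` and two illustrative functions (`no_return` and `with_return` are hypothetical names, not from the question). It shows what lapply gets back with and without the trailing `x`, and that the original data frame is untouched either way.

```r
# Made-up example data
df <- data.frame(A = 1:3, B = 4:6)

# As in the question: the last expression is an assignment, so the function
# returns the assigned value (the right-hand side), not the data frame
no_return <- function(x) { colnames(x)[2] <- "C" }
res1 <- lapply(list(df), no_return)
res1[[1]]
## [1] "C"

# With the trailing x, the modified copy is returned
with_return <- function(x) { colnames(x)[2] <- "C"; x }
res2 <- lapply(list(df), with_return)
names(res2[[1]])
## [1] "A" "C"

# The original is unchanged in both cases
names(df)
## [1] "A" "B"
```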

3 Answers


You can also use setNames if you want to replace all the column names:

df1 <- data.frame(A = 1:10, B= 11:20)
df2 <- data.frame(A = 21:30, B = 31:40) 

listDF <- list(df1, df2)
new_col_name <- c("C", "D")

lapply(listDF, setNames, nm = new_col_name)
## [[1]]
##     C  D
## 1   1 11
## 2   2 12
## 3   3 13
## 4   4 14
## 5   5 15
## 6   6 16
## 7   7 17
## 8   8 18
## 9   9 19
## 10 10 20

## [[2]]
##     C  D
## 1  21 31
## 2  22 32
## 3  23 33
## 4  24 34
## 5  25 35
## 6  26 36
## 7  27 37
## 8  28 38
## 9  29 39
## 10 30 40

If you need to replace only a subset of the column names, you can use @jogo's solution:

lapply(listDF, function(df) {
  names(df)[-1] <- new_col_name[-ncol(df)]
  df
})
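
One caveat: when the vector of new names is longer than the number of columns (as with todos in the question), negative indexing drops only one element, so the replacement has the wrong length and R warns about it. A minimal sketch of one way around that, assuming a made-up `todos` vector and a hypothetical helper `rename_tail` (not part of base R), takes exactly as many names as there are columns to rename:

```r
# Made-up example data; todos is deliberately longer than any data frame is wide
df1 <- data.frame(A = 1:3, B = 4:6)
df2 <- data.frame(A = 1:3, B = 4:6, C = 7:9)
listDF <- list(df1, df2)
todos <- c("col1", "col2", "col3", "col4")

# rename_tail is an illustrative helper, not a base R function
rename_tail <- function(df, new_names) {
  n <- ncol(df) - 1                     # number of columns after the first
  names(df)[-1] <- head(new_names, n)   # take exactly n replacement names
  df                                    # return the modified copy
}

renamed <- lapply(listDF, rename_tail, new_names = todos)
names(renamed[[1]])
## [1] "A"    "col1"
names(renamed[[2]])
## [1] "A"    "col1" "col2"
```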

A last point: in R, a:b - 1 and a:(b - 1) are different, because the : operator binds more tightly than binary minus, so a:b - 1 is evaluated as (a:b) - 1.

1:10 - 1
## [1] 0 1 2 3 4 5 6 7 8 9

1:(10 - 1)
## [1] 1 2 3 4 5 6 7 8 9

EDIT

If you want to change the column names of the data frames in the global environment from a list, you can use list2env, but I'm not sure it is the best way to achieve what you want. You also need to make the list a named list, where each name matches the name of the data frame it should replace.

listDF <- list(df1 = df1, df2 = df2)

new_col_name <- c("C", "D")

listDF <- lapply(listDF, function(df) {
  names(df)[-1] <- new_col_name[-ncol(df)]
  df
})

list2env(listDF, envir = .GlobalEnv)
str(df1)
## 'data.frame':    10 obs. of  2 variables:
##  $ A: int  1 2 3 4 5 6 7 8 9 10
##  $ C: int  11 12 13 14 15 16 17 18 19 20
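
If there are many data frames, typing out the named list by hand gets tedious. A possible variation, sketched here on the assumption that the data frames live in the global environment and that you know their names as strings, is to build the named list with mget:

```r
# Example data frames, as in the answer above
df1 <- data.frame(A = 1:10, B = 11:20)
df2 <- data.frame(A = 21:30, B = 31:40)
new_col_name <- c("C", "D")

# mget() returns a list whose element names are the object names it looked up
listDF <- mget(c("df1", "df2"), envir = .GlobalEnv)

listDF <- lapply(listDF, function(df) {
  names(df)[-1] <- new_col_name[-ncol(df)]
  df
})

# Overwrite df1 and df2 in the global environment with the renamed copies
invisible(list2env(listDF, envir = .GlobalEnv))
names(df1)
## [1] "A" "C"
```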
dickoa
  • Check the output vs the OP's code. The line `colnames(x)[2:length(x)] ` indicates that the replacement begins at the second column. – Pierre L Nov 06 '15 at 12:51
  • @PierreLafortune Thanks Pierre, you are right. I made some adjustments. – dickoa Nov 06 '15 at 13:00
  • Sorry dickoa but lapply only changes the column names *inside* the list (so df1 and df2 still have the original col. names !!). I've also tried adding the 'x' from Pierre but still that doesn't do the trick. I only use the list to hold a long list of DFs, not that I want changes inside the list itself. Any ideas? thanks – user3310782 Nov 06 '15 at 13:04
  • Then you have to use a for loop, without an additional function to call. – jogo Nov 06 '15 at 13:34
  • @user3310782 Look at the updated answer to see if it does what you want. However, I think that a for loop will probably be easier in this case. – dickoa Nov 06 '15 at 13:35
  • @jogo What do you mean by that ? can you elaborate please – dickoa Nov 06 '15 at 13:36
  • `for (i in 1:length(listDF)) names(listDF[[i]])[-1] <- todos[-length(listDF[[i]])]` – jogo Nov 06 '15 at 13:45
  • It seems the edit by dickoa does the trick!! I didn't get why the data frames stayed untouched after applying it. The key is in `listDF <- lapply(listDF, function(df) { names(df)[-1] <- new_col_name[-ncol(df)]; df })`, which passes the name of the frame, and in naming the data frames in the list. Thanks all. – user3310782 Nov 06 '15 at 13:48
  • Thanks jogo, your FOR ..loop also seems a good solution, I just thought of lapply first. – user3310782 Nov 06 '15 at 14:29

Try this:

lapply (listDF, function(x) { 
  names(x)[-1] <- todos[-length(x)]
  x 
})

You will get a new list containing the changed data frames. If you want to modify listDF itself:

for (i in 1:length(listDF)) names(listDF[[i]])[-1] <- todos[-length(listDF[[i]])]
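
To make the difference between the two variants concrete, here is a small sketch with a made-up one-element list and the placeholder name "C". Neither variant touches the standalone df1; the for loop only changes the copy stored inside listDF.

```r
# Made-up example data
df1 <- data.frame(A = 1:3, B = 4:6)
listDF <- list(df1)

# lapply returns a new list; listDF and df1 keep their original names
renamed <- lapply(listDF, function(x) { names(x)[-1] <- "C"; x })
names(renamed[[1]])
## [1] "A" "C"
names(listDF[[1]])
## [1] "A" "B"

# The for loop modifies the elements of listDF in place ...
for (i in seq_along(listDF)) names(listDF[[i]])[-1] <- "C"
names(listDF[[1]])
## [1] "A" "C"

# ... but df1 itself is a separate object and is still unchanged
names(df1)
## [1] "A" "B"
```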
jogo
  • Thanks jogo but why doesn't it change the data frame column names ? It only changes the column names inside the list, not in the independent DF. – user3310782 Nov 06 '15 at 13:12
  • but you can do `listDF <- lapply(...)` – jogo Nov 06 '15 at 13:22
  • Sure, but if you wish to change the DF column names, is then lapply NOT the way to go? – user3310782 Nov 06 '15 at 13:26
  • Yes, but a function can never change the values it receives in the call; it can only return an object. lapply() calls a function for each element of the list. – jogo Nov 06 '15 at 13:28

I was not able to get the code in these answers to work, but I found some code on another forum that did. It assigns the new column names to each data frame itself, whereas the other methods create copies of the data frames. Here is the code for anyone else who needs it.

# Create some dataframes
df1 <- data.frame(A = 1:10, B= 11:20)
df2 <- data.frame(A = 21:30, B = 31:40)

listDF <- c("df1", "df2") #Notice this is NOT a list
new_col_name <- c("C", "D") #What do you want the new columns to be named?

# Assign the new column names to each dataframe in "listDF"
for(df in listDF) {
  df.tmp <- get(df)
  names(df.tmp) <- new_col_name
  assign(df, df.tmp)
}
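
Note that the loop above relies on running at the top level: get and assign default to the current environment, so inside a function assign would create a local variable instead. A rough sketch of a function version, assuming a hypothetical helper `rename_by_name` and a scratch environment `e` used only for the demonstration, passes the environment explicitly:

```r
# rename_by_name is an illustrative helper, not a base R function
rename_by_name <- function(df_names, new_names, envir = .GlobalEnv) {
  for (nm in df_names) {
    tmp <- get(nm, envir = envir)     # fetch the data frame by name
    names(tmp) <- new_names           # rename the local copy
    assign(nm, tmp, envir = envir)    # write it back under the same name
  }
  invisible(NULL)
}

# Demonstration in a scratch environment so nothing else is overwritten
e <- new.env()
e$df1 <- data.frame(A = 1:10, B = 11:20)
e$df2 <- data.frame(A = 21:30, B = 31:40)

rename_by_name(c("df1", "df2"), c("C", "D"), envir = e)
names(e$df1)
## [1] "C" "D"
names(e$df2)
## [1] "C" "D"
```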
Patrick