I need to remove the last column from each of 10 data frames, so I decided to use lapply(). I wrote a function to remove the column:

remove_col <- function(mydata){
  mydata = subset(mydata, select=-c(24))
}

Then I created mylist <- list(data1, data2.... data10) and called lapply() as

lapply(mylist, FUN = remove_col)

It did give me a list of data frames with the column removed. However, when I checked the original data frames, the last column was still there. How should I change the code so that the original data frames are modified?

wawalmx

3 Answers

lapply() returns a new list and leaves its input untouched, so you need to assign the result of the call back to the input list:

mylist <- lapply(mylist, FUN = remove_col)

Had you defined your function with an explicit return value, this might have been more obvious:

remove_col <- function(mydata) {
    mydata <- subset(mydata, select=-c(24))
    return(mydata)   # return the modified data frame
}
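
To make the point above concrete, here is a minimal sketch showing that lapply() does not touch its input until the result is assigned. The data frame df and the helper remove_last are hypothetical names made up for this illustration, not part of the question.

```r
# Hypothetical three-column data frame used only for this illustration
df <- data.frame(a = 1:3, b = 4:6, c = 7:9)
mylist <- list(df)

# Hypothetical helper: drop the last column, always returning a data frame
remove_last <- function(mydata) {
  mydata[-ncol(mydata)]
}

# Without assignment the result is a separate list; mylist is unchanged
result <- lapply(mylist, remove_last)
cols_before <- ncol(mylist[[1]])   # list element still has all columns
cols_result <- ncol(result[[1]])   # the returned copy has one fewer

# Assigning back updates the list ...
mylist <- lapply(mylist, remove_last)
cols_after <- ncol(mylist[[1]])

# ... but not the standalone variable df, because the list holds a copy
cols_df <- ncol(df)
```

Note that even after the assignment, df itself keeps all its columns; only the copy stored in mylist changes.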
Tim Biegeleisen

Instead of hardcoding the column number, you can use ncol() to remove the last column from each data frame.

remove_col <- function(mydata){
  # drop = FALSE keeps the result a data frame even if only one column remains
  mydata[, -ncol(mydata), drop = FALSE]
}
mylist <- lapply(mylist, remove_col)

To propagate the changes to the original data frames, name the list elements after them and use list2env(), which assigns each element to a variable of that name in the given environment.

names(mylist) <- paste0('data', seq_along(mylist))
list2env(mylist, .GlobalEnv)
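
One caveat with list2env() is that the list names must match the original variable names exactly. A minimal sketch, assuming the data frames follow the data1, data2, ... naming from the question (the three small data frames below are made up for illustration), builds the list with mget() so the names come along automatically:

```r
# Hypothetical stand-ins for data1 ... data10
data1 <- data.frame(a = 1:3, b = 4:6, c = 7:9)
data2 <- data.frame(a = 3:1, b = 6:4, c = 9:7)
data3 <- data.frame(x = 1:2, y = 3:4)

# mget() looks the variables up by name and returns a named list
mylist <- mget(paste0("data", 1:3))

# Drop the last column of every data frame in the list
mylist <- lapply(mylist, function(df) df[-ncol(df)])

# Write each element back to the variable it came from;
# invisible() suppresses printing of the returned environment
invisible(list2env(mylist, envir = environment()))
```

After this runs, data1, data2 and data3 themselves have lost their last column. Overwriting variables this way is a side effect, so keeping the data frames in the list and working on the list is often the simpler design.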
Ronak Shah

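Using base R and lapply(). Note that you can remove ", drop = F" from the call if every data frame in the list has more than 2 columns; it only matters when removing a column leaves a single one, which R would otherwise simplify to a vector.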

> d1
  c1 c2
1  1  6
2  2  7
3  3  8
4  4  9
5  5 10
> d2
  c1 c2
1  5 10
2  4  9
3  3  8
4  2  7
5  1  6
> mylist <- list(d1, d2)
> mylist
[[1]]
  c1 c2
1  1  6
2  2  7
3  3  8
4  4  9
5  5 10

[[2]]
  c1 c2
1  5 10
2  4  9
3  3  8
4  2  7
5  1  6

> lapply(mylist, function(x) x[,1:(ncol(x)-1), drop = F] )
[[1]]
  c1
1  1
2  2
3  3
4  4
5  5

[[2]]
  c1
1  5
2  4
3  3
4  2
5  1
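
The effect of the drop argument can be checked directly. This is a small sketch on a made-up two-column data frame d (a hypothetical name, similar in shape to d1 above), comparing matrix-style indexing with and without drop = FALSE and with list-style indexing:

```r
# Hypothetical two-column data frame, similar in shape to d1 above
d <- data.frame(c1 = 1:5, c2 = 6:10)

# Matrix-style indexing without drop = FALSE: a single remaining
# column is simplified to a plain vector
as_vector <- d[, 1:(ncol(d) - 1)]

# With drop = FALSE the result stays a one-column data frame
as_frame <- d[, 1:(ncol(d) - 1), drop = FALSE]

# List-style indexing (no comma) never simplifies, so it is an
# alternative that does not need drop at all
as_frame2 <- d[-ncol(d)]

is.data.frame(as_vector)   # FALSE
is.data.frame(as_frame)    # TRUE
is.data.frame(as_frame2)   # TRUE
```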

Karthik S