How can I tell R to apply functions to multiple data?

Question

I have been stuck on this for quite a long time. I tried different approaches but couldn't succeed.

What I want is to apply the following 4 functions to 30 different data frames (data1, data2, data3, ..., data30) within a for loop or by some other means in R. These data frames have the same number of columns (10) but different numbers of rows.

This is the code I wrote for first data (data1). It works well.

for(i in 1:nrow(data1)){
  data1$simp <-diversity(data1$sp, "simpson")
  data1$shan <-diversity(data1$sp, "shannon")
  data1$E <- E(data1$sp)
  data1$D <- D(data1$sp)
}

I want to apply this code to the other 29 data frames without repeating the process 29 times.

The following code is what I am trying now, but it is still not right.

data.list <- list(data1, data2,data3,data4,data5)
for(i in data.list){
  data2 <- NULL
  i$simp <-diversity(i$sp, "simpson")
  i$shan <-diversity(i$sp, "shannon")
  i$E <- E(i$sp)
  i$D <- D(i$sp)
  data2 <- rbind(data2, i)
  print(data2)
}

So my question is: how can I tell R to apply these functions to the other 29 data frames?

Thanks in advance!

thothal · Answer 1 · 2019-03-13T10:40:42.093

0

There are plenty of options; here is one using only base functions:

data.list <- list(data1, data2, data3, data4, data5)
changed_data <- lapply(data.list, function(my_data) {
    my_data$simp <- diversity(my_data$sp, "simpson")
    my_data$shan <- diversity(my_data$sp, "shannon")
    my_data$E <- E(my_data$sp)
    my_data$D <- D(my_data$sp)
    my_data  # return the modified copy
})
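
With 30 data frames, typing every name into list() gets tedious. A minimal sketch, assuming the objects really are named data1, data2, ... in the current environment: mget() collects them by name and returns a named list. The toy data frames and the shannon_index() stand-in (used only so the sketch runs without the vegan package) are illustrative assumptions, not part of the question.

```r
# Toy stand-ins for the real data (assumption: a numeric column `sp`)
data1 <- data.frame(sp = c(10, 5, 1))
data2 <- data.frame(sp = c(3, 3, 3, 3))
data3 <- data.frame(sp = c(7, 2))

# Base-R stand-in for vegan::diversity(x, "shannon"), so this runs without vegan
shannon_index <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
}

# Collect data1, data2, ... by name instead of typing each one
data.list <- mget(paste0("data", 1:3))  # use 1:30 for the full set

changed_data <- lapply(data.list, function(my_data) {
  my_data$shan <- shannon_index(my_data$sp)
  my_data
})

names(changed_data)
```

Because mget() keeps the object names, each processed data frame can later be picked out with, for example, changed_data[["data2"]].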

edited Mar 13 '19 at 10:40

answered Mar 13 '19 at 10:05


OK, here is my question: I have 30 different data frames (data1, data2, and so on), so I am confused about which data frame and column I should choose. If I choose data1$sp, what about the others? – R starter Mar 13 '19 at 10:20
You do not choose any; you use `my_data`, which is the parameter of the function inside `lapply`. – thothal Mar 13 '19 at 10:33
Thank you all for your time! I tried all of the above. Each one gives me this error: "Error in array(x, c(length(x), 1L), if (!is.null(names(x))) list(names(x), : 'data' must be of a vector type, was 'NULL' " – R starter Mar 13 '19 at 10:57
I am sorry, I have found the error: in my panic I chose the wrong variable for the vegan package. Thank you very much, I really appreciate it. All the code you suggested works fine. – R starter Mar 13 '19 at 18:53
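
The error quoted in the comments is consistent with passing NULL to a function that expects a vector, and `$` on a data frame silently returns NULL when the column does not exist. A hedged sketch of a guard that fails early with a clearer message; the helper name check_has_column() and the toy column `abundance` are hypothetical.

```r
df <- data.frame(abundance = c(4, 2, 1))  # note: there is no column called `sp`

is.null(df$sp)  # `$` does not complain about a missing column, it returns NULL

# Hypothetical helper: stop with a readable message if the column is absent
check_has_column <- function(d, col) {
  if (!col %in% names(d)) {
    stop("column '", col, "' not found; available: ",
         paste(names(d), collapse = ", "))
  }
  invisible(d)
}

# Capture the message instead of aborting, just for demonstration
msg <- tryCatch(check_has_column(df, "sp"),
                error = function(e) conditionMessage(e))
msg
```

Calling such a check at the top of the function passed to lapply() would point at the offending data frame's columns rather than at an internal array() call.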

score 0 · Answer 2 · answered Mar 13 '19 at 10:07

If I understand the question, you're ultimately asking about your 'data2' variable and how to merge all of these together. I think the issue is that you're setting data2 <- NULL in every loop iteration, so only the last data frame survives. The proposed solution below moves this definition outside the loop, so each call to rbind() appends the current data frame and the loop ends with the consolidated dataset.

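data.list <- list(data1, data2, data3, data4, data5) # all 30 can go here
data2 <- NULL # note: this reuses the name data2; the list above already holds a copy of the original
for (i in data.list) {
  i$simp <- diversity(i$sp, "simpson")
  i$shan <- diversity(i$sp, "shannon")
  i$E <- E(i$sp)
  i$D <- D(i$sp)
  data2 <- rbind(data2, i) # append the processed data frame
}
print(data2)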
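
Once everything is bound into one data frame, it is no longer obvious which rows came from which input. A sketch of one alternative, assuming a named list: process each element with lapply(), tag it with its name, then bind once with do.call(). The `source` column, the toy data, and the simpson_index() stand-in (so the sketch runs without vegan) are assumptions for illustration.

```r
# Base-R stand-in for vegan::diversity(x, "simpson"), i.e. 1 - sum(p^2)
simpson_index <- function(x) {
  p <- x / sum(x)
  1 - sum(p^2)
}

data.list <- list(
  data1 = data.frame(sp = c(5, 5)),
  data2 = data.frame(sp = c(1, 1, 1, 1))
)

processed <- lapply(names(data.list), function(nm) {
  d <- data.list[[nm]]
  d$simp <- simpson_index(d$sp)
  d$source <- nm  # remember which data frame each row came from
  d
})

# One rbind call over the whole list instead of one per iteration
combined <- do.call(rbind, processed)
combined
```

This only works as written when all data frames share the same column names, which matches the setup described in the question.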

score 0 · Accepted Answer · answered Mar 13 '19 at 10:11

You can do this with Map.

fun <- function(DF){
  # No loop over rows is needed: each call already uses the whole sp column
  DF$simp <- diversity(DF$sp, "simpson")
  DF$shan <- diversity(DF$sp, "shannon")
  DF$E <- E(DF$sp)
  DF$D <- D(DF$sp)
  DF
}

result.list <- Map(fun, data.list)

Or, if you don't want to have a function fun in the .GlobalEnv, with lapply.

result.list <- lapply(data.list, function(DF){
  # No loop over rows is needed: each call already uses the whole sp column
  DF$simp <- diversity(DF$sp, "simpson")
  DF$shan <- diversity(DF$sp, "shannon")
  DF$E <- E(DF$sp)
  DF$D <- D(DF$sp)
  DF
})
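
Both Map() and lapply() return a list, so the original data1, data2, ... objects are left unchanged. A short sketch of getting at the results, assuming data.list was built with names; the toy data and the n_sp column are placeholders for the real diversity columns. The list2env() step is optional.

```r
data.list <- list(
  data1 = data.frame(sp = c(2, 8)),
  data2 = data.frame(sp = c(1, 2, 3))
)

# Placeholder calculation standing in for the diversity columns
result.list <- lapply(data.list, function(DF) {
  DF$n_sp <- length(DF$sp)
  DF
})

result.list[["data2"]]     # one processed data frame, by name
sapply(result.list, nrow)  # quick check across all of them

# Optional: copy the processed versions into an environment as data1, data2, ...
out <- new.env()
list2env(result.list, envir = out)  # envir = .GlobalEnv would overwrite the originals
ls(out)
```

Keeping the results in the list is usually easier to work with than 30 separate objects, which is why the copy into an environment is shown only as an option.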

score 0 · Answer 4 · answered Mar 13 '19 at 10:15

I am assuming that your data1, ..., dataN are files stored in a directory and you're reading them one at a time. Also they have the same header.

What you can do is to import them one at a time and then perform the operations you want, as you mentioned:

files <- list.files(directoryPath) # maybe you can grep() some specific files
results <- list()
for (f in files) {
  data <- read.table(f, header = TRUE) # also choose sep and so on...
  data$simp <- diversity(data$sp, "simpson")
  data$shan <- diversity(data$sp, "shannon")
  data$E <- E(data$sp)
  data$D <- D(data$sp)
  results[[f]] <- data # keep each processed table instead of overwriting it
}

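Be careful: list.files() returns bare file names by default, so you must either be in that working directory or add the path to the file name when reading the tables (e.g. file.path(directoryPath, f), or call list.files() with full.names = TRUE).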
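
As a sketch of the file-based route that keeps every processed table, the version below writes two small CSV files to a temporary directory so it can run on its own, then reads them back with lapply(). The file names, the use of read.csv() rather than read.table(), and the `total` column are assumptions for illustration.

```r
# Create two small example files in a temporary directory
directoryPath <- file.path(tempdir(), "diversity_demo")
dir.create(directoryPath, showWarnings = FALSE)
write.csv(data.frame(sp = c(1, 2, 3)),
          file.path(directoryPath, "data1.csv"), row.names = FALSE)
write.csv(data.frame(sp = c(4, 6)),
          file.path(directoryPath, "data2.csv"), row.names = FALSE)

# full.names = TRUE returns paths that can be read from any working directory
files <- list.files(directoryPath, pattern = "\\.csv$", full.names = TRUE)

results <- lapply(files, function(f) {
  d <- read.csv(f)
  d$total <- sum(d$sp)  # placeholder for the diversity calculations
  d
})
names(results) <- basename(files)

names(results)
```

Naming the list by file makes it easy to trace each result back to its source, for example results[["data2.csv"]].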
