
I have a folder with about 700 text files that I want to import and add a column to. I've figured out how to do this using the following code:

files = list.files(pattern = "*c.txt")
DF <- NULL
for (f in files) {
  data <- read.table(f, header = F, sep=",")
  data$species <- strsplit(f, split = "c.txt")  # species is the filename without the "c.txt" suffix
  DF <- rbind(DF, data)
}
write.xlsx(DF,"B:/trends.xlsx")

The problem is that about 100 of the files are empty, so the code stops at the first empty file and I get this error message:

Error in read.table(f, header = F, sep = ",") : 
  no lines available in input

Is there a way to skip over these empty files?

zx8754
wallflower
  • Do you know how to check if a file is empty? You could add an if statement (`if (file is not empty){do something}`). Of note: there are more efficient ways to do this if you have performance issues. Multiple calls to rbind can be slow. – Heroka Oct 16 '15 at 16:46

2 Answers


You can skip empty files by checking that file.size(some_file) > 0:

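## full.names = TRUE so file.size() and read.csv() get usable paths
files <- list.files("~/tmp/tmpdir", pattern = "\\.csv$", full.names = TRUE)
## read only the files that contain data; empty files yield NULL
df_list <- lapply(files, function(x) {
    if (file.size(x) > 0) {
        read.csv(x)
    }
})
## rbind ignores the NULL entries
dim(do.call("rbind", df_list))
#[1] 50  2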

This skips over the 10 files that are empty, and reads in the other 10 that are not.
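
If the NULL placeholders that lapply leaves behind for empty files are unwanted, one variation is to filter the file list before reading anything. The following is only a minimal sketch adapted to the question's setup: `read_nonempty` is a hypothetical helper name, and the comma separator, headerless files, and "c.txt" suffix are assumptions taken from the question's code.

```r
# Hypothetical helper: read every non-empty file in `dir` whose name matches
# `pattern`, and label each row with the file name minus `suffix`.
read_nonempty <- function(dir, pattern = "c\\.txt$", suffix = "c.txt") {
    files <- list.files(dir, pattern = pattern, full.names = TRUE)
    # Drop zero-byte files up front so the list never contains NULL
    files <- files[file.size(files) > 0]
    df_list <- lapply(files, function(f) {
        data <- read.table(f, header = FALSE, sep = ",")
        data$species <- sub(suffix, "", basename(f), fixed = TRUE)
        data
    })
    # Bind once at the end instead of growing a data frame inside a loop
    do.call(rbind, df_list)
}

# Usage on a throwaway directory holding one data file and one empty file
tmp <- tempfile("demo")
dir.create(tmp)
writeLines(c("1,2", "3,4"), file.path(tmp, "oakc.txt"))
file.create(file.path(tmp, "elmc.txt"))
DF <- read_nonempty(tmp)
```

Because the empty files are removed before lapply runs, every list element is a data frame, and the single rbind at the end avoids the repeated copying that comes with calling rbind inside a loop.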


Data:

for (i in 1:10) {
    df <- data.frame(x = 1:5, y = 6:10)
    write.csv(df, sprintf("~/tmp/tmpdir/file%i.csv", i), row.names = FALSE)
    ## empty file
    system(sprintf("touch ~/tmp/tmpdir/emptyfile%i.csv", i))
}
nrussell
  • This still adds an "entry", with the value of "NULL". Is there any way to not add that entry at all? Having "Null" causes issues downstream for me. – Larry Cai Jun 02 '21 at 10:18

For a different approach with explicit error handling, consider wrapping the read in tryCatch, which also handles anything else that might go wrong in read.table.

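DF <- NULL
for (f in files) {
    data <- tryCatch({
        # only attempt to read files that contain data; empty files give NULL
        if (file.size(f) > 0) {
            read.table(f, header = FALSE, sep = ",")
        }
    }, error = function(err) {
        # report the failure and return NULL so the file is skipped
        message("read.table didn't work for ", f, ": ", conditionMessage(err))
        NULL
    })
    # skip empty or unreadable files
    if (is.null(data)) next
    data$species <- strsplit(f, split = "c.txt", fixed = TRUE)[[1]][1]
    DF <- rbind(DF, data)
}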
Shawn Mehan