I am relatively new to R and I'm having a problem reading in multiple tables from a directory using an apply function. What I would like the function to do is take a vector of paths to the tables I'm interested in and generate a list whose elements are the data frames corresponding to those paths. I've written the following code:

f<- function(directory){
    file.list <<- list.files(directory)
    file.paths <<- as.vector(paste(directory, file.list, sep = "/"))
    tables <- lapply(X = file.paths, FUN = read.table, header = TRUE,sep = "\t" ))
}

By my understanding, what I'm doing is creating a list of the file names in the directory that I want, creating a path to each of those files, and (this is where I'm failing) looping over those paths, importing the table each one points to, and generating a list of those tables. I receive the following error:

Error in FUN(X[[i]], ...) : no lines available in input

Can anyone offer any advice?

Jaap
sam
  • This may happen if your data is empty... as in no rows of data – user20650 May 22 '15 at 14:42
  • Your function looks OK, but don't you want it to return `tables`? – C8H10N4O2 May 22 '15 at 14:43
  • Yes - sorry that part isn't written in, just thought I would fix the error first since it means it can't create tables to begin with – sam May 22 '15 at 14:54
  • There seems to be a syntax error in `tables <- ...` two closing parenthesis? try: `tables <- lapply(file.paths,function(x)read.table(x,header=TRUE,sep="\t"))` – Nightwriter May 23 '15 at 12:40

1 Answer

Here are a few options depending on what you want the output to be:

A list of data frames

# Load library
library(data.table)

# Get all files whose names end in `.csv`; `pattern` is a regular expression, not a glob
filenames <- list.files("C:/your/folder", pattern = "\\.csv$", full.names = TRUE)

# Load data sets
list.DFs <- lapply(filenames, fread)

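I'm assuming your data files are saved in .csv format. Note that `fread` does much the same job as `read.table` but is usually much faster, and it detects the separator automatically, so tab-delimited files like those in the question also work.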
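For the tab-delimited files in the question, a base R version without data.table might look like the sketch below. The function name `read_tables`, the default `"\\.txt$"` pattern, and the choice to skip zero-byte files are assumptions for illustration. An empty file is one common cause of `no lines available in input`.

```r
# Hypothetical helper: read every tab-delimited file in a directory into a
# named list of data frames, skipping zero-byte files.
read_tables <- function(directory, pattern = "\\.txt$") {
  # full.names = TRUE returns usable paths, so no paste() is needed
  file.paths <- list.files(directory, pattern = pattern, full.names = TRUE)

  # read.table() stops with "no lines available in input" on an empty file
  is.empty <- file.size(file.paths) == 0
  if (any(is.empty)) {
    warning("Skipping empty file(s): ",
            paste(basename(file.paths[is.empty]), collapse = ", "))
  }
  file.paths <- file.paths[!is.empty]

  tables <- lapply(file.paths, read.table, header = TRUE, sep = "\t")
  names(tables) <- basename(file.paths)
  tables  # return the list explicitly
}

# Usage (the folder path is a placeholder):
# tables <- read_tables("C:/your/folder")
```

Unlike the function in the question, this returns the list instead of relying on `<<-`, so the result can be assigned to any name.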

Bind multiple data frames into one single data frame

# Get all files whose names end in `.csv`
filenames <- list.files("C:/your/folder", pattern = "\\.csv$", full.names = TRUE)

# Load and bind all data sets (rbindlist and fread come from data.table)
data <- rbindlist(lapply(filenames, fread))
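When the rows are stacked, it is often useful to know which file each row came from. Below is a minimal sketch in base R, assuming all files are comma-separated and share the same columns. `bind_with_source` and the `source` column name are hypothetical names chosen for this example. With data.table, the `idcol` argument of `rbindlist` serves a similar purpose.

```r
# Hypothetical helper: read each file, tag its rows with the file name,
# then stack everything into one data frame.
bind_with_source <- function(filenames, ...) {
  tables <- lapply(filenames, function(path) {
    df <- read.csv(path, ...)
    # rep() keeps the assignment valid for header-only files with zero rows
    df$source <- rep(basename(path), nrow(df))
    df
  })
  do.call(rbind, tables)  # assumes identical column names across files
}

# Usage (the folder path is a placeholder):
# filenames <- list.files("C:/your/folder", pattern = "\\.csv$", full.names = TRUE)
# data <- bind_with_source(filenames)
```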

Load multiple data frames as different objects to Global Environment

# Get all `.Rda` files in the directory
filenames <- list.files("C:/your/folder", pattern = "\\.Rda$", full.names = TRUE)

# Load data sets; load() only reads files written by save(), not .csv files
invisible(lapply(filenames, load, envir = .GlobalEnv))
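Loading straight into the global environment can silently overwrite existing objects that share a name with something stored in an `.Rda` file. Here is a sketch of one alternative, assuming you are happy to look the objects up in a separate environment; `load_into_env` is a hypothetical name.

```r
# Hypothetical helper: load every .Rda file into a fresh environment and
# return it, leaving the global environment untouched.
load_into_env <- function(filenames) {
  env <- new.env()
  for (path in filenames) {
    load(path, envir = env)  # load() restores objects under their saved names
  }
  env
}

# Usage (the folder path is a placeholder):
# e <- load_into_env(list.files("C:/your/folder", pattern = "\\.Rda$", full.names = TRUE))
# ls(e)                    # names of the loaded objects
# mget(ls(e), envir = e)   # the same objects as a named list
```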
rafa.pereira
  • Error in FUN(X[[i]], ...) : bad restore file magic number (file may be corrupted) -- no data loaded In addition: Warning message: file ‘az_wac_S000_JT00_2004.csv.gz’ has magic number 'w_geo' Use of save versions prior to 2 is deprecated – Mox Mar 20 '18 at 20:57
  • This seems to be a problem with your file. Note that your file extension is `.gz` and not `.csv` – rafa.pereira Mar 21 '18 at 22:15
  • R can be used to read gzipped files, you just have to put a wrapper on it, like so: `read.table(gzfile("az_wac_S000_JT00_2004.csv.gz"))` – Mox Mar 21 '18 at 22:25
  • If you want to read a bunch of zipped `.csv` files, then you're looking for a slightly different question and solution. I would suggest you post another question, or search for the solution here: https://github.com/Rdatatable/data.table/issues/717 – rafa.pereira Mar 22 '18 at 19:04
  • I have done so here: https://stackoverflow.com/questions/49459779/how-do-i-use-lapply-to-load-files-into-the-global-environment – Mox Mar 23 '18 at 23:56
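For the gzipped case raised in these comments, base R's `read.csv` can read through a `gzfile()` connection, so the `lapply` pattern from the answer carries over. This is a sketch, assuming comma-separated files with a header row; `read_gz_csvs` is a hypothetical name. `load()` is not involved here because it only restores files written by `save()`.

```r
# Hypothetical helper: read several gzipped .csv files into a named list
# of data frames using base R only.
read_gz_csvs <- function(filenames, ...) {
  tables <- lapply(filenames, function(path) {
    # read.csv() opens the unopened connection and closes it when done
    read.csv(gzfile(path), ...)
  })
  names(tables) <- basename(filenames)
  tables
}

# Usage (the folder path is a placeholder):
# filenames <- list.files("C:/your/folder", pattern = "\\.csv\\.gz$", full.names = TRUE)
# tables <- read_gz_csvs(filenames)
```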