6

Given several .xls files with varying number of sheets, I am reading them into R usingread.xls from the gdata package. I have two related issues (solving the second issue should solve the first):

  1. It is unknown ahead of time how many sheets each .xls file will have, and in fact this value will vary from one file to the next.
  2. I need to capture the name of the sheet, which is relevant data

Right now, to resolve (1), I am using try() and iterating over sheet numbers until I hit an error.

How can I grab a list of the names of the sheet so that I can iterate over them?

smci
  • 32,567
  • 20
  • 113
  • 146
Ricardo Saporta
  • 54,400
  • 17
  • 144
  • 178

2 Answers2

9

See the sheetCount and sheetNames functions (on same help page) in gdata. If xls <- "a.xls", say, then reading all sheets of a spreadsheet into a list, one sheet per component, is just this:

sapply(sheetNames(xls), read.xls, xls = xls, simplify = FALSE)

Note that the components will be named using the names of the sheets. Depending on the content it might make sense to remove simplify = FALSE.

G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341
8

For such tasks I use library XLConnect. With its functions you can get the names of each sheet in a vector and then just determine the length of that vector.

#Read your workbook 
wb<-loadWorkbook("Your_workbook.xls")

#Save each sheet's name as a vector
lp<-getSheets(wb)

#Now read each sheet as separate list element
dat<-lapply(seq_along(lp),function(i) readWorksheet(wb,sheet=lp[i]))

UPDATE

As suggested by @Martin Studer XLConnect functions are already vectorized, so there is no need to use lapply(), instead just provide vector of sheet names or use function getSheets() inside readWorksheet().

dat <- readWorksheet(wb, sheet = getSheets(wb))
Didzis Elferts
  • 95,661
  • 14
  • 264
  • 201
  • hmm... XLConnect seems to be a bit buggy today. Both `readWorksheet` and `getSheets` are giving me the same error related to `function (classes, fdef, mtable)` Incidentally, I couldnt install it off of Cran and instead installed from source. I will try again in a few days. – Ricardo Saporta Mar 28 '13 at 20:16
  • @RicardoSaporta that's strange. I'm using version 0.2-4 (on Mac OS) and it's working fine for me without problems. – Didzis Elferts Mar 28 '13 at 20:20
  • Looks like something may have changed recently: http://www.r-project.org/nosvn/R.check/r-release-macosx-ix86/XLConnect-00check.html – Ricardo Saporta Mar 28 '13 at 21:05
  • 1
    Note that most of XLConnect's functions are already vectorized. So you could just do the following: `dat = readWorksheet(wb, sheet = getSheets(wb))` – Martin Studer Apr 12 '13 at 20:20