0

Is there a way to import a named Excel-table into R as a data.frame?

I typically have several named Excel-tables on a single worksheet, that I want to import as data.frames, without relying on static row - and column references for the location of the Excel-tables.

I have tried to set namedRegion which is an available argument for several Excel-import functions, but that does not seem to work for named Excel-tables. I am currently using the openxlxs package, which has a function getTables() that creates a variable with Excel-table names from a single worksheet, but not the data in the tables.

MelkorNO
  • 3
  • 4

4 Answers4

1

To get your named table is a little bit of work.

First you need to load the workbook.

library(openxlsx)

wb <- loadWorkbook("name_excel_file.xlsx")

Next you need to extract the name of your named table.

# get the name and the range
tables <- getTables(wb = wb,
                    sheet = 1)

If you have multiple named tables they are all in tables. My named table is called Table1.

Next you to extract the column numbers and row numbers, which you will later use to extract the named table from the Excel file.

# get the range
table_range <- names(tables[tables == "Table1"])
table_range_refs <- strsplit(table_range, ":")[[1]]

# use a regex to extract out the row numbers
table_range_row_num <- gsub("[^0-9.]", "", table_range_refs)
# extract out the column numbers
table_range_col_num <- convertFromExcelRef(table_range_refs)

Now you re-read the Excel file with the cols and rows parameter.

# finally read it
my_df <- read.xlsx(xlsxFile = "name_excel_file.xlsx",
                   sheet = 1,
                   cols = table_range_col_num[1]:table_range_col_num[2],
                   rows = table_range_row_num[1]:table_range_row_num[2])

You end up with a data frame with only the content of your named table.

I used this a while ago. I found this code somewhere, but I don't know anymore from where.

ricoderks
  • 1,619
  • 9
  • 13
1

This link is might be useful for you

https://stackoverflow.com/a/17709204/10235327

1. Install XLConnect package
2. Save a path to your file in a variable
3. Load workbook
4. Save your data to df

To get table names you can use function

getTables(wb,sheet=1,simplify=T)

Where:

  • wb - your workbook
  • sheet - sheet name or might be the number as well
  • simplify = TRUE (default) the result is simplified to a vector

https://rdrr.io/cran/XLConnect/man/getTables-methods.html

Here's the code (not mine, copied from the topic above, just a bit modified)

require(XLConnect)
sampleFile = "C:/Users/your.name/Documents/test.xlsx"
wb = loadWorkbook(sampleFile)
myTable <- getTables(wb,sheet=1)
df<-readTable(wb, sheet = 1, table = myTable)
AlexDinahl
  • 31
  • 5
  • Thanks, this does solve my issue although XLConnect package relies on Java, which we are not allowed to install on company PCs. :/ – MelkorNO Jul 08 '20 at 16:36
  • You're welcome! Yes, all these 'not allowed to install' relates to my work as well :) But, maybe they can provide you with remote PC where it is possible to use java. – AlexDinahl Jul 09 '20 at 13:25
0

You can check next packages:

library(xlsx)
Data <- read.xlsx('YourFile.xlsx',sheet=1)

library(readxl)
Data <- read_excel('YourFile.xlsx',sheet=1)

Both options allow you to define specific regions to load the data into R.

Duck
  • 39,058
  • 13
  • 42
  • 84
0

I use read.xlsx from package openxlsx. For example:

library(openxlsx)
fileA <- paste0(some.directory,'excel.file.xlsx')
A <-  read.xlsx(fileA, startRow = 3)

hope it helps

efz
  • 425
  • 4
  • 9