I know that the startRow
parameter in the read.xlsx
function in openxlsx
allows me to read a file starting at a designated row. I have to load 300 xlsx files, unfortunately, the number of rows that I want to skip varies between the files. I always want the first row to contain the word "CPT" in the second column. Is there a way to set the startRow
on a text matching parameter? In the picture below, I would set the startRow to 6, but in other cases its 4 or 3.
Asked
Active
Viewed 540 times
0

MatthewR
- 2,660
- 5
- 26
- 37
1 Answers
1
Maybe it is good idea to read twice (1st: read a few rows of col1&2 and get startRow
).
Note: I supposed the cells you want to skip is on col1 (at least).
And with for
, sapply
or etc, you can do it with all files.
library(openxlsx)
tmp <- read.xlsx(file, colNames = FALSE, rows = 1:10, cols = 1:2,
skipEmptyRows = FALSE)
st <- min(grep("CPT", tmp[[2]]))
d <- read.xlsx(file, startRow = st)

cuttlefish44
- 6,586
- 2
- 17
- 34