I am trying to use extract_tables from the tabulizer package.
library(tabulizer)
setwd("directory")
pdf_file <- "filenames.pdf"
cle <- extract_tables(pdf_file, pages = 47, method = "stream", encoding = "UTF-8")
This code is all I need to call the extract_tables function.
However, there is a critical problem: it automatically merges some columns.
The two images show the situation. Columns 6 and 7 of the table in the PDF image are merged in the output.
Instead of
0.9000 | -
0.6450 | -
0.7470 | -
the two columns are merged like this:
0.9000-
0.6450-
0.7470-
So I want a method that does not merge the columns like this, and that works in general rather than only for this table.
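One possible workaround, short of fixing the extraction itself, is to split the merged cells after the fact. The following is a minimal base R sketch, assuming the merged cells always look like a number immediately followed by a dash, as in the example above. The helper name split_merged and the regular expression are hypothetical and not part of tabulizer.

```r
# Hypothetical helper (not part of tabulizer): split cells such as "0.9000-"
# into the numeric part and the trailing dash. Cells that do not match the
# assumed "number followed by dash" shape are returned unchanged, with NA
# in the second column.
split_merged <- function(x) {
  pattern <- "^\\s*(-?[0-9]+\\.?[0-9]*)\\s*(-)\\s*$"
  merged <- grepl(pattern, x)
  value <- ifelse(merged, sub(pattern, "\\1", x), x)
  dash <- ifelse(merged, sub(pattern, "\\2", x), NA_character_)
  cbind(value = value, dash = dash)
}

# The merged column from the example above
merged_col <- c("0.9000-", "0.6450-", "0.7470-")
split_merged(merged_col)
```

This only repairs the output and depends on the assumed cell pattern, so it is not fully general. As far as I can tell from the tabulizer documentation, extract_tables also accepts a columns argument (used together with guess = FALSE) for supplying column boundaries by hand, but that needs page-specific coordinates.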
I also tried passing the page text extracted with pdftools to the function, like this:
library(pdftools)
library(tabulizer)
files <- list.files(pattern = "pdf$")
opinions <- lapply(files, pdf_text)
cle <- extract_tables(opinions[[2]][47],method="stream", encoding="UTF-8")
This fails with:
Error in normalizePath(path.expand(path), winslash, mustWork) :
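A likely cause, offered as an assumption rather than a confirmed diagnosis: extract_tables expects the path to a PDF file as its first argument, while pdf_text returns the text of each page as character strings. normalizePath is then handed page text instead of a path. A minimal sketch that passes the file path and selects the page with pages is shown here. The wrapper name extract_page is hypothetical, and the sketch does not address the merged columns.

```r
# Hypothetical wrapper: call extract_tables with a file path and a page
# number, using the same options as the original call.
extract_page <- function(path, page) {
  stopifnot(file.exists(path))
  tabulizer::extract_tables(path, pages = page,
                            method = "stream", encoding = "UTF-8")
}

files <- list.files(pattern = "pdf$")

# Page 47 of the second PDF in the working directory, if there is one
if (length(files) >= 2) {
  cle <- extract_page(files[2], 47)
}
```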
If you know how to solve this, please share a solution. Thanks.