I'm trying to scrape accounting results for publicly listed companies with R, using the RSelenium package, from the Brazilian Securities and Exchange Commission (CVM) page below: https://www.rad.cvm.gov.br/ENET/frmGerenciaPaginaFRE.aspx?NumeroSequencialDocumento=108692&CodigoTipoInstituicao=1
# Load the packages
library(RSelenium)
library(glue)
library(tidyverse)
library(dplyr)
# Start the Selenium server
rD <- rsDriver(port = 4569L,
               # Chrome version the WebDriver should use
               chromever = '94.0.4606.61',
               # Suppress console output
               verbose = FALSE)
# Create the driver to use from R
remDr <- remoteDriver(
  remoteServerAddr = "localhost",
  port = 4569L,
  browserName = "chrome"
)
# Open the connection to the server
remDr$open()
remDr$navigate("https://www.rad.cvm.gov.br/ENET/frmGerenciaPaginaFRE.aspx?NumeroSequencialDocumento=108692&CodigoTipoInstituicao=1")
# Select the type of financial statement
zelecti <- remDr$findElement(using = "id",
                             value = "cmbGrupo")
zelecti2 <- zelecti$findChildElements(using = "xpath",
                                      value = "//option")
zelecti2[[3]]$clickElement()
zelecti2[[12]]$clickElement()
This code segment runs fine. However, when I try to capture the first row of the table using XPath, RSelenium cannot find the row element and returns an empty list for the row object.
# Capture the table
tabela <- remDr$findElement(using = 'id', value = "iFrameFormulariosFilho")
# Capture row 1
row <- tabela$findChildElements(using = 'xpath', value = "//tbody//tr[2]")
# Capture the row content
row_content <- sapply(row, function(x) x$getElementText())
The iframe id seems to refer to the correct element: that line returns no error, and I have checked the id in the Chrome console. I have also tried to capture other elements of the table, but RSelenium still cannot find them.
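One possibility I have not ruled out is that the table lives in the iframe's own document, in which case the driver may need to switch into the frame before any XPath lookup. Below is a minimal, untested sketch of that idea. It assumes the rvest package is installed (it is not part of my code above) and that the first table inside the frame is the one of interest; `parse_first_table` and `read_iframe_table` are hypothetical helper names of mine.

```r
library(rvest)  # assumption: rvest is available for parsing the HTML

# Hypothetical helper: parse an HTML string and return its first <table>
# as a data frame, or NULL when the HTML contains no table.
parse_first_table <- function(html) {
  tables <- html_table(read_html(html))
  if (length(tables) == 0) return(NULL)
  tables[[1]]
}

# Hypothetical helper: switch into the iframe, read its page source,
# and switch back to the top-level document when done.
read_iframe_table <- function(remDr, frame_id = "iFrameFormulariosFilho") {
  frame <- remDr$findElement(using = "id", value = frame_id)
  remDr$switchToFrame(frame)
  on.exit(remDr$switchToFrame(NULL), add = TRUE)  # NULL returns to the main page
  parse_first_table(remDr$getPageSource()[[1]])
}
```

With the session opened above, this would be called as `tabela_df <- read_iframe_table(remDr)`, possibly after a short `Sys.sleep()` if the frame content loads slowly.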
Is there another way to capture the table content from this page?