Unable to parse a difficult-to-understand HTML file in R

Question

It's been a while since I last visited Stack Overflow. I have a problem parsing an HTML file. I am trying to parse the following link:

edata <- read_html("https://mmiconnect.in/app/ep-2022/registration/show-catalogue")

But I am not able to extract anything from the page using html_nodes. I tried every class name I could find, with no result.

I am trying to get the names of all the companies that participated in the expo. I tried various class selectors, for example:

html_nodes('.fuse-widget-front .mat-elevation-z4 .m-2 .bg-white')

But this returns no results.

What is happening? What exactly are you trying to get? – QHarr, Sep 24 '22 at 20:52

score 1 · Answer 1 · answered Sep 24 '22 at 23:51

I was able to parse the HTML with the following code, which loads the page in a browser through RSelenium and passes the rendered source to rvest:

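library(RSelenium)
library(rvest)

url <- "https://mmiconnect.in/app/ep-2022/registration/show-catalogue"

# Start a Selenium Firefox container (requires Docker).
# shell() is Windows-only; use system() on other platforms.
shell('docker run -d -p 4445:4444 selenium/standalone-firefox')
Sys.sleep(5)  # give the container a moment to start

remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4445L, browserName = "firefox")
remDr$open()
remDr$navigate(url)
Sys.sleep(5)  # give the page a moment to finish loading

# Take the page source from the browser and parse it with rvest
htmltxt <- remDr$getPageSource()[[1]]
read_html(htmltxt) %>% html_node(xpath = '//*/img') %>% html_attr('src')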

[1] "https://mmiconnectstorage.azureedge.net/global-manual-upload/ep-2022-visitor-reg-banner.jpg"
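Once the rendered source is in htmltxt, the same kind of rvest call used for the image can be pointed at the company entries. The following is only a sketch: the HTML fragment is a made-up stand-in for htmltxt, the company names are placeholders, and it assumes (without having confirmed it on the real page) that the four classes from the question sit on the same element. If that assumption holds, the classes must be chained without spaces, because a selector with spaces such as '.fuse-widget-front .mat-elevation-z4' looks for nested elements instead.

```r
library(rvest)

# Hypothetical stand-in for `htmltxt`; the real page's markup may differ.
htmltxt <- '
<html><body>
  <div class="fuse-widget-front mat-elevation-z4 m-2 bg-white"><span>Company A</span></div>
  <div class="fuse-widget-front mat-elevation-z4 m-2 bg-white"><span>Company B</span></div>
</body></html>'

page <- read_html(htmltxt)

# Classes on the same element are chained without spaces.
companies <- page %>%
  html_nodes(".fuse-widget-front.mat-elevation-z4.m-2.bg-white") %>%
  html_text(trim = TRUE)

# With spaces, the selector means "descendant of" and matches nothing here.
nested <- page %>%
  html_nodes(".fuse-widget-front .mat-elevation-z4 .m-2 .bg-white")

print(companies)
print(length(nested))
```

On the real page, inspect the rendered source first to find which element actually wraps each company name, then adjust the selector accordingly.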
