I am trying to extract some reviews from a secure (HTTPS) webpage, as shown below:
# Attempt to extract information from an online secure page
library(rvest)
URL <- "https://www.bankbazaar.com/insurance/religare-health-insurance.html"
mainPage <- read_html(URL)
reviewsHTML <- html_nodes(mainPage, ".ellipsis_text")
reviewsHTML
The code above returns {xml_nodeset (0)}. But when I first save that webpage to my local system (using Ctrl + S) as "Religare Health Insurance.html" and then run the same extraction on the saved file, I do get the reviews.
# Attempt to extract information from an offline (locally saved) copy of the secure page
library(rvest)
URL <- "Religare Health Insurance.html"
mainPage <- read_html(URL)
reviewsHTML <- html_nodes(mainPage, ".ellipsis_text")
reviewsHTML
{xml_nodeset (20)}
[1] <span itemprop="description" class="ellipsis_text">I have taken my health insurance from Religare......
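A minimal sketch for comparing the two cases side by side. It assumes, without having verified it for this site, that the copy saved from the browser may contain markup that the server's initial response does not. The helper count_matches and both HTML strings are hypothetical stand-ins, not the real page; the only point is that the same selector yields an empty node set when the matching elements are absent from the parsed document.

```r
library(rvest)

# Hypothetical helper: number of nodes in a parsed page that match a CSS selector.
count_matches <- function(page, css) {
  length(html_nodes(page, css))
}

# Made-up stand-in documents (NOT the real page markup):
# one without any review elements, one with a single review element.
raw_html <- "<html><body><div id='reviews'></div></body></html>"
rendered_html <- paste0(
  "<html><body><div id='reviews'>",
  "<span class='ellipsis_text'>Sample review</span>",
  "</div></body></html>"
)

# read_html() also accepts a literal HTML string.
n_raw <- count_matches(read_html(raw_html), ".ellipsis_text")
n_rendered <- count_matches(read_html(rendered_html), ".ellipsis_text")

print(c(raw = n_raw, rendered = n_rendered))
```

Calling count_matches(read_html(URL), ".ellipsis_text") once with the online URL and once with the saved file name would show whether the document that R downloads contains any ellipsis_text elements at all.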
Questions:
- Why is the behavior different when I extract the information from the same page online and offline?
- How can I use R to extract the same information without downloading the page first?