I am trying to get the number of the last page of reviews, but I always get 0, no matter what I try. I followed the tutorial at https://www.datacamp.com/tutorial/r-web-scraping-rvest, but it doesn't work. Website: https://www.trustpilot.com/review/www.ikea.com
`

library(rvest)

url <- "https://www.trustpilot.com/review/www.ikea.com"

# Now we write a function to get the number of the last page
get_last_page <- function(html){
  
  pages_data <- html %>% 
    # The '.' indicates the class
    html_nodes('.pagination-page') %>% 
    # Extract the raw text as a list
    html_text()                   
  
  # The second to last of the buttons is the one
  pages_data[(length(pages_data)-1)] %>%            
    # Take the raw string
    unname() %>%                                     
    # Convert to number
    as.numeric()                                     
}

#Test the function
first_page <- read_html(url)
(latest_page_number <- get_last_page(first_page))

`

I also tried html_nodes('.pagination_paginationEllipsis__4lfLO'), with the same result.

1 Answer

library(tidyverse)
library(rvest)

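# Takes the URL of a review page and returns the last page number
get_last_page <- function(url) {
  url %>%
    read_html() %>%
    # Select the numbered pagination links
    html_elements(".pagination-link_item__mkuN3") %>%
    html_text2() %>%
    # The last numbered link holds the highest page number
    last() %>%
    as.numeric()
}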

get_last_page("https://www.trustpilot.com/review/www.ikea.com")

[1] 159
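
The suffix in a class name such as pagination-link_item__mkuN3 looks like a build-generated hash, so it may change when the site is redeployed, which would make the exact selector return nothing again. A minimal sketch of a more tolerant variant follows, assuming the stable part of the class name stays the same. The markup in sample_html is hypothetical and only imitates a pagination bar, and get_last_page_robust is an illustrative name, not part of rvest.

```r
library(rvest)

# Hypothetical markup that imitates a pagination bar; the real page may differ.
sample_html <- minimal_html('
  <nav>
    <a class="pagination-link_item__abc12">1</a>
    <a class="pagination-link_item__abc12">2</a>
    <span class="pagination_paginationEllipsis__xyz">...</span>
    <a class="pagination-link_item__abc12">159</a>
    <a class="pagination-link_next__def34">Next page</a>
  </nav>
')

# Illustrative helper: takes a parsed page and returns the highest page number.
get_last_page_robust <- function(page) {
  labels <- page %>%
    # Match on the stable part of the class name, ignoring the hashed suffix
    html_elements("[class*='pagination-link_item']") %>%
    html_text2()

  # Non-numeric labels become NA instead of raising a warning
  numbers <- suppressWarnings(as.numeric(labels))
  numbers <- numbers[!is.na(numbers)]

  # Return NA when nothing matched, so a broken selector is easy to spot
  if (length(numbers) == 0) {
    return(NA_real_)
  }
  max(numbers)
}

get_last_page_robust(sample_html)
```

On the live site this would be called as get_last_page_robust(read_html(url)). Taking the maximum rather than the last element avoids depending on the order of the links.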
Chamkrai