Using the rvest package, I am trying to scrape the names of the actors and actresses from the IMDb full credits page for the film JFK (https://www.imdb.com/title/tt0102138/fullcredits?ref_=tt_ql_1).
SelectorGadget reports that each person's name sits in a cell matched by the selector "td:nth-child(2)".
Here's the code I'm using:
library(rvest)
library(stringr)
startFilm <- "tt0102138" #JFK
personsNames <- c()
pagePath <- paste("https://www.imdb.com/title/", startFilm, "/?ref_=nv_sr_1?ref_=nv_sr_1", sep = "")
moviePage <- read_html(pagePath)
personNodes <- html_nodes(moviePage, "td:nth-child(2)")
personText <- html_text(personNodes)
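for (i in seq_along(personText)) {
  # The name is on the second line of each cell's text
  actor <- unlist(str_split(personText[i], "\n"))[2]
  # Drop the first character, which precedes the name
  personsNames[i] <- substring(actor, 2, nchar(actor))
}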
personsNames
The full credits page at https://www.imdb.com/title/tt0102138/fullcredits?ref_=tt_ql_1 lists far more cast members than this.
Yet when I run the code, I get back only 15 names:
[1] "Sally Kirkland" "Anthony Ramirez" "Ray LePere" "Steve Reed" "Jodie Farber" "Columbia Dubose"
[7] "Randy Means" "Kevin Costner" "Jay O. Sanders" "E.J. Morris" "Cheryl Penland" "Jim Gough"
[13] "Perry R. Russo" "Mike Longman" "Edward Asner"
Why is the list of names truncated?
How should I adjust my code to get the full list of actors/actresses in the film?
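One thing that may be worth checking, offered as a sketch rather than a confirmed fix: the pagePath built in the code points at the film's main title page (".../title/tt0102138/?ref_=nv_sr_1?ref_=nv_sr_1"), not at the fullcredits page linked at the top, so the short list may simply be the cast shown on that page. The helper names build_credits_url, clean_names, and scrape_cast below are hypothetical. It is also an assumption that "td:nth-child(2)" matches only name cells on the full credits page; the selector may need narrowing if it picks up other tables. The cleanup uses trimws() in place of the split-and-substring loop, which keeps the whole cell text rather than only its second line.

```r
# Hypothetical helper: build the full-credits URL for an IMDb title ID.
build_credits_url <- function(film_id) {
  paste0("https://www.imdb.com/title/", film_id, "/fullcredits")
}

# Hypothetical helper: trim surrounding whitespace and newlines from each
# cell's text, then drop cells that end up empty.
clean_names <- function(cell_text) {
  cleaned <- trimws(cell_text)
  cleaned[nzchar(cleaned)]
}

# Hypothetical wrapper: fetch the page and extract the names.
# Needs the rvest package and network access, so it is only defined here.
# Assumption: the selector still matches the name cells on this page.
scrape_cast <- function(film_id, selector = "td:nth-child(2)") {
  page <- rvest::read_html(build_credits_url(film_id))
  nodes <- rvest::html_nodes(page, selector)
  clean_names(rvest::html_text(nodes))
}

# Offline check of the two pure helpers on made-up cell text
credits_url <- build_credits_url("tt0102138")
sample_names <- clean_names(c("\n Kevin Costner\n", "", "\n Sally Kirkland\n"))
print(credits_url)
print(sample_names)
```

Calling scrape_cast("tt0102138") would then be the thing to try against the live page; whether it returns the complete cast depends on the page's current markup.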