I have a column of about 8,000 URLs that I would like to scrape. Not all of the URLs work, so when I scrape them in a for loop, R returns an error message saying "closing unused connection # (https://...)" and the loop stops.

I was thinking of using tryCatch to test whether each URL works, but I'm not sure how to combine it with rvest and store the result in a new column. I want to test each URL and write a 1 in the new column if it works and a 0 if it fails, continuing to the next row either way. How would I write this in R?

Thank you.

Magnus
  • you could ping: `urls <- c('google.com', 'g00gle.com'); +(sapply(urls, function(x) system(sprintf('ping %s -c1', x))) == 0)` – rawr Jun 30 '20 at 21:03
  • here's a similar question https://stackoverflow.com/questions/7012796/ping-a-website-in-r – rawr Jun 30 '20 at 21:04
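
As an alternative to pinging, the tryCatch idea from the question could be sketched roughly as follows. This is only a minimal sketch, not a tested solution for the real data: it assumes rvest is installed, and the data frame `df`, its `url` column, the two example URLs, and the helper name `url_works` are all hypothetical stand-ins. The `reader` argument is an assumption added only so the helper can be tried without network access.

```r
# Returns 1L if `url` can be fetched and parsed, 0L if reading it errors.
# `reader` defaults to rvest::read_html; it is a parameter only so the
# helper can be exercised with a stand-in function (an assumption here).
url_works <- function(url, reader = rvest::read_html) {
  tryCatch(
    {
      reader(url)  # stops with an error when the page cannot be read
      1L
    },
    error = function(e) 0L  # swallow the error and record a failure
  )
}

# Hypothetical data standing in for the real column of ~8000 URLs.
df <- data.frame(
  url = c("https://www.r-project.org", "https://no-such-host.invalid"),
  stringsAsFactors = FALSE
)

# vapply visits every row; a failing URL yields 0 instead of ending the loop.
df$works <- vapply(df$url, url_works, integer(1), USE.NAMES = FALSE)
```

Because the error handler returns a value rather than re-raising, each URL is tried in turn and the loop continues regardless. With ~8000 URLs it may also be worth pausing between requests (for example with `Sys.sleep()`), since some sites throttle rapid requests.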

0 Answers