I’m a beginner in web scraping and trying to learn how to implement an automated process to collect data from the web submitting search terms.
The specific problem I’m working on is as follows:
Given the stackoverflow webpage https://stackoverflow.com/ I submit a search for the term “web scraping” and want to collect in a list all question links and the content for each question.
Is it possible to scrape these results?
My plan is to create a list of terms:
term <- c(“web scraping”, “crawler”, “web spider”)
submit a research for each term and collect both question title and content of the question.
Of course the process should be repeated for each pages of results.
Unfortunately, being relatively new to web scraping, I'm not sure what to do. I’ve already downloaded some packages to scrape the web (rvest, RCurl, XML, RCrawler).
Thanks for your help