I am setting up a database of certain past events, and I need to search the internet for a number of terms to retrieve as many pages as possible that contain terms related to the events I want to document.

First I looked into doing this with Google's "Custom Search API" after reading this question: Need to access Google Custom search api through R. I managed to get a JSON of search results through the browser, but not through R, so I moved on.

When I saw that the Custom Search API uses OpenSearch, and found the rOpenSearch package for R, I wanted to try that route: http://terradue.github.io/rOpenSearch/

After reading through the documentation, I found that it only provides examples of searching sites that publish OpenSearch descriptions. Since I need to search as many websites as possible, it seems I would need an OpenSearch description for a search engine like Google, but I can't find one anywhere.

Is there any way to search the internet from R using OpenSearch and collect the results in a data table?

If you know of a better solution to my problem, I'd appreciate it if you could point me in another direction.

Nowak

1 Answer

If I understand correctly, you are looking for web scraping with R.

<See me!>
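
For the search step that comes before any scraping, one option is the Custom Search JSON API mentioned in the question. Below is a minimal sketch of requesting results and collecting them in a data frame, not a definitive implementation. It assumes the jsonlite package is installed, that you have your own API key and search engine ID (the `key` and `cx` values are placeholders), and that the response carries an `items` array with `title`, `link` and `snippet` fields. The helper names `build_cse_url` and `parse_cse_results` are made up for illustration.

```r
library(jsonlite)

# Build the request URL for the Custom Search JSON API.
# `key` (API key) and `cx` (search engine ID) are placeholders you must supply.
build_cse_url <- function(query, key, cx, start = 1) {
  paste0(
    "https://www.googleapis.com/customsearch/v1",
    "?key=", URLencode(key, reserved = TRUE),
    "&cx=", URLencode(cx, reserved = TRUE),
    "&q=", URLencode(query, reserved = TRUE),
    "&start=", start
  )
}

# Turn one JSON response (as text) into a data frame with one row per result.
parse_cse_results <- function(json_text) {
  items <- fromJSON(json_text)$items
  # No "items" field (or an empty one) means the query returned no results.
  if (is.null(items) || length(items) == 0) {
    return(data.frame(title = character(), link = character(),
                      snippet = character(), stringsAsFactors = FALSE))
  }
  data.frame(title = items$title, link = items$link,
             snippet = items$snippet, stringsAsFactors = FALSE)
}

# Example call (needs a real key and cx, so it is left commented out):
# url <- build_cse_url("keyword1 keyword2", key = "YOUR_KEY", cx = "YOUR_CX")
# results <- parse_cse_results(paste(readLines(url, warn = FALSE), collapse = ""))
```

Keeping the URL building separate from the parsing means the parsing can be checked on a saved JSON response without spending API quota. The `link` column then gives the pages to scrape.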

  • Web scraping might be part of my task, to extract the text from each result/source found through Google or, preferably, OpenSearch. As I understand it, web scraping needs a webpage as a starting point. In my case, I don't know which pages to scrape, as I am interested in all webpages that contain a set of keywords. Do you know if web scraping would work without a known website as a starting point? – Nowak Aug 07 '19 at 07:56