I am trying to parse the results of a DuckDuckGo search with Net::HTTP and store the result links in an array. However, the response comes back as a string. Is there a way to get the response as another data type, or, failing that, how can I extract the links from the string?

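require 'net/http'

def getlinks(str, num_results)
    # Escape the query so spaces and special characters produce a valid URI
    uri = URI.parse("https://duckduckgo.com/?q=#{URI.encode_www_form_component(str)}")
    # response.body holds the returned HTML as a String
    response = Net::HTTP.get_response(uri)
end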
Amit Kumar Gupta
  • See https://duckduckgo.com/api – Holger Just Aug 07 '19 at 21:07
  • Thanks Holger, but that only works for simple queries. For the query 'whytheluckystaff' it returns an empty result, as they don't have a proper API for search results. This returns empty JSON: https://api.duckduckgo.com/?q=whytheluckystaff&format=json – Zoli Gera Aug 07 '19 at 21:09
  • DuckDuckGo doesn't have the necessary rights to fully allow this (as explained on the site I linked to). As such, using the data you attempt to scrape there will very likely be copyright infringement. – Holger Just Aug 07 '19 at 21:16
  • Well, it is for personal educational use, so it should not breach any copyright. However, I am wondering if there is any solution for my issue. – Zoli Gera Aug 07 '19 at 21:19
  • Try using a gem like [HTTParty](https://github.com/jnunemaker/httparty) for the HTTP requests, and [Nokogiri](https://github.com/sparklemotion/nokogiri) to parse the HTML response – Kuanish Esenbaev Aug 08 '19 at 04:56

1 Answer

An example with Nokogiri:

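require 'nokogiri'
require 'open-uri'

# URI.open is provided by open-uri; escape the query before interpolating it
page = Nokogiri::HTML(URI.open("https://duckduckgo.com/?q=#{URI.encode_www_form_component(str)}").read)
# href of the first <a> element on the page
page.css('a').first['href']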
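
If adding a gem is not an option, the links can also be pulled out of the response string with the standard library alone. The following is only a rough sketch: `search_uri` and `extract_links` are hypothetical helper names, the sample HTML is made up, and a regular expression is not a real HTML parser, so it will miss unquoted or unusually formatted attributes. It also assumes the links are present in the returned HTML at all.

```ruby
require 'uri'

# Hypothetical helper: build the search URL with the query safely escaped.
def search_uri(query)
  URI::HTTPS.build(host: 'duckduckgo.com', path: '/',
                   query: URI.encode_www_form(q: query))
end

# Hypothetical helper: collect the href value of every <a> tag in an HTML
# string. Regex-based, so this is an approximation, not a proper parse.
def extract_links(html, limit = nil)
  links = html.scan(/<a\b[^>]*?\bhref\s*=\s*["']([^"']+)["']/i).flatten
  limit ? links.first(limit) : links
end

# Made-up sample standing in for a response body.
sample = '<p><a href="https://example.com/a">A</a> <a class="x" href=\'/b\'>B</a></p>'

p extract_links(sample)     # every link, as an Array of Strings
p extract_links(sample, 1)  # only the first link
puts search_uri('why the lucky stiff')
```

In practice the string passed to `extract_links` would be the body of the HTTP response, and `limit` plays the role of `num_results` from the question.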
Kris