7

When I do a site-specific search on google.com:

site:http://one-month-of-chat-logs.github.io security

I get 12 results. I signed up for a custom search engine (cx: 015271449006306103053:mz6wkimeenc) and an API key, but I get only 3 results when I run the same search through the API:

$ curl "https://www.googleapis.com/customsearch/v1?key=$MY_API_KEY&cx=015271449006306103053%3Amz6wkimeenc&q=security"

Why do the results differ? Is my API request actually querying something different than the search I performed on google.com?
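To compare the two result sets more systematically, here is a minimal Python sketch of the same request. It assumes the Custom Search JSON API behaves as documented: up to 10 items per request, a `start` parameter for paging, at most 100 results per query, and an estimated count in `searchInformation.totalResults`. The function names and the `MY_API_KEY` environment variable are placeholders for illustration, not part of the API.

```python
import json
import os
import urllib.parse
import urllib.request

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key, cx, query, start=1, num=10):
    """Return a request URL with all parameters percent-encoded."""
    params = {"key": api_key, "cx": cx, "q": query, "start": start, "num": num}
    return API_ENDPOINT + "?" + urllib.parse.urlencode(params)


def fetch_all_results(api_key, cx, query, max_results=100):
    """Page through results 10 at a time.

    Returns (reported_total, items), where reported_total is the API's own
    estimate (a string, or None if absent) and items is what was returned.
    """
    items = []
    reported_total = None
    start = 1
    while start <= max_results:
        url = build_search_url(api_key, cx, query, start=start)
        with urllib.request.urlopen(url) as resp:
            page = json.load(resp)
        if reported_total is None:
            reported_total = page.get("searchInformation", {}).get("totalResults")
        page_items = page.get("items", [])
        items.extend(page_items)
        if len(page_items) < 10:
            break  # a short page means there is nothing further to fetch
        start += 10
    return reported_total, items


if __name__ == "__main__":
    total, items = fetch_all_results(
        os.environ["MY_API_KEY"],  # placeholder variable name
        "015271449006306103053:mz6wkimeenc",
        "security",
    )
    print("reported total:", total, "| items fetched:", len(items))
    for item in items:
        print(item.get("link"))
```

Printing the reported total next to the number of items actually fetched makes it easier to tell whether the API is returning fewer matches or only a shorter first page.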

aaronstacy
  • 2
    Did you ever find a way to overcome this problem? I am using the Google API search and I would really like to at least get results close to the actual ones. Please let me know. – Erick Jan 26 '16 at 14:45
  • I am also interested in getting more complete search results similar to the main google search... Anyone found out a better way? – user32882 Jan 10 '19 at 10:47

2 Answers

7

This Google support page explains it: https://support.google.com/customsearch/answer/70392?hl=en

According to that page, your results are unlikely to match those returned by Google Web Search, for several reasons:

  1. Even if a custom search engine is configured to search the entire web, it’s designed to emphasize results from your own sites.
  2. Your custom search engine doesn’t include Google Web Search features such as Oneboxes, real-time results, universal search, social features, or personalized results.
  3. If your custom search engine includes more than ten sites, the results may be from a subset of our index and may differ from the results of a 'site:' search on Google.com.
Michael
  • my custom search engine is only for one site (http://one-month-of-chat-logs.github.io), so none of these explanations apply. after looking at the setup page for my custom search engine, i noticed it says "Approximate number of pages indexed by Google Search: 25,000", but my site has ~43k small pages. does this mean the cse won't index my entire site? – aaronstacy Nov 25 '13 at 00:18
  • 1
    I did a quick "site:http://one-month-of-chat-logs.github.io" search and got 25k so Google has yet to index the remaining 20k. All pages that are available on Google.com are also available to your search engine, but not necessarily the other way round. Try https://support.google.com/customsearch/answer/94097?hl=en – Michael Nov 25 '13 at 00:27
  • thanks. so i searched for that too: https://encrypted.google.com/#hl=en&q=site%3Aone-month-of-chat-logs.github.io and it's telling me "page 1 of about 3,100 results" at the top just above the result list. where did you see the 25k? – aaronstacy Nov 25 '13 at 01:32
  • 1
    oh that is weird I could have sworn I saw 25300 results at first, now I am getting 3100 results too. I found something: when I search "inurl:one-month-of-chat-logs.github.io", I get 11k results, but after paging through google will suddenly tell me there are only 29 results. I don't know what is going on here. – Michael Nov 25 '13 at 01:55
  • 1
    actually it was because of this message: `In order to show you the most relevant results, we have omitted some entries very similar to the 29 already displayed. If you like, you can repeat the search with the omitted results included.` – Michael Nov 25 '13 at 01:58
  • 1
    but if you do inurl, you will get 11k+. site only returns me 3100 now but I had 25300+ at first. – Michael Nov 25 '13 at 01:58
  • that is strange, now it's telling me 25k. looks like it might be something to do w/ google removing similar results like you said. i think i'll just need to wait for google webmaster tools to tell me they've indexed everything. thanks for your help! – aaronstacy Nov 25 '13 at 14:19
  • @Michael do you know any alternative with real-time results? paid service is fine. – ihsan Dec 06 '17 at 01:34
0

I found that it is impossible to get matching results using the Google APIs. Even if the search covers only one website, the results you get through the UI differ from the results you get through the paid API. My guess is that this is because Google makes more money when it can show ads, while the APIs are only a face-saving measure.

Since some of you are OK with a paid solution (@ihsan), you can try a third-party service such as https://www.expertrec.com, where you can control the crawl (so crawl depth is not a problem) and the ranking (adjust it the way you like), and use either the API or the full solution, without any ads.

melchi