Questions tagged [google-search]

SEARCH ENGINE OPTIMIZATION (SEO) IS OFF-TOPIC. This tag is only for programming questions about the Google search engine.

Google is the most popular search engine in the world. The Google Web Search API has been deprecated in favor of the new Custom Search.

A Google search may not return the results you expect, for reasons that include those given in the answers and comments to What can you NOT find on Google?:

Google does not even attempt

  • To search for keywords that are special characters:

"Generally, punctuation is ignored, including @#$%^&*()=+[]\ and other special characters" -Franck Dernoncourt.

The search term "double unary" works, but "--" does not. See also Google displays my website as a spelling error.

Sites with too much content, with content of little value or that are impractical to index

May include:

  • Sites that don't have a crawlable site map and require Google to provide search terms to access the results available on the site might not be fully indexed. -Josephine Bonaparte
  • Smaller blogs that aren't regularly updated are often dumped from search results. Plus anything that they think is a splog (“a blog which the author uses to promote affiliated websites” -Wikipedia). -David
  • “Most of the Twitter content is not indexed by Google, even if it’s public.
    It used to be available to Google, but that’s no longer the case since their agreement expired.” -Alex
  • “Google does not index Tumblr all that well.
    Blog posts on Tumblr are easier to find using Tumblr search.” -David
  • “everything on Google Sites isn't (or is hardly) indexed.
    If you start a Google site, get your own domain.” -David

Copyright and other protected material

May include:

  • What the government thinks is not good for you. –Hellagot
    The example given was Germany, which “does not show thousands of sites … and the list increases by the thousands every year”.
  • What may infringe intellectual property rights. –einpoklum
    DMCA (Digital Millennium Copyright Act) was mentioned.
  • Census images.
    “Since the content is images that are often manually indexed, they are usually found on paid-for sites like ancestry.com.” –amh

To see which URLs Google has been blocked from crawling, visit the Blocked URLs page of the Crawl section of Webmaster Tools.

Opt outs

  • Content explicitly disallowed by a domain's robots.txt file is excluded from the Google index. -amh
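As a minimal sketch of how such an opt-out can be checked, the snippet below uses Python's standard `urllib.robotparser` to test whether a given user agent may fetch a URL. The robots.txt content, the `example.com` URLs, and the helper name `is_crawl_allowed` are hypothetical and chosen only for illustration; this shows how a rule is evaluated locally, not how Google's own crawler is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Paths under /private/ are disallowed for every user agent in this example.
print(is_crawl_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/private/report.html"))
print(is_crawl_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/index.html"))
```

To check a live site instead, `RobotFileParser` also offers `set_url()` followed by `read()`, which downloads the file before the `can_fetch()` call.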

Technical complications

  • Websites that are not linked from other websites that Google already knows (perhaps from when the domain was under different ownership – Tim Post). That is, there are probably a lot of websites that are not linked from visible pages; those websites will never be found by the Google spider unless they are manually submitted to Google via Webmaster Tools. –amh
  • Websites that are behind web forms that you need to fill out. –amh
  • The Deep Web: “Most of the Web's information is buried far down on dynamically generated sites, and standard search engines do not find it. Traditional search engines cannot "see" or retrieve content in the deep Web—those pages do not exist until they are created dynamically as the result of a specific search. As of 2001, the deep Web was several orders of magnitude larger than the surface Web.” -Wikipedia
  • May include 408 billion web pages saved over time, according to the Wayback Machine. –pnuts
1705 questions
9
votes
2 answers

easiest (legal) way to programmatically get the google search result count?

I want to get the estimated result count for certain Google search engine queries (on the whole web) using Java code. I need to do only very few queries per day, so at first Google Web Search API, though deprecated, seemed good enough (see e.g. How…
Marcus
  • 1,857
  • 4
  • 22
  • 44
8
votes
2 answers

Algorithm behind typo corrections in Google Search

I notice if I make a typo in Google search bar, it is very likely to correct it for me. Like, if I type "incerdible", it will suggest "incredible", or for "stackovflow", it will be "stackoverflow". What is the core idea of such algorithm?
xiaohan2012
  • 9,870
  • 23
  • 67
  • 101
8
votes
2 answers

Making Google direct users to region-specific website or subdomain

I have a good, short website name like mysite.fr. It's a website for my real shop (not an online shop). I have 2 shops. These shops are located in 2 different cities, relatively far away from each other. I could 1) make one website with 2 subdomains…
tmighty
  • 10,734
  • 21
  • 104
  • 218
8
votes
3 answers

How to check Google index count of a url or webpage link

In my project I need this result (Google index count) to be added as an important information. If you have any link from where I can get Google index count of a url, it will be helpful to me.
Tokendra Kumar Sahu
  • 3,524
  • 11
  • 28
  • 29
8
votes
5 answers

Retrieve old searches from Google web history

I want to retrieve old Google searches which I did a few years/months back and that are present in Google web history. How can I programmatically retrieve them all? https://www.google.com/history/?output=rss only provides recent Google searches,…
Pratik
  • 11,534
  • 22
  • 69
  • 99
8
votes
1 answer

How to search for a URL occurrences in Google search results?

I am interested in finding out count of Google search results containing a full URL path. I know we do search for a fully qualified domain or sub-domain. However, have not been able to find a way to search for a path or document. For example: I want…
Ram Iyer
  • 1,404
  • 2
  • 20
  • 27
8
votes
7 answers

SEO: .ca or .com for a Canadian international website

I have both .ca and .com domains of my website. The website is meant for international audience, but it is important to be associated with Canada though. (the website is about Canadian immigration) So the question is should I use .ca domain and 301…
OutFall
  • 482
  • 7
  • 20
8
votes
1 answer

How to promote/open app from Google search website results?

Background: When you search on Google's search engine website on an Android device (via Chrome web browser) something like "how to get to X" (where X is a geographic location, like a city name), the first item that is shown is a card that allows to…
8
votes
2 answers

Getting PageRank

How can I retrieve pagerank of any page indexed by google? Has Google any API or page for this?
oneat
  • 10,778
  • 16
  • 52
  • 70
8
votes
3 answers

How does Google mask the real URLs of links on search results pages?

The following I tested on the latest versions of Chrome and Firefox, and IE11, and the results were the same. If you do a Google search and then mouse over a link on the search results page, the link shown in the bottom-left corner of the browser…
HartleySan
  • 7,404
  • 14
  • 66
  • 119
8
votes
2 answers

iOS Unable to use Books API on google

I am scanning an ISBN code and searching for the book on Google on the basis of that code in my iOS app. I created the app on Google. I have keys for browser apps and I also created keys for the iOS app. Now here is the API I am…
vntstudy
  • 2,038
  • 20
  • 24
8
votes
3 answers

How to access Google Search "I'm Feeling Lucky" functionality using API?

I'm creating a sample app that will take a query from user and will return the URL result returned from Google's "I'm Feeling Lucky" search. Does Google expose this functionality through their API? How to access this?
muraliv
  • 169
  • 1
  • 7
8
votes
5 answers

Programmatically get Google search results

How can I get Google search results from inside a program? I need to get an array of search results for a specified string.
nonpolynomial237
  • 2,109
  • 4
  • 27
  • 35
8
votes
3 answers

GAE development server keep full text search indexes after restart?

Is there any way of forcing the GAE dev server to keep full text search indexes after restart? I am finding that the index is lost whenever the dev server is restarted. I am already using a static datastore path when I launch the dev server (the…
7
votes
1 answer

how to correct spelling mistakes in Google custom API

I am using Google's custom search API, I make an HTTP request to a URL that looks like this: https://www.googleapis.com/customsearch/v1?key=&cref=&num=10&q=how+can+i+do+htis if you search for "how can i do htis" on Google you are told…
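For reference, a request URL of the shape quoted in this excerpt can be assembled with Python's standard `urllib.parse`. This is only a sketch: the endpoint and the parameter names (`key`, `cref`, `num`, `q`) are taken from the excerpt, while the helper name `build_search_url` and the placeholder values `YOUR_API_KEY` and `YOUR_CREF` are hypothetical. No request is sent.

```python
from urllib.parse import urlencode

# Endpoint as quoted in the question excerpt.
BASE_URL = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key: str, cref: str, query: str, num: int = 10) -> str:
    """Build a Custom Search request URL; this does not send the request."""
    params = {"key": api_key, "cref": cref, "num": num, "q": query}
    # urlencode() escapes each value and encodes spaces as '+'.
    return f"{BASE_URL}?{urlencode(params)}"


# Placeholder credentials; substitute real values before use.
url = build_search_url("YOUR_API_KEY", "YOUR_CREF", "how can i do htis")
print(url)
```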