Questions tagged [google-search]

SEARCH ENGINE OPTIMIZATION (SEO) IS OFF-TOPIC. This tag is only for programming questions about the Google search engine.

Google is the most popular search engine in the world. The Google Web Search API has been deprecated in favor of the new Custom Search.
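As a rough sketch of what a programmatic query looks like, the Python below builds a request URL for the Custom Search JSON API. The endpoint and the key, cx, q and num parameters are part of Google's public API; the helper name build_search_url and the placeholder credentials are hypothetical and only for illustration. No network request is made.

```python
from urllib.parse import urlencode

# Public endpoint of the Custom Search JSON API.
ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key: str, engine_id: str, query: str, num: int = 10) -> str:
    """Return a Custom Search request URL (hypothetical helper, not part of any library)."""
    # The API returns at most 10 results per request.
    if not 1 <= num <= 10:
        raise ValueError("num must be between 1 and 10")
    params = {"key": api_key, "cx": engine_id, "q": query, "num": num}
    # urlencode percent-encodes values and joins them with "&".
    return f"{ENDPOINT}?{urlencode(params)}"


# Placeholder credentials: replace with a real API key and search engine ID.
url = build_search_url("YOUR_API_KEY", "YOUR_ENGINE_ID", "double unary")
print(url)
```

Fetching that URL with any HTTP client should return a JSON document, assuming a valid key and search engine ID.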

A Google search may not return the expected results, for reasons that include those given in the answers and comments to What can you NOT find on Google?:

Google does not even attempt

  • To search for keywords that are special characters:

"Generally, punctuation is ignored, including @#$%^&*()=+[]\ and other special characters" -Franck Dernoncourt.

The search term double unary works but not --. See also Google displays my website as a spelling error.

Sites with too much content, with content of little value or that are impractical to index

May include:

  • Sites that don't have a crawlable site map and require Google to provide search terms to access the results available on the site might not be fully indexed. -Josephine Bonaparte
  • Smaller blogs that aren't regularly updated are often dumped from search results. Plus anything that they think is a splog (“a blog which the author uses to promote affiliated websites” -Wikipedia). -David
  • “Most of the Twitter content is not indexed by Google, even if it’s public.
    It used to be available to Google, but that’s no longer the case since their agreement expired.” -Alex
  • “Google does not index Tumblr all that well.
    Blog posts on Tumblr are easier to find using Tumblr search.” -David
  • “everything on Google Sites isn't (or is hardly) indexed.
    If you start a Google site, get your own domain.” -David

Copyright and other protected material

May include:

  • What the government thinks is not good for you. –Hellagot
    The example given was Germany, which “does not show thousands of sites … and the list increases by the thousands every year”.
  • What may infringe intellectual property rights. –einpoklum
    DMCA (Digital Millennium Copyright Act) was mentioned.
  • Census images.
    “Since the content [is] images that are often manually index[ed], they [are] usually found on paid-for sites like ancestry.com.” –amh

To see which URLs Google has been blocked from crawling, visit the Blocked URLs page of the Crawl section of Webmaster Tools.

Opt outs

  • Content explicitly disallowed by a domain's robots.txt file is excluded from the Google index. -amh
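To illustrate how a robots.txt rule set can be evaluated, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt content and the example.com URLs are made up, and is_allowed is a hypothetical helper. Python's parser is not Google's own matcher, so edge cases such as overlapping Allow and Disallow rules may be resolved differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot may crawl everything except /private/,
# all other crawlers are blocked from the whole site.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given rules."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/index.html"))
print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/private/report.html"))
```

Under these assumed rules, the first call reports that crawling is allowed and the second that it is not.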

Technical complications

  • Websites that are not linked from other websites that Google already knows (perhaps from when the domain was under different ownership – Tim Post). That is, there are probably a lot of websites that are not linked from visible pages; those websites are never going to be found by the Google spider unless they are manually submitted to Google via Webmaster Tools. –amh
  • Websites that are behind web forms that you need to fill out. –amh
  • The Deep Web: “Most of the Web's information is buried far down on dynamically generated sites, and standard search engines do not find it. Traditional search engines cannot "see" or retrieve content in the deep Web—those pages do not exist until they are created dynamically as the result of a specific search. As of 2001, the deep Web was several orders of magnitude larger than the surface Web.” -Wikipedia
  • May include the 408 billion web pages saved over time, according to the Wayback Machine. –pnuts
1705 questions
29
votes
4 answers

how to force google to re-index a page

A website I've made was recently hacked, and Google indexed the hacked homepage, so it's now showing irrelevant text in search results. The hack has been resolved but the search results haven't changed. Is there a way to force Google to re-index…
Dominic Mercier
  • 808
  • 2
  • 9
  • 17
29
votes
1 answer

Does JSON-LD have to be embedded?

We are currently using the Microdata format to expose data to search engines and we are looking at exposing more info to be able to support some more advanced Google Search features. As I'm working my way through the fields I'm finding I need…
Ryan B
  • 681
  • 7
  • 14
29
votes
5 answers

How can I add a Google search box to my website?

I am trying to add a Google search box to my own website. I would like it to search Google itself, not my site. There was some code I had that used to work, but no longer does:
wahle509
  • 666
  • 3
  • 9
  • 22
26
votes
8 answers

Does google index pages with hidden divs?

I am starting to redesign and develop a site that contains a lot of text and I am thinking of ways to organize the information on the site so that it looks cleaner. On some parts of the site I would like to implement a jquery toggle effect where…
Yin
  • 371
  • 1
  • 4
  • 9
24
votes
3 answers

List of JSON search engine APIs without quotas, like Bing?

I'd like to display some custom search results. I've looked at the JSON APIs of both Google and Microsoft (Bing). Unfortunately, Google has a limit on the amount of queries a day ($50 for a maximum of ten thousand queries). However, Bing allows an…
Tom
  • 8,536
  • 31
  • 133
  • 232
22
votes
7 answers

Google Search Web Scraping with Python

I've been learning a lot of Python lately to work on some projects at work. Currently I need to do some web scraping of Google search results. I found several sites that demonstrated how to use the Google AJAX API to search, however after attempting…
pbell
  • 265
  • 1
  • 4
  • 7
22
votes
2 answers

Force Google searches to not return results without the search terms

So I've sometimes searched Google for certain pages and Google will actually return pages that don't have my searched terms. For example, if I search for analytic proof dihedral homomorphism (I don't currently actually want to search for this, it's…
Addem
  • 3,635
  • 3
  • 35
  • 58
21
votes
4 answers

Why does Google Search return HTTP Error 403?

Consider the following Python code: url = "http://www.google.com/search?hl=en&safe=off&q=Monkey"; url_object = urllib.request.urlopen(url); print(url_object.read()) When this is run, an exception is thrown: File…
AgentLiquid
  • 3,632
  • 7
  • 26
  • 30
19
votes
1 answer

Register new game for YouTube Gaming

We're developers of mobile games. I'm looking for a way to add our game to the list of games on YouTube. We would like the name of the game to appear below videos on YouTube. I have read many posts and articles about that, but I'm stuck on an issue…
17
votes
2 answers

How to make google search results default to python3 docs

I like to use google when I'm searching for documentation on things related to python. Many times what I am looking for turns out to be in the official python documentation on docs.python.org. Unfortunately, at time of writing, the docs for the…
Michael Hewson
  • 1,444
  • 13
  • 21
17
votes
4 answers

API alternative to Google trends

Is there any API that I can use to rank search terms according to their popularity? An official Google API was announced, but none has been released. Any suggestions on what I could use as an alternative?
dungeon_master
  • 191
  • 1
  • 1
  • 8
16
votes
2 answers

Meaning of parameters in a Google search query?

Are there any resources on what the parameters in a Google query mean? Any analysis how the Google search pages work internally? Examples would…
blinry
  • 4,746
  • 4
  • 26
  • 31
15
votes
4 answers

How to disable AMP caching from Google Search?

Some results on Google Search come with an AMP (Accelerated Mobile Pages) icon on their links, at least on mobile. As soon as you click the link, instead of loading the site, Google shows you a cached version of it. I want to disable…
Tiago Pimenta
  • 736
  • 7
  • 20
15
votes
1 answer

Get first image from google

I want to display the first Google image search result for a keyword on my website. I would really like some pointers in this direction. Thank you!
darthsidious
  • 2,851
  • 3
  • 19
  • 30
15
votes
2 answers

What algorithm does google use to make Chrome browser's address bar to act as a default search bar for many websites?

I am wondering what algorithm Google uses to make the Chrome browser's address bar act as a default search bar for many websites like SO, Quora, etc., but not for Facebook, metastackoverflow, etc. For example if you want to search for a topic in…
user1518659
  • 2,198
  • 9
  • 29
  • 40