Questions tagged [google-search]

SEARCH ENGINE OPTIMIZATION (SEO) IS OFF-TOPIC. This tag is only for programming questions about the Google search engine.

Google is the most popular search engine in the world. The Google Web Search API has been deprecated in favor of the new Custom Search.
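
As a rough illustration of what a Custom Search request looks like, the sketch below only builds a request URL for the Custom Search JSON API and makes no network call. The helper name build_cse_url and the YOUR_API_KEY and YOUR_ENGINE_ID placeholders are assumptions for this example; a real request needs your own API key and search engine ID.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint of the Custom Search JSON API.
CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_cse_url(query, api_key, engine_id, num=10):
    """Return a request URL for a Custom Search query (hypothetical helper).

    No request is sent; the caller decides how to fetch the URL.
    """
    params = {
        "key": api_key,   # placeholder: your API key
        "cx": engine_id,  # placeholder: your search engine ID
        "q": query,       # the search terms
        "num": num,       # number of results to return per request
    }
    return CSE_ENDPOINT + "?" + urlencode(params)


url = build_cse_url("double unary", "YOUR_API_KEY", "YOUR_ENGINE_ID", num=5)
print(url)

# Round-trip the query string to confirm the terms were encoded as expected.
print(parse_qs(urlsplit(url).query)["q"])
```

Building the URL separately from fetching it keeps the example testable offline and makes it easy to log exactly what was requested.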

A Google search may not return the results you expect. Reasons include those mentioned in the answers and comments to What can you NOT find on Google?:

Google does not even attempt

  • To search for keywords that are special characters:

"Generally, punctuation is ignored, including @#$%^&*()=+[]\ and other special characters" -Franck Dernoncourt.

The search term double unary works, but -- does not. See also Google displays my website as a spelling error.

Sites with too much content, with content of little value or that are impractical to index

May include:

  • Sites that don't have a crawlable site map and that require Google to provide search terms to reach the results available on the site might not be fully indexed. -Josephine Bonaparte
  • Smaller blogs that aren't regularly updated are often dumped from search results. Plus anything that they think is a splog (“a blog which the author uses to promote affiliated websites” -Wikipedia). -David
  • “Most of the Twitter content is not indexed by Google, even if it’s public.
    It used to be available to Google, but that’s no longer the case since their agreement expired.” -Alex
  • “Google does not index Tumblr all that well.
    Blog posts on Tumblr are easier to find using Tumblr search.” -David
  • “everything on Google Sites isn't (or is hardly) indexed.
    If you start a Google site, get your own domain.” -David

Copyright and other protected material

May include:

  • What the government thinks is not good for you. –Hellagot
    The example given was Germany, where Google “does not show thousands of sites … and the list increases by the thousands every year”.
  • What may infringe intellectual property rights. –einpoklum
    DMCA (Digital Millennium Copyright Act) was mentioned.
  • Census images.
    “Since the content is images that are often manually indexed, they are usually found on paid-for sites like ancestry.com.” –amh

To see which URLs Google has been blocked from crawling, visit the Blocked URLs page of the Crawl section of Webmaster Tools.

Opt outs

  • Content explicitly disallowed by a domain's robots.txt file is excluded from the Google index. -amh
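
To illustrate how a crawler might consult robots.txt before fetching a page, here is a minimal sketch using Python's standard urllib.robotparser module. The rules, the example.com URLs, and the is_allowed helper are made up for this example, and the standard-library parser is not Google's own implementation, so edge cases may be interpreted differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every crawler is barred from /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A path under the disallowed directory.
print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/private/page.html"))
# A path outside the disallowed directory.
print(is_allowed(ROBOTS_TXT, "Googlebot", "https://example.com/public/page.html"))
```

The first call reports that the path is disallowed and the second that it is allowed, since only /private/ is listed under the wildcard user agent.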

Technical complications

  • Websites that are not linked from other websites that Google already knows (perhaps from when the domain was under different ownership – Tim Post). There are probably many websites that are never linked from visible pages; the Google spider will never find them unless they are manually submitted to Google via Webmaster Tools. –amh
  • Websites that are behind web forms that you need to fill out. –amh
  • The Deep Web: “Most of the Web's information is buried far down on dynamically generated sites, and standard search engines do not find it. Traditional search engines cannot "see" or retrieve content in the deep Web—those pages do not exist until they are created dynamically as the result of a specific search. As of 2001, the deep Web was several orders of magnitude larger than the surface Web.” -Wikipedia
  • May include 408 billion web pages saved over time, according to the Wayback Machine. –pnuts

1705 questions
12
votes
4 answers

How to get the referer search query from google?

As recently as two days ago, the following code worked to get the search query from google: $refer = parse_url($_SERVER['HTTP_REFERER']); $host = parse_url($_SERVER['HTTP_REFERER'], PHP_URL_HOST); $query = parse_url($_SERVER['HTTP_REFERER'],…
mattmattmatt
  • 965
  • 3
  • 15
  • 29
11
votes
0 answers

programmatically access google search watchlist

Google offers you the possibility with Google search to track, like and rate movies or TV shows. You can see those options blended with the information of a given movie when searching for it. You then can have access to that data later by searching…
Joel Barenco
  • 323
  • 1
  • 11
11
votes
5 answers

Getting Good Google PageRank

In SEO people talk a lot about Google PageRank. It's kind of a catch 22 because until your site is actually big and you don't really need search engines as much, it's unlikely that big sites will link to you and increase your PageRank! I've been…
Tyler
  • 3,220
  • 1
  • 30
  • 44
11
votes
4 answers

Google 404 soft error on index page that is working fine

A friend of mine has been having trouble getting her site indexed by google and asked me to have a look, but that is not something I really know much about and was hoping for some assistance. Looking at her search console, google crawl shows an…
11
votes
1 answer

How add my application content with description and preview to Google search results?

I found this article: A new way to search for content in your apps and I'm really excited for this opportunity. I want to show my application content in google search results, like this: But this article doesn't have any information about how to…
ArtKorchagin
  • 4,801
  • 13
  • 42
  • 58
11
votes
1 answer

Can i override Google SERP with automatic IP location determination programmatically?

I mean, I can specify preferred SERP location search options on the Google SERP page, for sure. I guess there is a way to make it happen with curl. Here is some history. I've used SEOStats on my project. It's good. Then I've got interesting article…
Andrew Rumm
  • 1,268
  • 4
  • 16
  • 39
10
votes
3 answers

Google Search API site limit

According to the Google custom search API's docs: http://code.google.com/apis/customsearch/docs/start.html#sites there is a limit of up to 5000 sites that you can search. This is pretty lame. Is there any way around this so that I can search the…
Justin Meltzer
  • 13,318
  • 32
  • 117
  • 182
10
votes
1 answer

Google for Jobs Location of job showing up is incorrect (Google is using our Company HQ)

I am having an issue with multiple Job Postings on our website. We offer jobs in multiple locations across Canada. All job pages contain "LD+JSON" structured data for a jobPosting, based on Google's documentation for JobPostings: …
JFTxJ
  • 542
  • 6
  • 17
10
votes
2 answers

How does Google count and estimate the number of a search results?

How does Google count and estimate the number of search results? For example, when I search "stackoverflow," it counts 2,910,000 results.
jozi
  • 2,833
  • 6
  • 28
  • 41
10
votes
1 answer

What language was Google written in at the beginning?

I wonder what language Larry Page and Sergey Brin wrote Google with? I'm not talking about the languages that are used today, but about the version they launched from their dorm.
never_had_a_name
  • 90,630
  • 105
  • 267
  • 383
10
votes
4 answers

Is there an API available for Google's Related Search Queries?

When I search for Clooney at google.com a number of related search queries are suggested at the bottom: Searches related to clooney clooney movies clooney girlfriend rosemary clooney nick clooney clooney oscar obama clooney clooney fundraiser betty…
Tom
  • 8,536
  • 31
  • 133
  • 232
10
votes
5 answers

How to construct complex Google Web Search query?

Searching through the Web by using the Google search engine is a de facto standard for Internet users. Google provides a basic or an advanced form to prepare a query string to its search engine. Supposing to be interested in not using the web form,…
JeanValjean
  • 17,172
  • 23
  • 113
  • 157
9
votes
1 answer

How to configure robots.txt file to block all but 2 directories

I don't want any search engines to index most of my website. I do however want search engines to index 2 folders (and their children). This is what I set up, but I don't think it works, I see pages in Google that I wanted to hide: Here's…
jeph perro
  • 6,242
  • 26
  • 90
  • 124
9
votes
1 answer

Retrieving an entire website using Google Cache?

There is a site that I want to retrieve from Google Cache that had thousands of pages. Is there any way I can get it back quickly using Google Cache or some other web crawler/archiver?
stockoverflow
  • 1,457
  • 5
  • 18
  • 22
9
votes
2 answers

How to search Google by specific URL?

I want to search for results from a specific site but only from a specific section of the site, I want to exclude results that have url segments, for example: What I want to search for: http://domain.com/productx What I want to…
condo1234
  • 3,285
  • 6
  • 25
  • 34