Questions tagged [search-engine]

A search engine is a program that searches documents for specified keywords and returns a list of the documents where the keywords were found.

Although "search engine" really names a general class of programs, the term is often used specifically for systems like Google, Yahoo!, Yandex, and Excite that let users search for documents on the World Wide Web and USENET newsgroups.

2920 questions
32
votes
5 answers

Improving search result using Levenshtein distance in Java

I have the following working Java code for searching for a word against a list of words and it works perfectly and as expected: public class Levenshtein { private int[][] wordMartix; public Set similarExists(String searchWord) { …
Maytham Fahmi
  • 31,138
  • 14
  • 118
  • 137
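
For readers of the Levenshtein question above, here is a minimal sketch of the edit-distance computation and a filter built on it. It is written in Python rather than the question's Java, and it is not the asker's code: the names `levenshtein`, `similar_words`, and `max_distance` are hypothetical and chosen only for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b, keeping only two rows of the table."""
    if len(a) < len(b):
        a, b = b, a  # iterate the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def similar_words(search_word, words, max_distance=2):
    """Return the words within max_distance edits, closest first."""
    scored = [(levenshtein(search_word, w), w) for w in words]
    return [w for d, w in sorted(scored) if d <= max_distance]
```

Sorting by distance puts the closest matches first, and ties fall back to alphabetical order because the tuples are compared element by element.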
31
votes
4 answers

Use of indexes for multi-word queries in full-text search (e.g. web search)

I understand that a fundamental aspect of full-text search is the use of inverted indexes. So, with an inverted index a one-word query becomes trivial to answer. Assuming the index is structured like this: some-word -> [doc385, doc211, doc39977,…
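
One common way to answer a multi-word AND query with an inverted index is to intersect the sorted posting lists of the individual terms. The sketch below illustrates that idea under simple assumptions (whitespace tokenization, integer document ids, everything in memory); `build_index`, `intersect`, and `search_all` are hypothetical names, not part of any particular search library.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to a sorted list of the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def intersect(a, b):
    """Merge-style intersection of two sorted posting lists."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def search_all(index, query):
    """Return ids of documents that contain every term in the query."""
    postings = [index.get(term, []) for term in query.lower().split()]
    if not postings:
        return []
    postings.sort(key=len)  # start with the rarest term to keep results small
    result = postings[0]
    for plist in postings[1:]:
        result = intersect(result, plist)
        if not result:
            break  # no document can match once the intersection is empty
    return result
```

Starting from the shortest posting list means the intermediate result can never grow larger than the rarest term's list.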
31
votes
2 answers

Elasticsearch - How to normalize score when combining regular query and function_score?

Ideally, what I am trying to achieve is to assign weights to queries such that query1 constitutes 30% of the final score and query2 constitutes the other 70%, so to achieve the maximum score a document has to have the highest possible score on query1 and…
JohnnyM
  • 1,273
  • 1
  • 13
  • 26
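
The 30%/70% weighting described in the question above can be illustrated with plain arithmetic, independent of Elasticsearch. The sketch below assumes the raw scores of both queries are already available as dictionaries keyed by document id, rescales each set to the range 0 to 1 with min-max normalization, and then mixes them. The helpers `min_max` and `combine` are hypothetical and are not Elasticsearch APIs.

```python
def min_max(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so treat every document as a top match.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def combine(scores1, scores2, w1=0.3, w2=0.7):
    """Weighted sum of two normalized score sets; missing docs count as 0."""
    n1, n2 = min_max(scores1), min_max(scores2)
    docs = set(n1) | set(n2)
    return {d: w1 * n1.get(d, 0.0) + w2 * n2.get(d, 0.0) for d in docs}
```

With this scheme a document reaches the maximum combined score of 1.0 only when it has the highest score on both queries.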
30
votes
7 answers

Search engine solution for Django that actually works?

The story so far: Decided to go with Xapian as search backend because it has all search-engine features I was looking for, knows about Unicode, stemming, has few dependencies and requires no bloated app-server installation on top of it. Tried Django…
nikola
  • 2,241
  • 4
  • 30
  • 42
29
votes
9 answers

Which are the best alternatives to Lucene?

It may run on Unix and it will be used for email searching (Dovecot, Postfix and maildir). Lucene is not a problem, I'm just analyzing some alternatives.
Rui Carneiro
  • 5,595
  • 5
  • 33
  • 39
26
votes
10 answers

How to download google image search results in Python

This question has been asked numerous times before, but all answers are at least a couple years old and currently based on the ajax.googleapis.com API, which is no longer supported. Does anyone know of another way? I'm trying to download a hundred…
xanderflood
  • 826
  • 2
  • 12
  • 22
26
votes
4 answers

Marking up a search result list with HTML5 semantics

Making a search result list (like in Google) is not very hard, if you just need something that works. Now, however, I want to do it with perfection, using the benefits of HTML5 semantics. The goal is to define the de facto way of marking up a search…
Johan
  • 5,003
  • 3
  • 36
  • 50
26
votes
3 answers

Connect to SphinxQL through Linux command-line

I am trying to connect to SphinxQL server through Linux command-line this way: > mysql -P 9306 ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO) My Sphinx config file has 2 listen entries: listen =…
snippetsofcode
  • 937
  • 2
  • 10
  • 10
25
votes
6 answers

Is there any free unlimited album artwork search API service?

Google's custom search API has a limitation up to 100 queries per day. That is far less than what I expected. I want to add that artwork-search function to my app. Thanks a lot.
Li Fumin
  • 1,383
  • 2
  • 15
  • 31
24
votes
3 answers

List of JSON search engine APIs without quotas, like Bing?

I'd like to display some custom search results. I've looked at the JSON APIs of both Google and Microsoft (Bing). Unfortunately, Google has a limit on the number of queries per day ($50 for a maximum of ten thousand queries). However, Bing allows an…
Tom
  • 8,536
  • 31
  • 133
  • 232
23
votes
9 answers

Google-like Search Engine in PHP/mySQL

We have OCRed thousands of pages of newspaper articles. The newspaper, issue, date, page number and OCRed text of each page have been put into a mySQL database. We now want to build a Google-like search engine in PHP to find the pages given a query.…
lkessler
  • 19,819
  • 36
  • 132
  • 203
22
votes
5 answers

Can search engine spiders see content I add using jQuery?

I currently have something like this

Will search engines be able to spider the "hey" text? And if yes, what…
Matthew Hui
  • 3,321
  • 2
  • 27
  • 38
22
votes
5 answers

How do I do a partial field match using Haystack?

I needed a simple search tool for my Django-powered web site, so I went with Haystack and Solr. I have set everything up correctly and can find the correct search results when I type in the exact phrase, but I can't get any results when typing in a…
Ben S
  • 1,407
  • 1
  • 13
  • 27
22
votes
11 answers

What's a good source code search engine?

The codebase I work on is huge, and grepping it takes about 20 minutes. I'm looking for a good web-based source code search engine, something like an intranet version of koders.com. The only thing I've found is Krugle Enterprise Edition, which…
toohool
  • 1,067
  • 1
  • 9
  • 13
21
votes
8 answers

Can search engines index JavaScript generated web pages?

Can search engines such as Google index JavaScript-generated web pages? When you right-click and select "view source" in a page that is generated by JavaScript (e.g. using GWT), you do not see the dynamically generated HTML. I suppose that if a search…
Roy
  • 318
  • 3
  • 8