Questions tagged [search-engine]

A search engine is a program that searches documents for specified keywords and returns a list of the documents where the keywords were found.

Although "search engine" really denotes a general class of programs, the term is often used specifically to describe systems like Google, Yahoo!, Yandex and Excite that enable users to search for documents on the World Wide Web and USENET newsgroups.

2920 questions

votes

10 answers

What are some Search Servers out there?

I'm looking for alternatives to Solr from the Apache Software Foundation. For those who don't know, Solr is an enterprise search server. A client application uses a web-service-like interface to submit documents for indexing and also to…

asked Sep 15 '08 at 21:01

bpapa

21,409
25
99
147

votes

13 answers

How would you design a good search UI?

I want to provide my users with an 'advanced' search engine. I basically have a lot of search criteria to choose from: some are very simple/common and will be widely used (e.g. time period, item ID), some are a bit less mainstream, and some won't be…

user-interface usability search-engine

asked Feb 28 '09 at 19:03

Brann

31,689
32
113
162

votes

2 answers

Recommendable Maven repository search engines?

mavensearch.net doesn't know current versions in many cases; mvnrepository.com is a bit more up to date but doesn't show the repositories from which a package can be downloaded, which I would find very useful. What Maven repository search engines do…

java maven-2 repository search-engine

asked Aug 07 '10 at 12:31

deamon

89,107
111
320
448

votes

4 answers

Strategy for how to crawl/index frequently updated webpages?

I'm trying to build a very small, niche search engine, using Nutch to crawl specific sites. Some of the sites are news/blog sites. If I crawl, say, techcrunch.com, and store and index their frontpage or any of their main pages, then within hours my…

web-crawler search-engine

asked Apr 26 '12 at 10:38

OdieO

6,836
7
56
88

votes

7 answers

Google search console fails to fetch sitemaps | "Sitemap could not be read"

I generated a sitemap with an online generator. It seems to work, and I even tested it with the old Google Search Console sitemap tester, where it passes. But when I submit it in both versions, it just displays an error message.

seo search-engine sitemap google-search-console

asked Dec 25 '18 at 10:49

user9480491

votes

5 answers

An alternative web crawler to Nutch

I'm trying to build a specialised search engine web site that indexes a limited number of web sites. The solution I came up with is: using Nutch as the web crawler, using Solr as the search engine, and the front-end and the site logic are coded with…

search-engine web-crawler nutch

asked Nov 24 '10 at 17:24

wassimans

8,382
10
47
58

votes

2 answers

ElasticSearch: search inside the array of objects

I have a problem with querying objects in an array. Let's create a very simple index, add a type with one field and add one document with an array of objects (I use the Sense console): PUT /test/ PUT /test/test/_mapping { "test": { "properties": { …

elasticsearch search-engine

asked Jun 25 '15 at 23:06

Nikita

4,435
3
24
44

votes

5 answers

Internationalization and Search Engine Optimization

I'd like to internationalize my site such that it's accessible in many languages. The language setting will be detected in the request data automatically, and can be overridden in the user's settings / stored in the session. My question pertains to…

seo internationalization search-engine url-routing

asked Dec 01 '09 at 19:28

Matt Huggins

81,398
36
149
218

votes

5 answers

How to prevent staging from being indexed by search engines

I would like my staging web sites not to be indexed by search engines (Google first of all). I have heard WordPress is good at doing this, but I would like to be technology-agnostic. Is robots.txt enough? We would like to keep anonymous…

web search-engine robots.txt nofollow

asked Aug 30 '12 at 13:27

toutpt

5,145
5
38
45

votes

4 answers

How does a full text search server like Sphinx work?

Can anyone explain in simple words how a full text server like Sphinx works? In plain SQL, one would use SQL queries like this to search for certain keywords in texts: select * from items where name like '%keyword%'; But in the configuration files…

sql full-text-search search-engine sphinx thinking-sphinx

asked Apr 24 '12 at 09:34

0x4a6f4672

27,297
17
103
140
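As a rough illustration of the idea behind the question above: a full-text server typically builds an inverted index ahead of time instead of scanning every row the way `like '%keyword%'` does. The sketch below is a generic, in-memory toy under that assumption, not Sphinx's actual implementation; the names `tokenize`, `build_index`, `search` and the sample `items` are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample data standing in for the `items` table.
items = {
    1: "Red keyword widget",
    2: "Blue widget",
    3: "Another keyword here",
}
index = build_index(items)
matches = search(index, "keyword")
```

A lookup touches only the entries for the query terms, which is why it can stay fast as the number of documents grows, whereas a leading-wildcard `like` generally has to inspect every row.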

votes

1 answer

Is it possible to link directly to Google search results using href?

I would like to link directly to a search results page from a standard link. To give an example of what I'm hoping for, here is some pseudocode: Click here to search…

html hyperlink href search-engine

asked Mar 24 '16 at 16:18

Frank

2,050
6
22
40
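One possible way to build such a link, assuming the commonly used `https://www.google.com/search?q=…` URL form, is to URL-encode the query and place it in an ordinary `href`. The helper names `google_search_url` and `search_link` are hypothetical and only for illustration.

```python
from html import escape
from urllib.parse import urlencode


def google_search_url(query):
    """Build a search URL with the query safely URL-encoded."""
    return "https://www.google.com/search?" + urlencode({"q": query})


def search_link(query, label):
    """Render a plain HTML anchor pointing at the search results page."""
    url = google_search_url(query)
    return f'<a href="{escape(url, quote=True)}">{escape(label)}</a>'


link = search_link("search engine basics", "Click here to search")
```

Encoding the query matters for terms containing spaces or characters such as `&` and `+`, which would otherwise change the meaning of the URL.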

votes

10 answers

How does a search engine rank millions of pages within 1 second?

I understand the basics of search engine ranking, including the ideas of "reverse index", "vector space model", "cosine similarity", "PageRank", etc. However, when a user submits a popular query term, it is very likely that millions of pages…

sorting search-engine

asked Oct 03 '13 at 14:34

user1036719

1,036
3
15
32

votes

6 answers

SOLR Permissions / Filtering Results depending on Access Rights

For example I have Documents A, B, C. User 1 must only be able to see Documents A, B. User 2 must only be able to see Document C. Is it possible to do it in SOLR without filtering by metadata? If I use metadata filter, every time there are access…

solr search-engine

asked Feb 10 '12 at 04:47

Manny

6,277
3
31
45

votes

4 answers

Is it possible to control the crawl speed by robots.txt?

We can tell bots to crawl or not to crawl our website in robots.txt. On the other hand, we can control the crawling speed in Google Webmasters (how much Googlebot crawls the website). I wonder if it is possible to limit the crawler activities by…

search-engine robots.txt google-crawlers

asked Oct 16 '11 at 20:56

Googlebot

15,159
44
133
229
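For context on the question above: some crawlers honor a non-standard `Crawl-delay` directive in robots.txt, while others ignore it. The sketch below shows how a well-behaved crawler might read that value with Python's standard `urllib.robotparser`; the robots.txt content, the `ExampleBot` user agent and the example.com URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; Crawl-delay is non-standard and
# only takes effect for crawlers that choose to honor it.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Seconds the crawler is asked to wait between requests, or None if unset.
delay = parser.crawl_delay("ExampleBot")

# Whether the given user agent may fetch each URL under these rules.
private_allowed = parser.can_fetch("ExampleBot", "https://example.com/private/page")
public_allowed = parser.can_fetch("ExampleBot", "https://example.com/index.html")
```

Because the directive is advisory, it limits only crawlers that read and respect it.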

votes

4 answers

Ruby on Rails, How to determine if a request was made by a robot or search engine spider?

I have Rails apps that record the IP address of every request to a specific URL, but in my IP database I've found Facebook block IPs like 66.220.15.* and Google IPs (I suspect they come from bots). Is there any formula to determine whether an IP from a request was…

ruby-on-rails ruby-on-rails-3 search-engine web-crawler

asked May 04 '11 at 10:51

Agung Prasetyo

4,353
5
29
37
