Questions tagged [lucene]

The term "Lucene" refers to the open source Java full-text search engine library, and also to the entire ecosystem that grew around it, including lucene.net, solr, elasticsearch and zend-search-lucene. "Lucene" may also be used to refer to top-level projects like Nutch and Tika, which were once sub-projects of Lucene.

Use the "Lucene" tag if any of the following applies:

  • The question is about the Java library
  • The question is about a port of the library, but would make sense to people who know the Java library (many Lucene.NET questions match this criterion).
  • The question is so general it doesn't apply to a specific implementation (example).

References:

Basic Demo:

A basic "getting started" demo showing how to build and query an index is provided as part of the official documentation:

Basic Demo documentation (this link is for Lucene v8.7.0; newer versions may be available)

Links to the demo's source files are provided in the above documentation.

The source code can also be found here on GitHub.

Luke - a Lucene GUI Client:

Luke is a GUI client application which can be used to explore your Lucene indexes. Recent versions of Luke are now provided as part of each binary release, which can be downloaded from here.

After downloading the binary release, unzip it, and go to the luke directory. Launch the client using the provided luke.bat or luke.sh scripts.

11993 questions
20 votes, 7 answers

How to create new core in Solr 5?

Currently we are using Apache Solr 4.10.3 or the Heliosearch Distribution for Solr [HDS] as a search engine to index our data. I then heard about the Apache Solr 5.0.0 release last month. I successfully installed Apache Solr 5.0.0…
immayankmodi
20 votes, 2 answers

Solr Text field and String field - different search behaviour

I am working on Solr 4+. I have several fields in my Solr schema with different Solr field types. Does searching on a text field differ from searching on a string field? Because I am trying to search on a string field (which is a copy field of a few facet fields)…
Ankita
20 votes, 4 answers

Show contents of Lucene index

I am trying to debug indexing documents in Lucene, and I need to see the contents of the index so I can see exactly how the documents got indexed. Allegedly Luke does this, but there is no documentation for it whatsoever, and when I point it at the…
cbmanica
19 votes, 4 answers

Security (aka Permissions) and Lucene - How ? Should it be done?

First, some background to my question. Individual entities can have read permissions. If a user fails a read permission check, they can't see that instance. The problem relates to introducing Lucene and performing a search which simply returns a list…
mP.
19 votes, 2 answers

Mimic Elasticsearch MatchQuery

I'm writing a program that currently uses Elasticsearch as a back-end database/search index. I'd like to mimic the functionality of the /_search endpoint, which currently uses a match query: { "query": { "match" : { …
Blue
19 votes, 1 answer

What's the difference between query_string and multi_match?

When running these queries: { "query_string" : { "query" : "text", "fields": ["field1", "field2"] } } - { "multi_match" : { "query" : "text", "fields": ["field1", "field2"] } } What is the difference? When to use one and…
Félix Sanz
19 votes, 1 answer

Paging Lucene's search results

I am using Lucene to show search results in a web application. I am also using custom paging to display them. Search results could vary from 5000 to 10000 or more. Can someone please tell me the best strategy for paging and caching the search results?
user41625
19 votes, 2 answers

Build a Kibana Histogram with buckets dynamically created by ElasticSearch terms aggregation

I want to be able to combine the functionality of the Kibana Terms Graph (be able to create buckets based on uniqueness of values from a particular attribute) and Histogram Graph (separate data into buckets based on queries and then illustrate the…
ecbrodie
19 votes, 1 answer

Lucene's algorithm

I read the paper by Doug Cutting, "Space optimizations for total ranking". Since it was written a long time ago, I wonder what algorithms Lucene uses (regarding postings list traversal and score calculation, ranking). Particularly, the total ranking…
19 votes, 2 answers

Lucene: exception - Query parser encountered after "some word"

I am working on a classification problem to classify product reviews as positive, negative or neutral as per the training data using Lucene API. I am using an ArrayList of Review objects - "reviewList" that stores the attributes for each review…
Reema
18 votes, 3 answers

How to get all documents of lucene index?

I have created a Lucene index. I would like to get all documents, sorted by a field, with no search terms!
wudan
18 votes, 2 answers

Howto perform a 'contains' search rather than 'starts with' using Lucene.Net

We use Lucene.NET to implement a full-text search on a client's website. The search itself already works, but we now want to implement a modification. Currently a * is appended to all terms, which leads Lucene to perform what I would classify as a…
ntziolis
18 votes, 1 answer

What makes a good autowarming query in Solr and how do they work?

This question is a follow-up to this question about infrequent, isolated read timeouts in a Solr installation. Missing or bad autowarming queries for new searchers were identified as a possible problem. Now I am confused about how good autowarming queries…
The Surrican
18 votes, 1 answer

Slow index speed of Elasticsearch

We deployed ES 2.0 on 3 EC2 c4.4xlarge (16 cores, 32 GB memory) nodes, allocating 16 GB for ES, attached 500 GB with io1/4000 IOPS on each. Problem: We are expecting great performance from this hardware config, however a very slow indexing speed is…
PeiSong Xiong
18 votes, 2 answers

How to boost exact match over multi match in elastic search

I am running the following query to boost exact match over multi_match in Elasticsearch, but I am not getting the expected results. My goal is to boost in the following order: "java developer" > java AND developer > java OR developer Can someone help in…
Vishal Sharma