Questions tagged [lucene]

The term "Lucene" refers to the open-source Java full-text search engine library, and also to the entire ecosystem that grew around it, including lucene.net, solr, elasticsearch and zend-search-lucene.

"Lucene" may also be used to refer to top-level projects like Nutch and Tika, which were once sub-projects of Lucene.

Use the "Lucene" tag if any of the following applies:

  • The question is about the Java library.
  • The question is about a port of the library, but would make sense to people who know the Java library (many Lucene.NET questions meet this criterion).
  • The question is so general that it doesn't apply to a specific implementation.

Basic Demo:

A basic "getting started" demo showing how to build and query an index is provided as part of the official documentation:

Basic Demo documentation (this link is for Lucene v8.7.0; newer versions may be available).

Links to the demo's source files are provided in the above documentation.

The source code is also available on GitHub.

Luke - a Lucene GUI Client:

Luke is a GUI client application for exploring your Lucene indexes. Recent versions of Luke are bundled with each Lucene binary release, which is available for download.

After downloading the binary release, unzip it and go to the luke directory. Launch the client using the provided luke.bat (Windows) or luke.sh (Unix-like systems) script.

11993 questions
34 votes, 3 answers

Indexing .PDF, .XLS, .DOC, .PPT using Lucene.NET

I've heard of Lucene.Net and I've heard of Apache Tika. The question is - how do I index these documents using C# vs Java? I think the issue is that there is no .Net equivalent of Tika which extracts relevant text from these document types. UPDATE…
dana (17,267 reputation; 6 gold, 64 silver, 88 bronze badges)
34 votes, 4 answers

How to incorporate multiple fields in QueryParser?

Dim qp1 As New QueryParser("filename", New StandardAnalyzer())
Dim qp2 As New QueryParser("filetext", New StandardAnalyzer())
I am using the 'Lucene.Net' library and have the following question. Instead of creating two separate QueryParser…
user57175 (3,284 reputation; 9 gold, 32 silver, 26 bronze badges)
34 votes, 5 answers

Lucene.Net Best Practices

What are the best practices for using Lucene.Net? Or where can I find a good Lucene.Net usage sample?
Elias Haileselassie (1,385 reputation; 1 gold, 18 silver, 26 bronze badges)
33 votes, 10 answers

Is MongoDB a valid alternative to relational db + lucene?

On a new project I need to make heavy use of Lucene for a search implementation. The search component will be a very important (and big) piece of the project. Is it valid or convenient to replace a relational database + Lucene with MongoDB? Edit: OK, I will clarify:…
Hugo (2,139 reputation; 4 gold, 22 silver, 30 bronze badges)
32 votes, 3 answers

Filename search with ElasticSearch

I want to use ElasticSearch to search filenames (not the file's content). Therefore I need to find a part of the filename (exact match, no fuzzy search). Example: I have files with the following…
Biggie (7,037 reputation; 10 gold, 33 silver, 42 bronze badges)
32 votes, 7 answers

N-gram generation from a sentence

How to generate an n-gram of a string like: String Input="This is my car."
I want to generate n-gram with this input: Input Ngram size = 3
Output should be:
This, is, my, car
This is, is my, my car
This is my, is my car
Give some idea in Java, how to…
Preetam Purbia (5,736 reputation; 3 gold, 24 silver, 26 bronze badges)
31 votes, 6 answers

How to make the Lucene QueryParser more forgiving?

I'm using Lucene.net, but I am tagging this question for both .NET and Java versions because the API is the same and I'm hoping there are solutions on both platforms. I'm sure other people have addressed this issue, but I haven't been able to find…
Winston Fassett (3,500 reputation; 3 gold, 36 silver, 29 bronze badges)
31 votes, 2 answers

Elasticsearch - How to normalize score when combining regular query and function_score?

Ideally, what I am trying to achieve is to assign weights to queries such that query1 constitutes 30% of the final score and query2 constitutes the other 70%, so to achieve the maximum score a document has to have the highest possible score on query1 and…
JohnnyM (1,273 reputation; 1 gold, 13 silver, 26 bronze badges)
31 votes, 8 answers

Full Text Searching with Rails

I've been looking into search plugins/gems for Rails. Most of the articles compare Ferret (Lucene) to Ultrasphinx or possibly Thinking Sphinx, but none of them talk about SearchLogic. Does anyone have any clues as to how that one compares? What…
Matt Grande (11,964 reputation; 6 gold, 62 silver, 89 bronze badges)
30 votes, 5 answers

not query in lucene

I need to run NOT queries on my Lucene index. Lucene currently allows NOT only when the query has two or more terms. So I can do something like: country:canada not sweden, but I can't run a query like: country:not sweden. Could you please let…
Ted Rogati (301 reputation; 1 gold, 3 silver, 3 bronze badges)
30 votes, 8 answers

How to fix: Error CREATEing SolrCore 'gettingstarted': Unable to create core

I'm getting this error when I try to create a new core in Solr.
root@ubuntu:/opt/solr# bin/solr create -c gettingstarted -n data_driven_schema_configs
Setup new core instance directory: /var/solr/data/gettingstarted
Creating new core…
JackXandar (503 reputation; 1 gold, 5 silver, 14 bronze badges)
30 votes, 2 answers

Lucene Hebrew analyzer

Does anybody know whether one exists? I've been googling this for months... Thanks
Roey (849 reputation; 2 gold, 11 silver, 20 bronze badges)
29 votes, 1 answer

Implement Lucene on Existing .NET / SQL Server stack with multiple webservers

I want to look at using Lucene for a fulltext search solution for a site that I currently manage. The site is built entirely on SQL Server 2008 / C# .NET 4 technologies. The data I'm looking to index is actually quite simple, with only a couple of…
growse (3,554 reputation; 9 gold, 43 silver, 66 bronze badges)
29 votes, 9 answers

Which are the best alternatives to Lucene?

It may run on Unix and it will be used for email searching (Dovecot, Postfix and maildir). Lucene is not a problem, I'm just analyzing some alternatives.
Rui Carneiro (5,595 reputation; 5 gold, 33 silver, 39 bronze badges)
29 votes, 7 answers

Stemming English words with Lucene

I'm processing some English texts in a Java application, and I need to stem them. For example, from the text "amenities/amenity" I need to get "amenit". The function looks like: String stemTerm(String term){ ... } I've found the Lucene Analyzer,…
Mulone (3,603 reputation; 9 gold, 47 silver, 69 bronze badges)