
I'm trying to index some long texts with Lucene 4.7. I thought everything was working, but I've realised that my search hits are incomplete.

After a long search I found a web page that said something like "When I try to index a long text in Lucene, Lucene only indexes the first n characters to prevent stack overflows."

I want to index the full texts and I don't know how to do it. Could someone help, please? Here is my code:

    // Open the index directory on disk.
    File indexDir = new File(indexPath);
    Directory directory = FSDirectory.open(indexDir);
    // CREATE overwrites any existing index at this path.
    IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_47, analyzer);
    config.setOpenMode(IndexWriterConfig.OpenMode.CREATE);
    writer = new IndexWriter(directory, config);
    // Build one document; "text" and "title" are analyzed but not stored.
    Document doc = new Document();
    doc.add(new DoubleField("textID", textID, Field.Store.YES));
    doc.add(new TextField("text", text, Field.Store.NO));
    doc.add(new TextField("title", title, Field.Store.NO));
    doc.add(new StringField("discourse", discourse, Field.Store.YES));
    writer.addDocument(doc);
Calum
Paco
  • You need to demonstrate your problem in the code. The code you pasted is a completely standard, working example. It would also help if you could provide a reference for your claim about `first n characters`. I can provide [another reference](http://lucene.472066.n3.nabble.com/Index-one-huge-text-file-td3191605.html) claiming the opposite: indexing text files with 60K lines works perfectly. – mindas Mar 25 '14 at 09:14
  • My problem occurs when I try to index long texts, and the code I use to index them is the code I've posted. I know that it is standard code, but I posted it because I thought the problem was in the IndexWriter config. – Paco Apr 01 '14 at 14:36

0 Answers