
I am new to Lucene and confused about the indexing part. I have a resource with an id, a name, a list of products and a list of keywords. I am making name, products and keywords searchable (storing, analyzing and tokenizing them). How is the index built from my documents? Is it something like a hash map?

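import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;

// cr is the resource being indexed; iw is an already opened IndexWriter.
Document d = new Document();
// TextField values are analyzed and tokenized; Field.Store.YES also keeps the original text.
d.add(new TextField("name", cr.getData().name, Field.Store.YES));
// Adding the same field name several times makes the field multivalued.
for (int i = 0; i < cr.getData().products.size(); i++) {
    d.add(new TextField("products", cr.getData().products.get(i), Field.Store.YES));
}
for (int i = 0; i < cr.getData().keywords.size(); i++) {
    d.add(new TextField("keywords", cr.getData().keywords.get(i), Field.Store.YES));
}
// StringField is indexed as a single token, which suits an identifier.
d.add(new StringField("id", cr.getData().id, Field.Store.YES));
iw.addDocument(d);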
user3701803

1 Answer

No, this isn't a hash map but an inverted index.

The first chapters of the Information Retrieval book should give you more details on how this works.
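
To make the idea concrete, here is a minimal sketch of an inverted index in plain Java. It is not how Lucene is implemented: the class name `TinyInvertedIndex`, its methods, the sample texts and the crude tokenizer are all made up for illustration, and Lucene's real index is per-field, segment-based and stored on disk in a far more elaborate format. The sketch only shows the core mapping from a term to the list of documents that contain it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy inverted index (hypothetical class, not part of Lucene):
// each term maps to the sorted set of document numbers that contain it.
public class TinyInvertedIndex {
    // term -> document numbers (the "postings list" for that term)
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();
    // original text per document, comparable to a stored field
    private final List<String> stored = new ArrayList<>();

    // Stand-in for an analyzer: lowercase, then split on anything
    // that is not an ASCII letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Adds a document and returns the number assigned to it.
    public int addDocument(String text) {
        int docId = stored.size();
        stored.add(text);
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    // Looks up a single term; returns an empty set when nothing matches.
    public SortedSet<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), new TreeSet<>());
    }

    // Returns the original text that was stored for a document number.
    public String getStored(int docId) {
        return stored.get(docId);
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.addDocument("Red running shoes");
        index.addDocument("Blue running jacket");
        index.addDocument("Red wool scarf");
        System.out.println(index.search("running")); // [0, 1]
        System.out.println(index.search("red"));     // [0, 2]
        System.out.println(index.search("green"));   // []
    }
}
```

The lookup goes from a term to documents, which is the reverse of a map keyed by document id, hence the name. Even though the toy version keeps its term dictionary in a `HashMap`, a search never scans the documents themselves; it only reads the postings for the query term.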

mindas
  • The confusion arose because I created documents, indexed them, and then received the same documents in my search results when they contained the search term. So it seems we are storing documents and retrieving the same ones in the search results. – user3701803 Nov 26 '14 at 10:23
  • If you store the document elsewhere, there is little point in storing the data in the index (except for the id). You can retrieve the matching document, read its id and get the contents from your primary source. But I am not sure I understand your problem. – mindas Nov 26 '14 at 10:27
  • Yes, the problem is that with the document structure described in the question, a considerable number of resources means the same number of documents, so when I perform a generic term search I will get exactly that many documents back. – user3701803 Nov 26 '14 at 10:45
  • And why is that a problem? You can just take the first n hits; `IndexSearcher` supports that. – mindas Nov 26 '14 at 10:58
  • Thanks, mindas. In my scenario, searching for a keyword returns the documents that have that keyword. If I want all products that have the keyword, I will have to group by products. Grouping is possible in Lucene, but can it also be done on multivalued fields? Or do I have to modify my document structure? – user3701803 Nov 27 '14 at 08:21
  • Grouping is a database term; don't think in those terms. If you want to query based on a field's presence, [this question](http://stackoverflow.com/questions/3710089/find-all-lucene-documents-having-a-certain-field) gives you an answer. P.S. If you have more questions, please open new questions and explain all the details instead of adding more and more comments. – mindas Nov 27 '14 at 11:27