How is data stored in Lucene?

Question

I know that Lucene creates an index and stores all the data in it. Can anyone tell me how the data is stored in flat files, or what kind of algorithms Lucene uses to store the data on the backend so that it can be retrieved quickly?

score 8 · Answer 1 · answered Feb 01 '12 at 18:14

I don't know if this is what you asked for, but the general answer is that Lucene implements an inverted index. The specifics of how Lucene lays it out on disk are in the file formats documentation (as milan said).

The general idea is that Lucene stores an inverted index data structure plus other auxiliary data structures that help answer queries quickly. For example, it stores a norm for each document field and each term's document frequency, from which the IDF (inverse document frequency) is derived. Lucene also stores the actual document fields, but those live outside the inverted index.
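To make the idea concrete, here is a minimal in-memory sketch in Python. It is not Lucene's on-disk format or its real scoring code: the class name `TinyInvertedIndex`, the whitespace tokenizer, and the simplified TF-IDF formula are all assumptions chosen for illustration. It only shows the three pieces mentioned above: postings (term to documents), per-document norms, and stored fields kept separately.

```python
import math
from collections import defaultdict


class TinyInvertedIndex:
    """Toy index for illustration only; names and layout are hypothetical, not Lucene's."""

    def __init__(self):
        self.postings = defaultdict(dict)  # term -> {doc_id: term frequency}
        self.norms = {}                    # doc_id -> length normalization factor
        self.stored = {}                   # doc_id -> original text, kept outside the postings

    def add(self, doc_id, text):
        tokens = text.lower().split()      # assumed tokenizer: lowercase + whitespace split
        self.stored[doc_id] = text
        self.norms[doc_id] = 1.0 / math.sqrt(len(tokens)) if tokens else 0.0
        for token in tokens:
            self.postings[token][doc_id] = self.postings[token].get(doc_id, 0) + 1

    def idf(self, term):
        # Document frequency is read off the postings; IDF is derived from it.
        df = len(self.postings.get(term, {}))
        return math.log((1 + len(self.stored)) / (1 + df)) + 1.0

    def search(self, query):
        # Only the postings of the query terms are visited, never every document.
        scores = defaultdict(float)
        for term in query.lower().split():
            for doc_id, tf in self.postings.get(term, {}).items():
                scores[doc_id] += math.sqrt(tf) * self.idf(term) * self.norms[doc_id]
        return sorted(scores, key=lambda d: (-scores[d], d))


index = TinyInvertedIndex()
index.add(1, "Lucene stores an inverted index")
index.add(2, "An inverted index maps terms to documents")
index.add(3, "Stored fields keep the original text")
print(index.search("inverted index"))  # doc 1 ranks above doc 2 because it is shorter
print(index.stored[index.search("stored")[0]])
```

The point of the structure is that a query touches only the posting lists for its own terms, which is why lookups stay fast as the collection grows. Lucene's actual files add compression, segments, and skip data on top of this basic shape.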

score 5 · Answer 2 · answered Feb 01 '12 at 11:54

You can find all of that explained in the file formats section of the Lucene documentation.

milan

score 4 · Answer 3 · answered Feb 01 '12 at 08:31

You can read this book, http://nlp.stanford.edu/IR-book/, to learn about the data structures, algorithms, and models used in information retrieval systems.

naresh

It is a good entry-level book and a good reference, but it is not very relevant to this particular problem. – linjunhalida Oct 21 '13 at 12:53

There's also another great book on information retrieval whose content is now available for free: https://ciir.cs.umass.edu/irbook/ – realjin Dec 26 '16 at 01:35
