One of the answers to this question does a good job of explaining how Apache Lucene works, in particular the response by Tom Taylor. Here is Tom's response:
Lucene creates a reverse (inverted) index, something like:
File 1 :
Term : Random
Frequency : 1
Position : 0
Term : Memory
Frequency : 2
Position : 3
Position : 6
So it is able to search and retrieve the searched content quickly. When there are too many matches for the search query, it orders the results by weight. Consider the search query "Main Memory": it searches for each word individually, and the result would be something like:
Main
File 1 : Frequency - 1
Memory
File 1 : Frequency - 2
File 2 : Frequency - 1
The result would be File1 followed by File2.
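For reference, the structure and ranking described in the quoted answer can be sketched in a few lines of Python. This is a toy model, not Lucene's API: the file contents are hypothetical (chosen only so the frequencies and positions match the numbers above), the function names `build_index` and `search` are my own, and summing term frequencies is a simplification of Lucene's actual scoring.

```python
from collections import defaultdict

# Hypothetical file contents, picked so the postings match the quoted example.
FILES = {
    "File 1": "random access main memory is volatile memory",
    "File 2": "virtual memory uses disk",
}

def build_index(files):
    """Build a toy inverted index: term -> {file name -> list of positions}."""
    index = defaultdict(dict)
    for name, text in files.items():
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(name, []).append(pos)
    return index

def search(index, query):
    """Rank files by summed term frequency (a simplification, not Lucene's scoring)."""
    scores = defaultdict(int)
    for term in query.lower().split():
        for name, positions in index.get(term, {}).items():
            scores[name] += len(positions)
    # Highest score first; ties broken by file name.
    return sorted(scores, key=lambda n: (-scores[n], n))

index = build_index(FILES)
print(index["memory"]["File 1"])     # -> [3, 6]
print(search(index, "Main Memory"))  # -> ['File 1', 'File 2']
```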
My question: will the above still work if I encrypt the indexed terms, such as "Random", "Main" and "Memory", into ciphertext? By "still work", I mean: will the search results still be File 1 followed by File 2 if I search for the ciphertext of "Main" and "Memory"?
In essence, I am asking if it is possible to encrypt the entire Lucene index and use it to perform searches on encrypted queries.
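To make the idea concrete, here is a minimal sketch of what I have in mind, again as a toy model in Python and not Lucene code. It assumes a deterministic transformation of each term, so that the same word always maps to the same token; I use a keyed HMAC-SHA256 as a stand-in for deterministic encryption (it is a keyed hash, so it cannot be decrypted). The key, the file contents, and the function names are all hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical key, for illustration only

def token(term: str) -> str:
    """Deterministic keyed token for a term (stand-in for deterministic encryption)."""
    return hmac.new(SECRET_KEY, term.lower().encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical file contents, matching the frequencies in the example above.
FILES = {
    "File 1": "random access main memory is volatile memory",
    "File 2": "virtual memory uses disk",
}

def build_protected_index(files):
    """Toy inverted index keyed by term tokens instead of plaintext terms."""
    index = defaultdict(dict)
    for name, text in files.items():
        for pos, term in enumerate(text.split()):
            index[token(term)].setdefault(name, []).append(pos)
    return index

def search(index, query_tokens):
    """Rank files by summed frequency of the matching tokens."""
    scores = defaultdict(int)
    for t in query_tokens:
        for name, positions in index.get(t, {}).items():
            scores[name] += len(positions)
    return sorted(scores, key=lambda n: (-scores[n], n))

index = build_protected_index(FILES)
query = [token(word) for word in "Main Memory".split()]
print(search(index, query))  # -> ['File 1', 'File 2']
```

In this toy model the exact-match lookup gives the same ranking, because the query terms and the indexed terms go through the same transformation. The frequencies and positions are still stored in the clear, though, and I do not know whether this carries over to a real Lucene index, which is what I am asking.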