I want to use Lucene to index documents together with a large number of weighted tags, where each weight is the probability that the tag is true for that document. These fields would all be called 'tag', so that a search can target the tags and return the documents with matching tags, ranked so that the highest probabilities come first.
The code below is only meant to illustrate what I would like to do.
However, field boosting in Lucene is meant to apply to the indexed field as a whole (by name and type), not to each individual instance added to a document. This means that the approach below does not work, and I would need to give every field a unique name in order to boost each one separately.
I know that unique field names would be a very bad solution, so I wonder if somebody here knows a better way to do this. I would need a way to a) store the probabilities and b) use them in the retrieval process.
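import java.io.IOException;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.IndexWriter;

private void indexDocuments(IndexWriter writer) throws IOException {
    // Document A: a picture that most likely shows a dog
    Document docA = new Document();
    Field pathFieldA = new StringField("path", "dog.jpg", Field.Store.YES);
    docA.add(pathFieldA);

    // add all tags to the index, each with its own probability
    // (setBoost per instance is the part that does not work as intended)
    StringField tagA1 = new StringField("tag", "dog", Field.Store.YES);
    tagA1.setBoost(0.8f);
    docA.add(tagA1);
    StringField tagA2 = new StringField("tag", "cat", Field.Store.YES);
    tagA2.setBoost(0.2f);
    docA.add(tagA2);

    // Document B: a picture that most likely shows a cat
    Document docB = new Document();
    Field pathFieldB = new StringField("path", "cat.jpg", Field.Store.YES);
    docB.add(pathFieldB);

    // add all tags to the index, each with its own probability
    StringField tagB1 = new StringField("tag", "dog", Field.Store.YES);
    tagB1.setBoost(0.2f);
    docB.add(tagB1);
    StringField tagB2 = new StringField("tag", "cat", Field.Store.YES);
    tagB2.setBoost(0.8f);
    docB.add(tagB2);

    // index both documents
    writer.addDocument(docA);
    writer.addDocument(docB);
}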
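To make the desired behaviour concrete, here is a minimal plain-Java sketch of the ranking I am after. It does not use Lucene at all: WeightedTagIndex, Posting, addTag and search are hypothetical names made up for this illustration, and the in-memory map only stands in for whatever index structure would really hold the probabilities.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, NOT part of Lucene: maps each tag to its weighted postings.
class WeightedTagIndex {

    // One posting: a document path and the probability that the tag applies to it.
    static final class Posting {
        final String path;
        final float probability;

        Posting(String path, float probability) {
            this.path = path;
            this.probability = probability;
        }
    }

    private final Map<String, List<Posting>> postingsByTag = new HashMap<>();

    // Requirement a): store the probability next to the (tag, document) pair.
    void addTag(String path, String tag, float probability) {
        postingsByTag.computeIfAbsent(tag, k -> new ArrayList<>())
                     .add(new Posting(path, probability));
    }

    // Requirement b): use the stored probability as the ranking score.
    List<String> search(String tag) {
        List<Posting> hits =
                new ArrayList<>(postingsByTag.getOrDefault(tag, new ArrayList<>()));
        // Highest probability first.
        hits.sort(Comparator.comparingDouble((Posting p) -> p.probability).reversed());
        List<String> paths = new ArrayList<>();
        for (Posting hit : hits) {
            paths.add(hit.path);
        }
        return paths;
    }

    public static void main(String[] args) {
        WeightedTagIndex index = new WeightedTagIndex();
        index.addTag("dog.jpg", "dog", 0.8f);
        index.addTag("dog.jpg", "cat", 0.2f);
        index.addTag("cat.jpg", "dog", 0.2f);
        index.addTag("cat.jpg", "cat", 0.8f);

        System.out.println(index.search("dog")); // [dog.jpg, cat.jpg]
        System.out.println(index.search("cat")); // [cat.jpg, dog.jpg]
    }
}
```

A search for 'dog' should return dog.jpg before cat.jpg, and a search for 'cat' the other way round. What I am looking for is a way to get the same per-tag, per-document weighting out of a Lucene index.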