I have a problem with boosting when using Solr. We recently switched from Lucene to Solr.
We search against 4 primary fields: essence, keywords, allSearchable, and quality (which we sort by). For each document in the index, 'essence' contains the first 3 non-stop words in keywords, 'keywords' is just a list of keywords, and 'allSearchable' is a collection of other data for that document. In Lucene, in order to rank the search results by relevance, we ran up to 3 searches for anything a user typed into the search box, like so:
word typed into searchbox: tree
Query 1: +essence:tree
(sort by 'quality')
If Query 1 returns enough results to fill the page we want, then return them.
Query 2: +keywords:tree
(sort by 'quality')
If Query 1 and Query 2 together return enough results for the page we're on, then return the results.
Query 3: +allSearchable:tree
(sort by 'quality')
Return the results. If there aren't any, then tough luck.
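The cascade above can be sketched roughly as follows. This is only an illustration of the old approach, not our actual code: `search` is a hypothetical stand-in for the Lucene call that returns every matching document id already sorted by 'quality', and skipping ids already collected from an earlier query is an assumption.

```python
from typing import Callable, List

# Hypothetical stand-in for the old Lucene call: returns ALL matching
# document ids for a query string, already sorted by 'quality'.
SearchFn = Callable[[str], List[str]]

# Fields in priority order, as in Query 1, Query 2, Query 3.
TIER_FIELDS = ["essence", "keywords", "allSearchable"]


def cascade_page(search: SearchFn, term: str, page: int, page_size: int) -> List[str]:
    """Run the queries in priority order until the requested page is filled."""
    needed = page * page_size  # pages are 1-based
    collected: List[str] = []
    seen = set()
    for field in TIER_FIELDS:
        for doc_id in search(f"+{field}:{term}"):
            # Assumption: a document is listed once, under its best tier.
            if doc_id not in seen:
                seen.add(doc_id)
                collected.append(doc_id)
        if len(collected) >= needed:
            break  # enough results for this page; skip the later queries
    return collected[needed - page_size:needed]
```

Note that every query here pulls back its full result set, which is exactly the part that does not carry over to an index of this size.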
My problem is with pagination. With Lucene I never had to send pagination parameters (startIndex, rows). I could just ask for everything, then walk through everything I got back, collecting enough results to return for the page I was asking for. With Solr, I must pass pagination parameters. We have over 8 million documents in our index, so fetching everything that matches a query like 'tree' is far too expensive.
The problem is that if I ask for page 3 in Query 1 and don't get enough results, I must go on to Query 2 (keywords:tree). But then I am asking for page 3 of Query 2's results (in other words, give me the documents matching 'keywords:tree' that fall on page 3), and that's not the question I want to ask. I only want to start at page 1 of the keywords matches when essence has no more matches to give. And so on.
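To make the mismatch concrete, here is a rough sketch of the offset arithmetic that a correct cascade would need. It assumes the total hit count of each query is known up front and that the three result sets do not overlap (which is not true of the index as described, since the essence words also appear in keywords). The function name and the tuple layout are made up for illustration.

```python
from typing import List, Tuple


def tier_windows(tier_counts: List[int], page: int, page_size: int) -> List[Tuple[int, int, int]]:
    """Map a global 1-based page onto (tier index, start, rows) windows.

    tier_counts holds the total hits per query in priority order.
    Assumption: the tiers are disjoint, so the counts can simply be chained.
    """
    start = (page - 1) * page_size  # global offset of the first wanted result
    remaining = page_size
    windows: List[Tuple[int, int, int]] = []
    for tier, count in enumerate(tier_counts):
        if start >= count:
            start -= count  # the page begins past the end of this tier
            continue
        rows = min(remaining, count - start)
        windows.append((tier, start, rows))
        remaining -= rows
        start = 0  # any later tier is read from its beginning
        if remaining == 0:
            break
    return windows
```

With hypothetical counts of 25, 40, and 100 hits and 10 results per page, page 3 maps to the last 5 essence matches followed by the first 5 keywords matches, that is, `[(0, 20, 5), (1, 0, 5)]`, rather than to offset 20 of the keywords query.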
What I am really looking for is ONE query that would replace the three queries I ran before, such that I get back the essence matches first, the keyword matches second, and the allSearchable matches last.
I tried using boosting with this query: essence:tree^4.0 keywords:tree^2.0 allSearchable:tree^1.0
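For reference, a minimal sketch of how such a request could be assembled. The base URL is a placeholder and the helper names are made up; `q`, `start`, `rows`, `fl`, and `debugQuery` are standard Solr request parameters. Asking for `fl=*,score` together with `debugQuery=true` should return each document's score and an explanation of how the boosts contributed to it.

```python
from urllib.parse import urlencode

# Boosts from the query above, in priority order.
BOOSTS = [("essence", 4.0), ("keywords", 2.0), ("allSearchable", 1.0)]


def boosted_query(term, boosts=BOOSTS):
    """Build 'field:term^boost' clauses joined by spaces."""
    return " ".join(f"{field}:{term}^{boost}" for field, boost in boosts)


def select_url(base_url, term, page, page_size):
    """Build a /select URL for a 1-based page of the boosted query."""
    params = {
        "q": boosted_query(term),
        "start": (page - 1) * page_size,  # Solr offsets are 0-based
        "rows": page_size,
        "fl": "*,score",        # include the relevance score in each result
        "debugQuery": "true",   # ask Solr to explain how scores were computed
    }
    return f"{base_url}/select?{urlencode(params)}"


# Placeholder host and path; adjust to the actual Solr instance.
url = select_url("http://localhost:8983/solr", "tree", page=3, page_size=10)
```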
But the boosted query doesn't seem to do the trick, and I don't know why. I took out the sorts, and I still don't get back the correct results. I am using the default StandardRequestHandler, which seems to use the LuceneQueryParser (not dismax or edismax). I can see that the boosts are being sent to Solr in the URL (I set up boosting by adding a qf parameter to the defaults section of my requestHandler in solrconfig.xml). I certainly know that Lucene can understand these parameters. Can anyone tell me how I might construct one query that gets me results in the order outlined above?