
I have a slight problem with phrase queries in Solr 4.0.

I have a field called "idx_text_general_ci", which is a case-insensitive (all lowercased) field made up of all of my other fields.

When I search for a phrase (marine fitter), Solr does not search for the phrase. Instead it splits the phrase into two words:

/select?defType=edismax&q=idx_text_general_ci:marine%20fitter&debugQuery=true

debugQuery=true output below:

<lst name="debug">
<str name="rawquerystring">idx_text_general_ci:marine fitter</str>
<str name="querystring">idx_text_general_ci:marine fitter</str>
<str name="parsedquery">
(+(idx_text_general_ci:marine DisjunctionMaxQuery((id:fitter))))/no_coord
</str>
<str name="parsedquery_toString">+(idx_text_general_ci:marine (id:fitter))</str>

As you can see above, it splits the query into two parts (idx_text_general_ci:marine, then id:fitter).

The problem is that I have a document with an exact match for "marine fitter" appearing twice in the idx_text_general_ci field, yet it is ranked with a lower score than a document in which the word "marine" appears three times. I believe this would not happen if Solr searched the field for the phrase as expected.

If I wrap the phrase in quotes, I get zero results.

Any help or a nudge in the right direction would be much appreciated.

Thanks in advance

Alex

1 Answer


What's happening here is that your default query field appears to be id, and because you're specifying your query as

idx_text_general_ci:marine fitter

Solr parses it as idx_text_general_ci:marine plus a DisjunctionMaxQuery on id:fitter, because the field prefix applies only to the term immediately after it and the second word falls back to the default field. Presumably, you want idx_text_general_ci:marine and idx_text_general_ci:fitter. You have two options: 1) prefix each word with the correct field followed by a colon, or 2) change the defaultSearchField in schema.xml to idx_text_general_ci.
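To make option 1 concrete, here is a minimal Python sketch of how the query string could be built and URL-encoded. The helper names (`fielded_terms`, `fielded_phrase`, `select_url`) are made up for illustration and are not part of Solr. The field name and the `/select` parameters are taken from the question. The quoted form `field:"a b"` is the usual Lucene phrase syntax, shown only for comparison, since the question reports that quoting returned no results.

```python
from urllib.parse import urlencode

FIELD = "idx_text_general_ci"  # field name taken from the question


def fielded_terms(field, text):
    """Prefix every whitespace-separated term with the field name."""
    return " ".join(f"{field}:{term}" for term in text.split())


def fielded_phrase(field, text):
    """Wrap the whole text in double quotes so it is sent as one phrase."""
    return f'{field}:"{text}"'


def select_url(q, base="/select"):
    """Build a /select URL (hypothetical helper).

    urlencode escapes ':' and '"' and turns spaces into '+'.
    """
    params = {"defType": "edismax", "q": q, "debugQuery": "true"}
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    print(select_url(fielded_terms(FIELD, "marine fitter")))
    print(select_url(fielded_phrase(FIELD, "marine fitter")))
```

Comparing the parsedquery debug output for both URLs should show whether every term now lands on idx_text_general_ci instead of id.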

I'm baffled as to why you get zero results when you wrap it in double quotes though. But doing the above should help you.

Ansari
  • Hi. I have just tried that and it doesn't work either. The queries I tried are q=idx_text_general_ci:marine+idx_text_general_ci:fitter and q=idx_text_general_ci:marine%20idx_text_general_ci:fitter. Are these the correct way to do it? – Alex Jul 06 '12 at 09:28
  • Can you try this in the Solr admin panel? It's much easier to debug there. If you're typing directly into the browser bar, use a `+` sign not a space. – Ansari Jul 06 '12 at 09:32
  • @Alex In the admin panel you can use the Analyze tab to enter a part of the document and see how that's indexed, and then enter a query and see how that's processed. Check the Debug box and you should be able to figure out where the mismatch is. – Ansari Jul 06 '12 at 09:40
  • This was all from inside the admin panel (which just sends the same query I send to my query page, /select?....). Either way, as you can see, I used a space in one of the queries and that did not work. – Alex Jul 06 '12 at 09:43
  • Then you need to see how it's getting indexed :) maybe it's going awry in the stemming or something. Use the Analyze tab to make sure both the index and query processing match up. – Ansari Jul 06 '12 at 09:45
  • I analyzed the query and the result is shown in a table, which suggests to me that it is broken up into words rather than kept as a sentence/phrase. Is this normal? – Alex Jul 06 '12 at 09:46
  • Yes that should be fine. Do the index and query breakdowns match? – Ansari Jul 06 '12 at 09:58
  • Yes, the index is broken into table cells (in the output) by space and the same with the query (in the output) – Alex Jul 06 '12 at 10:05
  • Hrrm I don't know what else to try with long-distance debugging sorry :( – Ansari Jul 06 '12 at 10:09
  • I don't see why phrases in quotes don't work. I think this is the main problem. – Alex Jul 06 '12 at 10:17
  • They may work - my phrases in quotes get split up on that interface as well. Why don't you do this: Go to the admin panel, and click on Full Interface. Check the debug enable box, enter your query and see what it's finally being queried as within Solr. That should show you what's happening internally. For me it keeps the quotes. – Ansari Jul 06 '12 at 10:33
  • I did that and it does keep the quotes. I have now re-indexed with autoGeneratePhraseQueries="true" on my field and will see if that makes any difference – Alex Jul 06 '12 at 10:39
  • Did you ever figure this one out? – Ansari Jul 18 '12 at 05:50