I have built an index using a StandardAnalyzer. The index has a few fields; for example purposes, imagine it has Id and Type. Both are NOT_ANALYZED, meaning you can only search for them as-is.

There are a few entries in my index:

 {Id: "1", Type: "Location"},
 {Id: "2", Type: "Group"},
 {Id: "3", Type: "Location"}

When I search for +Id:1 or any other number, I get the appropriate result (again using StandardAnalyzer).

However, when I search for +Type:Location or +Type:Group, I'm not getting any results. The strange thing is that when I enable leading wildcards, +Type:*ocation does return results! +Type:*Location and other combinations do not.

This led me to believe the indexer/query doesn't like uppercase characters! After lowercasing the Type to location and group before indexing them, I could search for them as such.

If I set the Type field to ANALYZED, it works with pretty much any search (uppercase, lowercase, etc.), but I want to query the Type field as-is.

I'm completely baffled why it's doing this. Could anyone explain to me why my indexer doesn't let me search for NOT_ANALYZED fields that have a capital letter in their value?

Lennard Fonteijn

1 Answer

Are you using StandardAnalyzer when parsing your query string (+Type:Location)? The StandardAnalyzer will lower-case all terms, so you're really searching with +Type:location.
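To make the mismatch concrete, here is a toy model in plain Java. It does not use Lucene at all: `lowercasingAnalyze` is a hypothetical stand-in for the lowercasing step of StandardAnalyzer, and the set of strings is a stand-in for the NOT_ANALYZED terms stored in the index. It is only a sketch of the mechanism, under those assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class AnalyzerMismatchDemo {
    // Hypothetical stand-in for the query-time analyzer: it only lowercases,
    // which is the part of StandardAnalyzer that matters here.
    public static String lowercasingAnalyze(String text) {
        return text.toLowerCase(Locale.ROOT);
    }

    // Exact-term lookup, modelling how a term is matched against
    // values that were indexed as-is (NOT_ANALYZED).
    public static boolean matches(Set<String> indexedTerms, String queryTerm) {
        return indexedTerms.contains(queryTerm);
    }

    public static void main(String[] args) {
        // Values stored untouched at index time.
        Set<String> typeTerms = new HashSet<>(Arrays.asList("Location", "Group"));
        Set<String> idTerms = new HashSet<>(Arrays.asList("1", "2", "3"));

        // The query term is lowercased before lookup, so it no longer matches.
        System.out.println(matches(typeTerms, lowercasingAnalyze("Location"))); // false

        // Digits are unaffected by lowercasing, which is why +Id:1 works.
        System.out.println(matches(idTerms, lowercasingAnalyze("1"))); // true

        // Looking up the term as-is would have matched.
        System.out.println(matches(typeTerms, "Location")); // true
    }
}
```

This also explains why lowercasing the values before indexing made the search work: the stored term then equals what the analyzer produces at query time.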

Always use the same analyzer when searching and indexing. Look into using the PerFieldAnalyzerWrapper and set the Type field to use the KeywordAnalyzer, which treats the whole value as a single term and leaves it unchanged.
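The per-field idea can be sketched the same way. The class below is a plain-Java toy, not Lucene's API: `PerFieldAnalysisDemo`, `register`, `LOWERCASING`, and `KEYWORD` are hypothetical names, standing in for a wrapper that picks an analyzer by field name, a lowercasing default, and a keyword analyzer that passes the value through untouched.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

public class PerFieldAnalysisDemo {
    // Hypothetical stand-ins: a lowercasing default and a pass-through "keyword" analyzer.
    public static final UnaryOperator<String> LOWERCASING = s -> s.toLowerCase(Locale.ROOT);
    public static final UnaryOperator<String> KEYWORD = UnaryOperator.identity();

    private final UnaryOperator<String> defaultAnalyzer;
    private final Map<String, UnaryOperator<String>> perField = new HashMap<>();

    public PerFieldAnalysisDemo(UnaryOperator<String> defaultAnalyzer) {
        this.defaultAnalyzer = defaultAnalyzer;
    }

    // Override the analyzer for one field; returns this so calls can be chained.
    public PerFieldAnalysisDemo register(String field, UnaryOperator<String> analyzer) {
        perField.put(field, analyzer);
        return this;
    }

    // Use the field-specific analyzer if one is registered, else the default.
    public String analyze(String field, String text) {
        return perField.getOrDefault(field, defaultAnalyzer).apply(text);
    }

    public static void main(String[] args) {
        PerFieldAnalysisDemo analyzer =
                new PerFieldAnalysisDemo(LOWERCASING).register("Type", KEYWORD);

        // Type keeps its case, so the query term equals the stored value.
        System.out.println(analyzer.analyze("Type", "Location")); // Location

        // Any other field still goes through the lowercasing default.
        System.out.println(analyzer.analyze("Body", "Hello")); // hello
    }
}
```

In a real setup the same wrapper would be handed to both the index writer and the query parser, so that each field is treated identically on both sides.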

sisve
  • I've used StandardAnalyzer for both the Index and the QueryParser though, that's the weird part. Doesn't the QueryParser look at the Index itself to determine whether or not it should actually use the StandardAnalyzer? I understood NOT_ANALYZED means the analysing part is skipped during indexing, and I assumed that when I query for a field with that set, it does the same again. But apparently, it doesn't. – Lennard Fonteijn Jan 18 '16 at 22:23
  • When you think about it; how could the QueryParser know anything about the index without having an IndexReader? – sisve Jan 19 '16 at 05:37
  • Very true... How can I tell my query-parser that one field is NOT_ANALYZED and should treat it as such, if at all possible? – Lennard Fonteijn Jan 19 '16 at 08:32
  • Better yet, how can I tell the QueryParser to treat a certain notation as a TermQuery - basically search as-is? I tried using "[Location]", since that seemed the most sensible, but it doesn't understand that. – Lennard Fonteijn Jan 19 '16 at 08:54
  • Have you tried using the PerFieldAnalyzer and using KeywordAnalyzer for the Type field? – sisve Jan 19 '16 at 11:46
  • That works, yes, and is probably the way to go since I don't feel like writing my own query parser :P Was just wondering if the QueryParser might have a notation to search for something as-is, unanalyzed. – Lennard Fonteijn Jan 19 '16 at 13:33