2

When I use Infinispan with Hibernate, I need an analyzer so that a search finds results that include the keyword.

But when I search for the keyword SNO_NO_D6-11100 with a query like:

QueryBuilder queryBuilder = CSECore.searchManager
                  .buildQueryBuilderForClass(Hierarchy.class).get();
Query query = queryBuilder
        .keyword().onField("path").matching("SNO_NO_D6-11100").createQuery();

It separates SNO_NO_D6-11100 into SNO_NO_D6 and 11100, searches for each term separately, and merges the two result sets, so some of the results are incorrect.
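To make the behaviour concrete, here is a minimal plain-Java sketch of what seems to be happening. It does not use Hibernate Search or Lucene: `tokenize` is a hypothetical stand-in that only approximates the analyzer by breaking on anything that is not a letter, digit, or underscore, and `matchesAnyToken` is an assumed model of a keyword query that accepts a document sharing any single token with the query.

```java
import java.util.ArrayList;
import java.util.List;

public class HyphenSplitSketch {

    // Rough approximation of the analyzer (an assumption, not the real
    // implementation): break on any character that is not a letter,
    // digit, or underscore.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("[^\\p{L}\\p{N}_]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Keyword-style matching: a document matches if it shares ANY token
    // with the query, which is why partial matches show up.
    static boolean matchesAnyToken(String query, String document) {
        List<String> docTokens = tokenize(document);
        for (String token : tokenize(query)) {
            if (docTokens.contains(token)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // prints [SNO_NO_D6, 11100]
        System.out.println(tokenize("SNO_NO_D6-11100"));
        // prints true: only the first half matches, yet the document is returned
        System.out.println(matchesAnyToken("SNO_NO_D6-11100", "SNO_NO_D6-99999"));
    }
}
```

In this model the document SNO_NO_D6-99999 is returned because it shares the token SNO_NO_D6, which is the kind of incorrect result described above.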

If I skip the analyzer, it only finds exact matches, which is also incorrect. Is there a way to make the analyzer ignore the "-"?

Stephan

2 Answers

1

Try a phrase query (see section 5.1.2.4 of the Hibernate Search query DSL documentation) instead:

Query query = queryBuilder.phrase().onField("path").sentence("SNO_NO_D6-11100").createQuery();

The two terms will still be separated, but since it is a phrase query it will search for the two separate terms occurring consecutively. So it will not be able to distinguish between "SNO_NO_D6-11100" and "SNO_NO_D6 11100", but I'm guessing that is probably acceptable.
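As a rough illustration of why the phrase query is stricter, the following plain-Java sketch models phrase matching as "the query tokens must appear consecutively in the document tokens". It does not call Hibernate Search or Lucene; `tokenize` and `matchesPhrase` are hypothetical helpers, and the splitting rule is only an approximation of the analyzer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PhraseMatchSketch {

    // Approximate analyzer (an assumption): break on any character that
    // is not a letter, digit, or underscore.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("[^\\p{L}\\p{N}_]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // Phrase-style matching: all query tokens must occur in the document
    // in the same order with nothing in between.
    static boolean matchesPhrase(String query, String document) {
        return Collections.indexOfSubList(tokenize(document), tokenize(query)) >= 0;
    }

    public static void main(String[] args) {
        // prints true: the hyphen and the space are indistinguishable
        System.out.println(matchesPhrase("SNO_NO_D6-11100", "SNO_NO_D6 11100"));
        // prints false: the second term differs, so the partial match is rejected
        System.out.println(matchesPhrase("SNO_NO_D6-11100", "SNO_NO_D6-99999"));
    }
}
```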

femtoRgon
  • Thanks for the response, that is really helpful. But I'm wondering whether there is a way to write my own `Analyzer`, like I wrote my own `Custom Bridge`? – Stephan Jun 24 '14 at 09:12
  • You can set the analyzer to whatever you like, in a few different ways, see: [section 1.6: Analyzer](http://docs.jboss.org/hibernate/search/3.4/reference/en-US/html_single/#d0e392) – femtoRgon Jun 25 '14 at 15:20
0

Using Hibernate Search 5.10.3, we can override analyzers at search time:

FullTextEntityManager fte = Search.getFullTextEntityManager(em);
QueryBuilder qb = fte.getSearchFactory().buildQueryBuilder().forEntity(Article.class)
                .overridesForField("path", "keywordanalyzer")
                .get();

Here keywordanalyzer is a custom analyzer defined as follows:

@AnalyzerDef(name = "keywordanalyzer",
        tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class))

Note that KeywordTokenizerFactory does not split the input; it emits the entire input as a single token.
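The difference between the two tokenization styles can be sketched in plain Java. This is only a model, not the Lucene implementation: `keywordTokenize` and `splittingTokenize` are hypothetical helpers, and the splitting rule is an assumed approximation of an analyzer that breaks on the hyphen.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class KeywordTokenSketch {

    // Keyword-style tokenization: the whole input becomes one token.
    static List<String> keywordTokenize(String text) {
        return Collections.singletonList(text);
    }

    // Splitting tokenization (approximation, an assumption): break on any
    // character that is not a letter, digit, or underscore.
    static List<String> splittingTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("[^\\p{L}\\p{N}_]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [SNO_NO_D6-11100]
        System.out.println(keywordTokenize("SNO_NO_D6-11100"));
        // prints [SNO_NO_D6, 11100]
        System.out.println(splittingTokenize("SNO_NO_D6-11100"));
    }
}
```

Because the keyword-style query term keeps the hyphen, it can only match an indexed term that is identical to the whole value.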

Munish Chandel