I'm using Hibernate Search with Lucene and would like to be able to add an AnalyzerDef based on a Filter name (fetching, e.g., this) that is provided in a config file, loaded when the application starts.

Right now I've got code like

@AnalyzerDefs({
        @AnalyzerDef(name = "phraseAnalyzer",
                tokenizer = @TokenizerDef(factory = KeywordTokenizerFactory.class),
                filters = {
                        @TokenFilterDef(factory = ClassicFilterFactory.class),
                        @TokenFilterDef(factory = LowerCaseFilterFactory.class)
                }),
})
@MappedSuperclass
public abstract class MyObject {

I looked at the docs and it didn't pop out at me how I would do it.

I suspect this question might be related.

brandones
  • If the linked question is indeed related, the ticket https://hibernate.atlassian.net/browse/HSEARCH-2518 claims that this should be possible in Hibernate Search 6.0. – brandones Oct 15 '19 at 23:34
  • You can give a name to an analyzer instance in Search 6, but you can't use the `AnalyzerDiscriminator`. There would be ways around that, but first: what is the need exactly? When you say "provided at runtime", do you mean "when starting the application" or "for each HTTP request" or "for each entity instance"? Do you need to change the analyzer when querying, or when indexing? – yrodiere Oct 16 '19 at 06:36
  • When starting the application. I would like to specify the Filter[Factory] or Analyzer in a configuration file. I'll update the question. I assume a reindex would be needed after changing the analyzer? – brandones Oct 16 '19 at 14:09
  • Answered below. And yes, you will need to reindex when you change the analyzers, regardless of how you do it (you would need to reindex even if you changed the annotations). – yrodiere Oct 16 '19 at 14:39

1 Answer

In Hibernate Search 5.11 (and since 5.6 or 5.7, IIRC), you can define analyzers programmatically using a LuceneAnalysisDefinitionProvider.

Implement the interface:

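import org.apache.lucene.analysis.core.KeywordTokenizerFactory;
import org.apache.lucene.analysis.core.LowerCaseFilterFactory;
import org.apache.lucene.analysis.core.StopFilterFactory;
import org.apache.lucene.analysis.standard.ClassicFilterFactory;
import org.hibernate.search.analyzer.definition.LuceneAnalysisDefinitionProvider;
import org.hibernate.search.analyzer.definition.LuceneAnalysisDefinitionRegistryBuilder;

public class CustomAnalysisDefinitionProvider implements LuceneAnalysisDefinitionProvider {
    @Override
    public void register(LuceneAnalysisDefinitionRegistryBuilder builder) {
        builder.analyzer( "myAnalyzer" )
                .tokenizer( KeywordTokenizerFactory.class )
                .tokenFilter( ClassicFilterFactory.class )
                .tokenFilter( LowerCaseFilterFactory.class )
                .tokenFilter( StopFilterFactory.class )
                        // Parameters apply to the component declared just before them.
                        // StopFilterFactory reads its stop word list from "words".
                        .param( "words", "org/hibernate/search/test/analyzer/stoplist.properties" )
                        .param( "ignoreCase", "true" );

        // You can define multiple analyzers
        builder.analyzer( "otherAnalyzer" )
                .tokenizer( ... ) ...
    }
}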

Then tell Hibernate Search to use it:

# In hibernate.properties (or wherever you set your Hibernate Search properties)
hibernate.search.lucene.analysis_definition_provider = com.mycompany.CustomAnalysisDefinitionProvider

You're free to do whatever you want in the implementation of register, so you could read system properties or even load configuration files there. If you have a limited set of implementations, you can also override the definition provider directly when starting the JVM by setting hibernate.search.lucene.analysis_definition_provider as a system property.
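As a rough illustration of the "load configuration files" idea, here is a minimal sketch that uses only the JDK. The class name `FilterConfig`, the resource name passed to `load`, and the property key (for example `analysis.filters`) are all hypothetical and not part of Hibernate Search. It reads a comma-separated list of class names from a properties file and resolves each one to a `Class`.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper; not part of Hibernate Search or Lucene.
final class FilterConfig {

    private FilterConfig() {
    }

    // Loads a properties file from the classpath.
    // Returns empty Properties if the resource does not exist.
    static Properties load(String resource) throws IOException {
        Properties props = new Properties();
        try (InputStream in = FilterConfig.class.getClassLoader().getResourceAsStream(resource)) {
            if (in != null) {
                props.load(in);
            }
        }
        return props;
    }

    // Resolves the comma-separated class names stored under 'key'.
    // Blank entries are skipped; an unknown class name fails fast.
    static List<Class<?>> resolveClasses(Properties props, String key) {
        List<Class<?>> classes = new ArrayList<>();
        for (String name : props.getProperty(key, "").split(",")) {
            String trimmed = name.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            try {
                classes.add(Class.forName(trimmed));
            } catch (ClassNotFoundException e) {
                throw new IllegalArgumentException(
                        "Unknown class in '" + key + "': " + trimmed, e);
            }
        }
        return classes;
    }
}
```

Inside register, you could then narrow each resolved class with asSubclass( TokenFilterFactory.class ) and pass it to .tokenFilter( ... ) in a loop instead of hard-coding the factories. Failing fast on an unknown class name means a typo in the config file shows up at startup rather than at indexing time.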

See https://docs.jboss.org/hibernate/search/5.11/reference/en-US/html_single/#section-programmatic-analyzer-definition for details.

In Hibernate Search 6, the APIs are a bit different, but they follow the same core principles; see https://docs.jboss.org/hibernate/search/6.0/reference/en-US/html_single/#backend-lucene-analysis . There you can even inject Spring/CDI beans into the analysis configurer (you can't do that in Search 5, or at least Hibernate Search won't help you do it).

yrodiere