Is there a simple way to make Hibernate Search to index all its values in lower case ? Instead of the default mixed-case.
I'm using the annotation @Field. But I can't seem to be able to configure some application-level set
Is there a simple way to make Hibernate Search to index all its values in lower case ? Instead of the default mixed-case.
I'm using the annotation @Field. But I can't seem to be able to configure some application-level set
Fool that I am ! The StandardAnalyzer class is already indexing in lowercase. It's just a matter of setting the search terms in lowercase too. I was assuming the query would do that.
However, if a different analyzer were to be used, application-wide, then it can be set using the property hibernate.search.analyzer.
There are multiple way to make sort insensitive in string type field only.
1.First Way is add @Fields annotation in field/property on entity. Like
@Fields({@Field(index=Index.YES,analyze=Analyze.YES,store=Store.YES), @Field(index=Index.YES,name = "nameSort",analyzer = @Analyzer(impl=KeywordAnalyzer.class), store = Store.YES)})
private String name;
suppose you have name property with custom analyzer and sort on that. so it's not possible then you can add new Field in index with nameSort apply sort on that field. you must apply Keyword Analyzer class because that is not tokeniz field and by default apply lowercase factory class in field.
2.Second way is that you can implement your comparison class on sorting like
@Override
public FieldComparator newComparator(String field, int numHits, int sortPos, boolean reversed) throws IOException {
return new StringValComparator(numHits, field);
}
Make one class with extend FieldComparatorSource class and implement above method.
Created new Class name with StringValComparator and implements FieldComparator and implement following method
class StringValComparator extends FieldComparator {
private String[] values;
private String[] currentReaderValues;
private final String field;
private String bottom;
StringValComparator(int numHits, String field) {
values = new String[numHits];
this.field = field;
}
@Override
public int compare(int slot1, int slot2) {
final String val1 = values[slot1];
final String val2 = values[slot2];
if (val1 == null) {
if (val2 == null) {
return 0;
}
return -1;
} else if (val2 == null) {
return 1;
}
return val1.toLowerCase().compareTo(val2.toLowerCase());
}
@Override
public int compareBottom(int doc) {
final String val2 = currentReaderValues[doc];
if (bottom == null) {
if (val2 == null) {
return 0;
}
return -1;
} else if (val2 == null) {
return 1;
}
return bottom.toLowerCase().compareTo(val2.toLowerCase());
}
@Override
public void copy(int slot, int doc) {
values[slot] = currentReaderValues[doc];
}
@Override
public void setNextReader(IndexReader reader, int docBase) throws IOException {
currentReaderValues = FieldCache.DEFAULT.getStrings(reader, field);
}
@Override
public void setBottom(final int bottom) {
this.bottom = values[bottom];
}
@Override
public String value(int slot) {
return values[slot];
}
}
Apply sorting on Fields Like
new SortField("name",new StringCaseInsensitiveComparator(), true);
Lowercasing, term splitting, removing common terms and many more advanced language processing functions are applied by the Analyzer.
Usually you should process user input meant to match indexed strings with the same Analyzer used at indexing; configuring hibernate.search.analyzer sets the default (global) Analyzer, but you can customize it per index, per entity type, per field and even on different entity instances.
It is for example useful to have language specific analysis, so to process Chinese descriptions with Chinese specific routines, Italian descriptions with Italian tokenizers.
The default analyzer is ok for most use cases, and does lowercasing and splits terms on whitespace.
Consider as well that when using the Lucene Queryparser the API requests you the appropriate Analyzer.
When using the Hibernate Search QueryBuilder it attempts to apply the correct Analyzer on each field; see also http://docs.jboss.org/hibernate/search/4.1/reference/en-US/html_single/#search-query-querydsl .