What is the difference between the new search_as_you_type datatype in Elasticsearch and the edge_ngram tokenizer? Which one is preferable for building a search-as-you-type search engine?
The Elasticsearch documentation describes both implementations:
search_as_you_type datatype: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-as-you-type.html
tokenizer type edge_ngram: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-edgengram-tokenizer.html (Look at the example of how to set up a field for search-as-you-type.)
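To make the comparison concrete, here is my rough understanding of what each approach indexes, written as a simplified sketch in plain Java. The names NgramSketch, edgeNgrams, and shingles are hypothetical helpers of my own, not Elasticsearch APIs, and the real analyzers do more than this (for example lowercasing, token positions, and the additional prefix subfield of search_as_you_type).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified illustration only: NOT the Lucene/Elasticsearch analyzers.
class NgramSketch {
    // Prefixes of a single token, from minGram to maxGram characters,
    // roughly what an edge n-gram tokenizer emits per word.
    static List<String> edgeNgrams(String token, int minGram, int maxGram) {
        List<String> grams = new ArrayList<>();
        int limit = Math.min(maxGram, token.length());
        for (int len = minGram; len <= limit; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    // Word-level shingles of a fixed size, roughly the kind of terms
    // behind the ._2gram and ._3gram subfields of search_as_you_type.
    static List<String> shingles(List<String> tokens, int size) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i + size <= tokens.size(); i++) {
            result.add(String.join(" ", tokens.subList(i, i + size)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("quick", "brown", "fox");
        System.out.println(edgeNgrams("brown", 1, 3)); // [b, br, bro]
        System.out.println(shingles(tokens, 2));       // [quick brown, brown fox]
        System.out.println(shingles(tokens, 3));       // [quick brown fox]
    }
}
```

If this picture is right, edge_ngram matches prefixes of individual words, while search_as_you_type additionally indexes word sequences, which is why the query further down targets the ._2gram and ._3gram subfields.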
UPDATE
Elasticsearch version: 7.6.1
I indexed my data with the search_as_you_type datatype, following the latest Elasticsearch documentation, and I am trying to build a simple query via the Java API based on the example below:
GET my_index/_search
{
  "query": {
    "multi_match": {
      "query": "brown f",
      "type": "bool_prefix",
      "fields": [
        "my_field",
        "my_field._2gram",
        "my_field._3gram"
      ]
    }
  }
}
The point that I struggle with is adding "type": "bool_prefix".
A) I tried with MultiMatchQueryBuilder:
MultiMatchQueryBuilder multiMatchQueryBuilder = new MultiMatchQueryBuilder(value, fields);
multiMatchQueryBuilder.type(MatchQuery.Type.BOOLEAN_PREFIX);
and got an exception at the second line of the code above:
org.elasticsearch.ElasticsearchParseException: failed to parse [multi_match] query type [boolean_prefix]. unknown type.
B) Then I tried with MatchBoolPrefixQueryBuilder:
MatchBoolPrefixQueryBuilder matchBoolPrefixQueryBuilder = new MatchBoolPrefixQueryBuilder(value, fields);
and got an exception:
org.elasticsearch.ElasticsearchStatusException: Elasticsearch exception [type=parsing_exception, reason=[match_bool_prefix] unknown token [START_ARRAY] after [query]]
...
Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [http://localhost:9200], URI [/my_dictionary/_search?pre_filter_shard_size=128&typed_keys=true&max_concurrent_shard_requests=5&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&ignore_throttled=true&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=true], status line [HTTP/1.1 400 Bad Request]
{"error":{"root_cause":[{"type":"parsing_exception","reason":"[match_bool_prefix] unknown token [START_ARRAY] after [query]","line":1,"col":57}],"type":"parsing_exception","reason":"[match_bool_prefix] unknown token [START_ARRAY] after [query]","line":1,"col":57},"status":400}
at this line:
SearchResponse searchResponse=restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
What am I doing wrong? Which one should I use and how?
SOLUTION
I solved the issue just by changing the type to:
MultiMatchQueryBuilder multiMatchQueryBuilder = new MultiMatchQueryBuilder(value, fields);
multiMatchQueryBuilder.type("bool_prefix");
But I don't understand why the type must be hardcoded as "bool_prefix" instead of using MatchQuery.Type.BOOLEAN_PREFIX, or why it is not possible to use MatchBoolPrefixQueryBuilder. There are not many implementation examples of this query.
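A possible explanation, which I have not verified against the client source: the exception message in A) mentions [boolean_prefix], which suggests that type(Object) looks the type up by the lower-cased toString() of whatever is passed in, and that multi_match only knows the name bool_prefix. The sketch below reproduces that behavior with stand-in enums. TypeNameSketch, MatchType, MultiMatchType, and resolve are hypothetical names for illustration, not the Elasticsearch classes.

```java
import java.util.Locale;

// Hypothetical stand-in, NOT the Elasticsearch client: it only mimics a name-based lookup.
class TypeNameSketch {
    // Stand-in for MatchQuery.Type, where the constant is named BOOLEAN_PREFIX.
    enum MatchType { BOOLEAN, PHRASE, PHRASE_PREFIX, BOOLEAN_PREFIX }

    // Stand-in for the multi_match types, keyed by the names used in the REST request body.
    enum MultiMatchType {
        BEST_FIELDS("best_fields"),
        MOST_FIELDS("most_fields"),
        CROSS_FIELDS("cross_fields"),
        PHRASE("phrase"),
        PHRASE_PREFIX("phrase_prefix"),
        BOOL_PREFIX("bool_prefix");

        final String restName;

        MultiMatchType(String restName) {
            this.restName = restName;
        }

        // Look a type up by its REST name; unknown names are rejected.
        static MultiMatchType parse(String value) {
            for (MultiMatchType t : values()) {
                if (t.restName.equals(value)) {
                    return t;
                }
            }
            throw new IllegalArgumentException(
                "failed to parse [multi_match] query type [" + value + "]. unknown type.");
        }
    }

    // Assumed behavior of a setter that accepts any Object:
    // resolve it by the lower-cased result of toString().
    static MultiMatchType resolve(Object type) {
        return MultiMatchType.parse(type.toString().toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // The plain string matches the REST name.
        System.out.println(resolve("bool_prefix")); // BOOL_PREFIX
        try {
            // The enum constant's name lower-cases to "boolean_prefix", which is not a known name.
            resolve(MatchType.BOOLEAN_PREFIX);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If that reading is correct, passing MultiMatchQueryBuilder.Type.BOOL_PREFIX (assuming the 7.x client exposes that constant) should behave like the string "bool_prefix". Likewise, the START_ARRAY error in B) would be consistent with MatchBoolPrefixQueryBuilder taking a single field name first and the query value second, so that my array of fields ended up serialized as the query value.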