Here's the solution I ended up with, based on Andrei's answer and expanded to support multiple search terms and additional scoring based on the length of the first word in the result:
First, define the following custom analyzer (it keeps the entire string as a single token and lowercases it):

    "raw_analyzer": {
        "type": "custom",
        "filter": [
            "lowercase"
        ],
        "tokenizer": "keyword"
    }

Second, define your search field mapping like so (mine's named "name"):

    "name": {
        "type": "string",
        "analyzer": "english",
        "fields": {
            "raw": {
                "type": "string",
                "index_analyzer": "raw_analyzer",
                "search_analyzer": "standard"
            }
        }
    },
    "_nameFirstWordLength": {
        "type": "long"
    }

Third, when populating the index, set the _nameFirstWordLength field using the following logic (mine's in C#):

    _nameFirstWordLength = fi.Name.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries)[0].Length

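One thing to watch with the one-liner above: it throws if the name is empty or contains only spaces, because there is no element [0]. Here is a minimal sketch of a guarded version. The NameFields class and FirstWordLength method are hypothetical names of mine, and falling back to 1 for a name with no words is my own assumption, chosen so that the 100/length script never divides by zero.

```csharp
using System;

public static class NameFields
{
    // Returns the length of the first space-separated word in the name.
    // Assumption: a null, empty, or all-space name yields 1 rather than 0,
    // so a script computing 100/length cannot divide by zero.
    public static long FirstWordLength(string name)
    {
        var words = (name ?? string.Empty)
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return words.Length == 0 ? 1 : words[0].Length;
    }
}
```

With this in place the assignment would read `_nameFirstWordLength = NameFields.FirstWordLength(fi.Name)`. For "Apple" it gives 5 and for "Applebee's Grill" it gives 10, so 100/length comes out higher for the shorter first word.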
Finally, do your search as follows:

    {
        "query": {
            "bool": {
                "must": {
                    "match_phrase_prefix": {
                        "name": {
                            "query": "apple"
                        }
                    }
                },
                "should": {
                    "function_score": {
                        "query": {
                            "query_string": {
                                "fields": [
                                    "name.raw"
                                ],
                                "query": "apple*"
                            }
                        },
                        "script_score": {
                            "script": "100/doc['_nameFirstWordLength'].value"
                        },
                        "boost_mode": "replace"
                    }
                }
            }
        }
    }

I'm using match_phrase_prefix so that partial matches are supported, such as "ap" matching "apple". The bool must/should, with its second query_string query against name.raw, gives a higher score to results whose name starts with one of the search terms (in my code I pre-process the search string, just for that second query, to add a "*" after every word). Finally, wrapping that second query in a function_score whose script uses the value of _nameFirstWordLength, with boost_mode set to "replace" so that the script's result is used as that clause's score, causes the results up-scored by the second query to be further sorted by the length of their first word (so that Apple shows before Applebee's, for example).
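For the pre-processing step, here is a rough sketch of how the "*" could be added and the request body assembled. It is not my exact production code: the NameSearch class and both method names are hypothetical, it uses System.Text.Json (an assumption; any JSON serializer would do), and it assumes the search text contains no query_string special characters such as ":" or "(", which would need escaping first.

```csharp
using System;
using System.Linq;
using System.Text.Json;

public static class NameSearch
{
    // Appends "*" to every space-separated term, e.g. "green apple" -> "green* apple*".
    // Assumption: the input contains no query_string special characters.
    public static string AppendWildcards(string searchText)
    {
        var terms = (searchText ?? string.Empty)
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" ", terms.Select(t => t + "*"));
    }

    // Builds the same bool/function_score body as the query shown in this answer.
    // The raw text goes to match_phrase_prefix; only the query_string clause
    // receives the wildcard version.
    public static string BuildQuery(string searchText)
    {
        var body = new
        {
            query = new
            {
                @bool = new
                {
                    must = new
                    {
                        match_phrase_prefix = new
                        {
                            name = new { query = searchText }
                        }
                    },
                    should = new
                    {
                        function_score = new
                        {
                            query = new
                            {
                                query_string = new
                                {
                                    fields = new[] { "name.raw" },
                                    query = AppendWildcards(searchText)
                                }
                            },
                            script_score = new
                            {
                                script = "100/doc['_nameFirstWordLength'].value"
                            },
                            boost_mode = "replace"
                        }
                    }
                }
            }
        };
        return JsonSerializer.Serialize(body);
    }
}
```

Calling `NameSearch.BuildQuery("green apple")` sends "green apple" to the match_phrase_prefix clause and "green* apple*" to the query_string clause. The serializer may write the apostrophes in the script as \u0027 escapes, which is equivalent JSON.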