I am having trouble creating a query that matches only whole phrases but also allows wildcards.
Basically, I have a field that contains a string (it is actually a list of strings, but I am skipping that for simplicity), which can contain white space or be null. Let's call it "color".
For example:
{
...
"color": "Dull carmine pink"
...
}
My queries need to be able to do the following:
- search for null values (inclusive and exclusive)
- search for non null values (inclusive and exclusive)
- search for and match only a whole phrase (inclusive and exclusive). For example:
  - dull carmine pink --> match
  - carmine pink --> not a match
- same as the last, but with wildcards (inclusive and exclusive). For example:
  - ?ull carmine p* --> match to "Dull carmine pink"
  - dull carmine* --> match to "Dull carmine pink"
  - etc.
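To make the expected behavior concrete, here is a rough Python sketch of the matching rules listed above. It is only a reference for the semantics, not an Elasticsearch query. The helper name phrase_matches is made up for illustration, and it assumes case-insensitive matching where ? stands for one character and * for any run of characters (including spaces, which may be broader than what a per-term wildcard does).

```python
from fnmatch import fnmatchcase


def phrase_matches(pattern, value):
    """Return True if the WHOLE value matches the pattern, ignoring case.

    Hypothetical helper: '?' matches one character, '*' matches any run of
    characters. A null (None) value never matches a phrase pattern.
    """
    if value is None:
        return False
    # fnmatchcase anchors the pattern to the full string, so a partial
    # phrase such as "carmine pink" does not match "dull carmine pink".
    return fnmatchcase(value.lower(), pattern.lower())


color = "Dull carmine pink"
print(phrase_matches("dull carmine pink", color))
print(phrase_matches("carmine pink", color))
print(phrase_matches("?ull carmine p*", color))
print(phrase_matches("dull carmine*", color))
```

The "exclusive" variant of each search is simply the negation of this result.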
I have been banging my head against the wall for a few days with this, and I have tried almost every type of query I could think of.
I have only managed to make it work partially with a span_near query with the help of this topic.
So basically I can now:
search for a whole phrase with/without wildcards like this:
{ "span_near": { "clauses": [ { "span_term": {"color": "dull"} }, { "span_term": {"color": "carmine"} }, { "span_multi": {"match": {"wildcard": {"color": "p*"}}} } ], "slop": 0, "in_order": true } }
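For reference, this is roughly how I build that query from a search phrase. It is a minimal Python sketch: the function name build_span_near is my own, and it assumes the field is analyzed so that terms are lowercased and split on white space, which is why the phrase is lowercased and split the same way here.

```python
def build_span_near(field, phrase):
    """Build a span_near query dict for a phrase that may contain wildcards.

    Hypothetical helper. Assumes the indexed terms are lowercase and
    whitespace-delimited, so each word of the phrase maps to one term.
    """
    tokens = phrase.lower().split()
    if not tokens:
        raise ValueError("phrase must contain at least one word")

    clauses = []
    for token in tokens:
        if "*" in token or "?" in token:
            # Wildcard words are wrapped in span_multi so they can take
            # part in a span query.
            clauses.append({"span_multi": {"match": {"wildcard": {field: token}}}})
        else:
            clauses.append({"span_term": {field: token}})

    # slop 0 and in_order true keep the words adjacent and in sequence.
    return {"span_near": {"clauses": clauses, "slop": 0, "in_order": True}}


query = build_span_near("color", "Dull carmine p*")
```

Serializing the result with json.dumps gives the same structure as the query shown above.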
search for null values (inclusive and exclusive) with simple must / must_not clauses in a bool query, like this:
{ "must" / "must_not": {"exists": {"field": "color"}} }
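Spelled out, the null / non-null check looks roughly like this in Python. The helper name field_presence_query is made up for illustration; it only wraps the exists query in a bool query under either must or must_not.

```python
def field_presence_query(field, present):
    """Build a bool query that keeps documents with (or without) a value.

    Hypothetical helper: present=True selects documents where the field
    has a value, present=False selects documents where it is null or missing.
    """
    occur = "must" if present else "must_not"
    return {"bool": {occur: [{"exists": {"field": field}}]}}


non_null_colors = field_presence_query("color", present=True)
null_colors = field_presence_query("color", present=False)
```

A bool query that contains only must_not clauses returns every document except the excluded ones, which is the behavior I want for the null search.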
The problem: I cannot find a way to make an exclusive span query. The only option I have found is span_not, but it requires both an include and an exclude clause, and I only want to exclude some matches; everything else must be returned. Is there some analog of the "match_all": {} query that can work inside a span_not's include clause? Or perhaps an entirely new, more elegant solution?