I'm seeing some curious effects when running fuzzy queries inside bool queries in filter vs. query context. I'm on Elasticsearch 6.0.0.
I have an index whose documents have a field firstName. If I run the following, for example:
{
  "query": {
    "fuzzy": {
      "firstName": {
        "value": "yvonne",
        "fuzziness": 1
      }
    }
  }
}
I get 5596 hits. Now if I stick the fuzzy term inside a bool must clause:
{
  "query": {
    "bool": {
      "must": [
        {
          "fuzzy": {
            "firstName": {
              "value": "yvonne",
              "fuzziness": 1
            }
          }
        }
      ]
    }
  }
}
I still get 5596. And if I change the must to a filter clause:
{
  "query": {
    "bool": {
      "filter": [
        {
          "fuzzy": {
            "firstName": {
              "value": "yvonne",
              "fuzziness": 1
            }
          }
        }
      ]
    }
  }
}
Same, 5596 again. Unsurprising, right?
Let's change fuzziness to 2 instead of 1. Running the simple fuzzy term query again:
{
  "query": {
    "fuzzy": {
      "firstName": {
        "value": "yvonne",
        "fuzziness": 2
      }
    }
  }
}
Now I get 6079 hits. A larger edit distance should match more documents, so that seems reasonable. Now I'll stick that inside a bool query as a must clause again:
{
  "query": {
    "bool": {
      "must": [
        {
          "fuzzy": {
            "firstName": {
              "value": "yvonne",
              "fuzziness": 2
            }
          }
        }
      ]
    }
  }
}
Still 6079. Now change the must clause to a filter:
{
  "query": {
    "bool": {
      "filter": [
        {
          "fuzzy": {
            "firstName": {
              "value": "yvonne",
              "fuzziness": 2
            }
          }
        }
      ]
    }
  }
}
This returns 7980 hits.
As I understand it, the sole difference between must and filter clauses in a bool query is whether hits are scored or not. But that doesn't seem to hold here: with fuzziness 2, the same fuzzy query matches 7980 documents in filter context versus 6079 in query context, so the filter version is less selective. What am I missing? What could be causing this?
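In case anyone wants to reproduce this, here is a minimal Python sketch that generates the three request bodies from a single fuzzy clause, so the only thing that varies between them is the wrapping context. The helper names `fuzzy_clause` and `build_variants` are my own (hypothetical, not part of any Elasticsearch client), and the sketch only builds and prints the JSON; it does not talk to a cluster.

```python
import json


def fuzzy_clause(field, value, fuzziness):
    """Build the bare fuzzy clause used in every variant (hypothetical helper)."""
    return {"fuzzy": {field: {"value": value, "fuzziness": fuzziness}}}


def build_variants(field, value, fuzziness):
    """Return the three request bodies from the question, keyed by context.

    The same clause object is reused, so the bodies differ only in how
    the fuzzy clause is wrapped: not at all, in bool/must, or in bool/filter.
    """
    clause = fuzzy_clause(field, value, fuzziness)
    return {
        "plain": {"query": clause},
        "must": {"query": {"bool": {"must": [clause]}}},
        "filter": {"query": {"bool": {"filter": [clause]}}},
    }


if __name__ == "__main__":
    # Print all six bodies (fuzziness 1 and 2, three contexts each).
    for fuzziness in (1, 2):
        variants = build_variants("firstName", "yvonne", fuzziness)
        for name, body in variants.items():
            print(fuzziness, name, json.dumps(body))
```

Each printed body can be sent to the index's `_search` endpoint as-is to compare the hit totals for the three contexts at each fuzziness value.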