To search for adjacent words in nested (or non-nested) fields, the solution is the following query (see here for the answer):

{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "metadata",
            "query": {
              "bool": {
                "must": [
                  {
                    "wildcard": {
                      "metadata.text": "*antonio*"
                    }
                  },
                  {
                    "wildcard": {
                      "metadata.text": "*banderas*"
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}

This works OK. But suppose there are multiple nested fields in which to search for *antonio* *banderas* in the same way. Let's say we now have this mapping:

{
    "mappings": {
        "properties": {
            "text": {
                "type": "text"
            },
            "metadata": {
                "type": "nested",
                "properties": {
                    "text": {
                        "type": "text"
                    }
                }
            },
            "other_metadata": {
                "type": "nested",
                "properties": {
                    "text": {
                        "type": "text"
                    }
                }
            }
        }
    }
}

If I want to search for the adjacent words in both nested fields, metadata and other_metadata, shall I use match or should? I want results that match in at least one of the two paths, metadata or other_metadata, so I thought of using should and setting minimum_should_match to the number of tokens in the query (split on \s, the space character), like this:

{
    "should": [{
            "nested": {
                "path": "metadata",
                "query": {
                    "bool": {
                        "must": {
                            "wildcard": {
                                "metadata.text": "*antonio*"
                            }
                        }
                    }
                },
                "ignore_unmapped": true
            }
        },
        {
            "nested": {
                "path": "metadata",
                "query": {
                    "bool": {
                        "must": {
                            "wildcard": {
                                "metadata.text": "*banderas*"
                            }
                        }
                    }
                },
                "ignore_unmapped": true
            }
        },
        {
            "nested": {
                "path": "other_metadata",
                "query": {
                    "bool": {
                        "must": {
                            "wildcard": {
                                "other_metadata.text": "*antonio*"
                            }
                        }
                    }
                },
                "ignore_unmapped": true
            }
        },
        {
            "nested": {
                "path": "other_metadata",
                "query": {
                    "bool": {
                        "must": {
                            "wildcard": {
                                "other_metadata.text": "*banderas*"
                            }
                        }
                    }
                },
                "ignore_unmapped": true
            }
        }
    ],
    "minimum_should_match": 2
}

This seems to work, but my doubt is the following: the minimum_should_match=2 condition ensures that at least two of those four conditions match, but not that the two matching conditions relate to the same path (for example, metadata for both words *antonio* and *banderas*). If so, how can I ensure that? Using must, maybe? But how?
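
To make the doubt concrete, here is a minimal Python sketch that emulates the flat should clause locally, without Elasticsearch. The `flat_should_matches` helper and the two sample documents are hypothetical, and a plain substring test stands in for each wildcard clause.

```python
# Hypothetical local emulation of the flat should query above; no Elasticsearch involved.
FIELDS = ["metadata", "other_metadata"]
WORDS = ["antonio", "banderas"]


def flat_should_matches(doc, minimum_should_match=2):
    """Count how many of the four (field, word) clauses match, as the flat should does."""
    hits = 0
    for field in FIELDS:
        for word in WORDS:
            # A substring test stands in for the wildcard "*word*".
            if any(word in text.lower() for text in doc.get(field, [])):
                hits += 1
    return hits >= minimum_should_match


same_field = {"metadata": ["Antonio Banderas"], "other_metadata": []}
split_fields = {"metadata": ["Antonio Vivaldi"], "other_metadata": ["Banderas Street"]}

print(flat_should_matches(same_field))    # True
print(flat_should_matches(split_fields))  # True
```

Both sample documents pass, although only the first one has both words under the same path.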

loretoparisi

1 Answer

You can use a kind of sub-query, like this:

bool => should => bool => filter/must/should

{
  "query": {
    "bool": {
      "minimum_should_match": 1,
      "should": [
        {
          "bool": {
            "must": [
              {
                "nested": {
                  "ignore_unmapped": true,
                  "path": "metadata",
                  "query": {
                    "bool": {
                      "must": {
                        "wildcard": {
                          "metadata.text": "*antonio*"
                        }
                      }
                    }
                  }
                }
              },
              {
                "nested": {
                  "ignore_unmapped": true,
                  "path": "metadata",
                  "query": {
                    "bool": {
                      "must": {
                        "wildcard": {
                          "metadata.text": "*banderas*"
                        }
                      }
                    }
                  }
                }
              }
            ]
          }
        },
        {
          "bool": {
            "must": [
              {
                "nested": {
                  "ignore_unmapped": true,
                  "path": "other_metadata",
                  "query": {
                    "bool": {
                      "must": {
                        "wildcard": {
                          "other_metadata.text": "*antonio*"
                        }
                      }
                    }
                  }
                }
              },
              {
                "nested": {
                  "ignore_unmapped": true,
                  "path": "other_metadata",
                  "query": {
                    "bool": {
                      "must": {
                        "wildcard": {
                          "other_metadata.text": "*banderas*"
                        }
                      }
                    }
                  }
                }
              }
            ]
          }
        }
      ]
    }
  }
}
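
If the search text and the list of nested paths vary, the same structure can be generated instead of written by hand. The following is a minimal Python sketch under some assumptions: `build_query` is a hypothetical helper name, the search text is split on whitespace, every token is wrapped in `*...*`, and each nested path is assumed to have a `text` sub-field as in the mapping from the question. The `wildcard` is placed directly inside the `nested` query rather than inside an extra single-clause `bool`. The sketch only builds the request body as a dict; sending it to Elasticsearch is left out.

```python
import json


def build_query(text, paths, field="text"):
    """Build bool => should => bool => must, with one must-group per nested path."""
    tokens = text.split()
    groups = []
    for path in paths:
        # One nested wildcard clause per token; all of them must match for this path.
        must = [
            {
                "nested": {
                    "ignore_unmapped": True,
                    "path": path,
                    "query": {"wildcard": {f"{path}.{field}": f"*{token}*"}},
                }
            }
            for token in tokens
        ]
        groups.append({"bool": {"must": must}})
    # At least one path group has to match as a whole.
    return {"query": {"bool": {"minimum_should_match": 1, "should": groups}}}


body = build_query("antonio banderas", ["metadata", "other_metadata"])
print(json.dumps(body, indent=2))
```

Each word gets its own `nested` clause here, as in the query above, so for a given path the two words may be found in different nested objects. Putting both wildcards inside a single `nested` query, as in the first query of the question, would require them to match in the same nested object.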
ExploZe
  • Thank you! This works almost perfectly, but there is a sort of "drawback": if you search for `*marq* *marquez*` you get `gabriel garcias marquez`, apparently because `marq` is contained in `marquez`, while I would expect no results, or only results containing both words `marq marquez`. Is this due to the trailing `*`? – loretoparisi Sep 17 '21 at 17:16
  • `*` is only necessary here if you want to match more than `marq`, such as `randommarq`, `marqrandom` or `randmarqrand`. If you only want `marq`, remove the `*`. – ExploZe Sep 18 '21 at 08:47
  • Okay, you are right, I think I've got it! Basically, for some fields, let's say `text`, I have a custom `autocomplete` analyzer (see https://gist.github.com/loretoparisi/dc1cdd4dea29a83e326a81a00fae2775). I assume that combining this autocompletion analyzer with a `wildcard` query on the same field may "invalidate" the exact search with `* + term + *` or `term`? – loretoparisi Sep 20 '21 at 17:16
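
The `*marq* *marquez*` case from the comments can be reproduced locally. This is only a rough sketch: Python's `fnmatch` stands in for the Elasticsearch `wildcard` matching (both treat `*` as any run of characters), and the term list is an assumed whitespace tokenisation of `gabriel garcias marquez`, not the output of a real analyzer. The `any_term_matches` helper is hypothetical.

```python
from fnmatch import fnmatchcase

# Assumed terms for "gabriel garcias marquez" after a simple whitespace tokenisation.
terms = ["gabriel", "garcias", "marquez"]


def any_term_matches(pattern, terms):
    """Return True if at least one term matches the wildcard pattern."""
    return any(fnmatchcase(term, pattern) for term in terms)


print(any_term_matches("*marq*", terms))     # True, via "marquez"
print(any_term_matches("*marquez*", terms))  # True, via "marquez"
print(any_term_matches("marq", terms))       # False, no term is exactly "marq"
```

Both wildcard clauses are satisfied by the same term, which is why the document is returned. Without the `*`, `marq` has to equal a whole term.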