One way would be to use Elasticsearch's highlighting feature.
It has a lot of options, but the following example may be enough to achieve your goal.
{
  "query": {
    "match": {
      "myfield": "another"
    }
  },
  "highlight": {
    "fields": {
      "myfield": {
        "type": "plain"
      }
    },
    "pre_tags": [""],
    "post_tags": [""]
  }
}
You may choose to keep the matching text highlighted, or specify empty pre_tags and post_tags to just show the original text.
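If you build the request from code, the same body can be put together with a small helper. This is only a sketch: `build_highlight_query` and its parameters are names I made up for illustration, and it only builds the JSON body, it does not send it to Elasticsearch.

```python
import json


def build_highlight_query(field, text, pre_tag="", post_tag=""):
    """Build a search body that matches `text` in `field` and highlights that field.

    Hypothetical helper: empty tags (the default) return the matching
    values as plain text, without any markup around the matched terms.
    """
    return {
        "query": {"match": {field: text}},
        "highlight": {
            "fields": {field: {"type": "plain"}},
            "pre_tags": [pre_tag],
            "post_tags": [post_tag],
        },
    }


body = build_highlight_query("myfield", "another")
print(json.dumps(body, indent=2))
```

Passing something like `pre_tag="<em>"` and `post_tag="</em>"` instead keeps the matched terms marked up in the returned values.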
The highlight field in the response will only include the values from the original source array that match.
{
  ...
  "hits": {
    "total": 1,
    "max_score": 0.28582606,
    "hits": [
      {
        "_index": "test",
        "_type": "mytype",
        "_id": "AWB6-u6V3-7fA7oZt-aX",
        "_score": 0.28582606,
        "_source": {
          "myfield": [
            "My favorite toy",
            "Another toy for me"
          ]
        },
        "highlight": {
          "myfield": [
            "Another toy for me"
          ]
        }
      }
    ]
  }
}
If more than one value in the array matches, they are all returned.
{
  ...
  "hits": {
    "total": 1,
    "max_score": 0.3938048,
    "hits": [
      {
        "_index": "blah",
        "_type": "mytype",
        "_id": "AWB6-u6V3-7fA7oZt-aX",
        "_score": 0.3938048,
        "_source": {
          "myfield": [
            "My favorite toy",
            "Another toy for me"
          ]
        },
        "highlight": {
          "myfield": [
            "My favorite toy",
            "Another toy for me"
          ]
        }
      }
    ]
  }
}
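On the client side, the matching values can then be read straight from each hit's highlight field. A minimal sketch, assuming the response has already been parsed into a Python dict shaped like the ones above; `matching_values` is a hypothetical helper name, not part of any Elasticsearch client.

```python
def matching_values(response, field):
    """Map each hit's _id to the values of `field` listed in its highlight section.

    Hypothetical helper: hits that have no highlight entry for `field`
    are mapped to an empty list.
    """
    results = {}
    for hit in response.get("hits", {}).get("hits", []):
        results[hit["_id"]] = hit.get("highlight", {}).get(field, [])
    return results


# Sample response copied from the first example above (elided parts left out).
response = {
    "hits": {
        "total": 1,
        "max_score": 0.28582606,
        "hits": [
            {
                "_index": "test",
                "_type": "mytype",
                "_id": "AWB6-u6V3-7fA7oZt-aX",
                "_score": 0.28582606,
                "_source": {"myfield": ["My favorite toy", "Another toy for me"]},
                "highlight": {"myfield": ["Another toy for me"]},
            }
        ],
    }
}

print(matching_values(response, "myfield"))
# {'AWB6-u6V3-7fA7oZt-aX': ['Another toy for me']}
```

The full array is still available under `_source`, so the document itself is left untouched.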
There are certainly other options, as you mentioned, such as using a nested document or a parent-child relationship and obtaining the inner hits from those. Highlighting was the only solution I could find that maintains your original document structure.