Is it possible to filter the buckets returned by a significant_terms aggregation on multiple fields? I am trying to build a recommendation feature with Elasticsearch, based on this Medium article: https://towardsdatascience.com/how-to-build-a-recommendation-engine-quick-and-simple-aec8c71a823e
I store the search data as an array of objects instead of an array of strings, because I need the other fields to filter on in order to get the correct bucket list. Here is the index mapping:
{
  "mapping": {
    "properties": {
      "user": {
        "type": "keyword",
        "ignore_above": 256
      },
      "comic_subscribes": {
        "properties": {
          "genres": {
            "type": "keyword",
            "ignore_above": 256
          },
          "id": {
            "type": "keyword",
            "ignore_above": 256
          },
          "type": {
            "type": "keyword",
            "ignore_above": 256
          }
        }
      }
    }
  }
}
I have two conditions to filter on:
- comic_subscribes.type must be "serial" only
- comic_subscribes.genres must not be "hentai" or "echii"
I have already tried two ways to apply these conditions. First, I tried filtering with a bool query like this:
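To make the intended result concrete, here is a minimal Python sketch of the per-item check I want applied to each entry in comic_subscribes before it can become a bucket. The sample document and the helper name `is_recommendable` are made up for illustration; only the field names and the two conditions come from the mapping and the list above.

```python
# Hypothetical sample document shaped like the mapping above (values are made up).
doc = {
    "user": "user-1",
    "comic_subscribes": [
        {"id": "1", "type": "serial", "genres": ["action"]},
        {"id": "2", "type": "serial", "genres": ["romance", "hentai"]},
        {"id": "3", "type": "oneshot", "genres": ["comedy"]},
        {"id": "4", "type": "serial", "genres": ["comedy"]},
    ],
}

EXCLUDED_GENRES = {"hentai", "echii"}


def is_recommendable(subscription):
    """Return True if one comic_subscribes entry passes both conditions."""
    genres = subscription.get("genres", [])
    if isinstance(genres, str):  # a keyword field may hold one value or a list
        genres = [genres]
    is_serial = subscription.get("type") == "serial"
    has_excluded_genre = bool(EXCLUDED_GENRES.intersection(genres))
    return is_serial and not has_excluded_genre


# IDs that I would expect to be eligible as buckets for this document.
eligible_ids = [s["id"] for s in doc["comic_subscribes"] if is_recommendable(s)]
print(eligible_ids)
```

In other words, the check should apply to each subscribed comic individually, not to the user document as a whole.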
{
  "size": 0,
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "comic_subscribes.id": "1"
          }
        }
      ],
      "minimum_should_match": 1,
      "filter": {
        "term": {
          "comic_subscribes.type": "serial"
        }
      },
      "must_not": [
        {
          "bool": {
            "should": [
              {
                "term": {
                  "comic_subscribes.genres": "hentai"
                }
              },
              {
                "term": {
                  "comic_subscribes.genres": "echii"
                }
              }
            ],
            "minimum_should_match": 1
          }
        }
      ]
    }
  },
  "aggs": {
    "recommendations": {
      "significant_terms": {
        "field": "comic_subscribes.id",
        "exclude": ["1"],
        "min_doc_count": 1,
        "size": 10
      }
    }
  }
}
Second, I tried a filter aggregation:
{
  "size": 0,
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "comic_subscribes.id": "1"
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "aggs": {
    "filtered": {
      "filter": {
        "bool": {
          "filter": {
            "term": {
              "comic_subscribes.type": "serial"
            }
          },
          "must_not": [
            {
              "bool": {
                "should": [
                  {
                    "term": {
                      "comic_subscribes.genres": "hentai"
                    }
                  },
                  {
                    "term": {
                      "comic_subscribes.genres": "echii"
                    }
                  }
                ],
                "minimum_should_match": 1
              }
            }
          ]
        }
      },
      "aggs": {
        "recommendations": {
          "significant_terms": {
            "field": "comic_subscribes.id",
            "exclude": ["1"],
            "min_doc_count": 1,
            "size": 10
          }
        }
      }
    }
  }
}
However, both methods still give me unfiltered comic buckets. Is there another way to apply these conditions? Should I create an additional field that stores a pre-filtered comic list and use it as the source field for the significant_terms aggregation? Thank you very much.
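In case it helps to pin down the behaviour, here is a small Python sketch of what I believe is happening. It assumes that an array of objects mapped as a plain object field (not `nested`) is flattened into one multi-valued field per property. The sample document and the `flatten` helper are hypothetical and only model the idea; they are not Elasticsearch code.

```python
from collections import defaultdict

# Hypothetical document; values are made up for illustration.
doc = {
    "user": "user-1",
    "comic_subscribes": [
        {"id": "1", "type": "serial", "genres": ["action"]},
        {"id": "2", "type": "oneshot", "genres": ["comedy"]},
    ],
}


def flatten(document, field):
    """Collapse an array of objects into per-property value sets,
    which loses the link between values that came from the same object."""
    flat = defaultdict(set)
    for item in document[field]:
        for key, value in item.items():
            values = value if isinstance(value, list) else [value]
            flat[f"{field}.{key}"].update(values)
    return dict(flat)


flat = flatten(doc, "comic_subscribes")

# A document-level term filter on type only asks "does any value equal serial?"
matches_type_filter = "serial" in flat["comic_subscribes.type"]

# The aggregation then sees every id of the matching document,
# including id "2", even though that entry is not a serial.
bucket_candidates = sorted(flat["comic_subscribes.id"]) if matches_type_filter else []
print(bucket_candidates)
```

If this reading is right, both of my attempts only decide which user documents are counted, and every comic_subscribes.id in a counted document still becomes a bucket candidate.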