1

I have a query like this:

xml_db.find(
    {
        'high_performer': {
            '$nin': [some_value]
        },
        'low_performer': {
            '$nin': [some_value]
        },
        'expiration_date': {
            '$gte': datetime.now().strftime('%Y-%m-%d')
        },
        'source': 'some_value'        
    }
)

I have tried to create an index with those fields but getting error:

pymongo.errors.OperationFailure: cannot index parallel arrays [low_performer] [high_performer]

So, how to efficiently run this query?

Ahsanul Haque
  • 10,676
  • 4
  • 41
  • 57

1 Answers1

1

Compound indexing ordering should follow the equality --> sort --> range rule. A good description of this can be found in this response.

This means that the first field in the index would be source, followed by the range filters (expiration_date, low_performer and high_performer).

As you noticed, one of the "performer" fields cannot be included in the index since only a single array can be indexed. You should use your knowledge of the data set to determine which filter (low_performer or high_performer) would be more selective and choose that filter to be included in the index.

Assuming that high_performer is more selective, the only remaining step would be to determine the ordering between expiration_date and high_performer. Again, you should use your knowledge of the data set to make this determination based on selectivity.

Assuming expiration_date is more selective, the index to create would then be:

{ "source" : 1, "expiration_date" : 1, "high_performer" : 1 }
Adam Harrison
  • 3,323
  • 2
  • 17
  • 25