I am trying to query a DocumentDB collection with 500M documents (1 TB).

var t1 = Date.now();
'Total X Records:';
db.runCommand({
    aggregate: "house",
    pipeline: [
        {$project: {'_id': 1, 'foo.x': 1}},
        {$match: {'foo.x.y': {$in: ['2018-12-15']}}},
        {$unwind: '$foo.x'},
        {$match: {'foo.x.y': {$in: ['2018-12-15']}}},
        {$group: {'_id': null, 'count': {$sum: 1}}}
    ],
    cursor: {},
    allowDiskUse: true,
    maxTimeMS: 0
});

var t2 = Date.now();
print("Time in ms: ");
print(t2 - t1);

The same query takes ~1 hr on a MongoDB cluster (10 mongod instances).

When I run the same query on DocumentDB (6 db.r4.xlarge instances), it throws an error after 2 hr:

{ "ok" : 0, "errmsg" : "operation was interrupted", "code" : 11601 }
Time in ms: 
7226913
bye
Stennie
oshaiken

2 Answers

3

Amazon DocumentDB queries time out after 2 hours under the default settings. As of this writing, DocumentDB does not support the maxTimeMS setting.
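
Given the 2-hour limit described above, one possible workaround is to split the count into several smaller aggregations over disjoint `_id` ranges and add up the results. The sketch below is only an illustration under assumptions: `buildChunkPipeline`, `countInChunks`, `boundaries`, and `countRange` are hypothetical names, not DocumentDB APIs, and it assumes `_id` values are ordered so that range boundaries can be picked in advance. Whether each chunk actually finishes within the limit depends on the data and indexes.

```javascript
// Sketch only: all function and parameter names here are hypothetical.

// Build the question's pipeline, restricted to documents with lo <= _id < hi.
function buildChunkPipeline(lo, hi, date) {
  return [
    // Range filter on _id first, so each run only covers one slice.
    { $match: { _id: { $gte: lo, $lt: hi } } },
    { $project: { _id: 1, 'foo.x': 1 } },
    { $match: { 'foo.x.y': { $in: [date] } } },
    { $unwind: '$foo.x' },
    { $match: { 'foo.x.y': { $in: [date] } } },
    { $group: { _id: null, count: { $sum: 1 } } }
  ];
}

// Sum the per-chunk counts. `countRange(pipeline)` is supplied by the caller:
// it must run the pipeline and return a number.
function countInChunks(boundaries, date, countRange) {
  let total = 0;
  for (let i = 0; i + 1 < boundaries.length; i++) {
    total += countRange(buildChunkPipeline(boundaries[i], boundaries[i + 1], date));
  }
  return total;
}

// In the mongo shell, `countRange` could look like this (assumed usage):
//   function (pipeline) {
//     var r = db.house.aggregate(pipeline, { allowDiskUse: true }).toArray();
//     return r.length ? r[0].count : 0;
//   }
```

Passing the runner in as a callback keeps the chunking logic independent of the driver, so the same function can be tried against a stub before running it on the real collection.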

bsanhotra
    It is not in the documentation yet. I have worked directly with the AWS tech team, and they said they would publish it in their notes soon. – bsanhotra Jun 13 '19 at 19:56
0

Currently, $in is not supported.

https://docs.aws.amazon.com/documentdb/latest/developerguide/mongo-apis-aggregation-pipeline.html
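
Whether or not `$in` was the cause here, the query in the question passes only one value to `$in`, so the filter can be written as a plain equality match instead. This is a minimal sketch, assuming `foo.x` is an array of subdocuments with a `y` field as the `$unwind` stage suggests; `buildEqualityPipeline` is a hypothetical helper name.

```javascript
// Sketch: the same count pipeline, using equality instead of a one-element $in.
function buildEqualityPipeline(date) {
  // Equality on 'foo.x.y' matches when any element of foo.x has y === date.
  const byDate = { 'foo.x.y': date };
  return [
    { $project: { _id: 1, 'foo.x': 1 } },
    { $match: byDate },
    { $unwind: '$foo.x' },
    { $match: byDate },
    { $group: { _id: null, count: { $sum: 1 } } }
  ];
}

// Assumed mongo shell usage:
//   db.house.aggregate(buildEqualityPipeline('2018-12-15'), { allowDiskUse: true });
```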

oshaiken
  • I don't see any problems with the query. I think you are misunderstanding what I mean when I say query. – oshaiken Jun 04 '19 at 19:14
  • 2
    All array operators are now supported except for $reduce; see https://docs.aws.amazon.com/documentdb/latest/developerguide/mongo-apis.html – tmcallaghan Jul 27 '21 at 16:42