
I have a document with an embedded array. The array is just a bunch of strings. I recently ran into some performance issues, so I decided to create an index on the array, but index creation fails because the "key is too large to index".

I'm using AWS DocumentDB.

A sample doc looks like this:

{
  _id: (mongoID),
  id: (uuid),
  employees: [(uuid of another user), ...]
}

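I saw the question [Cannot create index in mongodb, "key too large to index"](https://stackoverflow.com/questions/27792706/cannot-create-index-in-mongodb-key-too-large-to-index), but I didn't really see how the solution applies to my question.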

blockhead
  • Can you provide a sample document and the index specification? – Wernfried Domscheit Feb 28 '20 at 09:24
  • Does this answer your question? [Cannot create index in mongodb, "key too large to index"](https://stackoverflow.com/questions/27792706/cannot-create-index-in-mongodb-key-too-large-to-index) – Valijon Feb 28 '20 at 09:25
  • Which MongoDB version do you use? The limit should be removed in most recent version, see [Index Key Limit](https://docs.mongodb.com/manual/reference/limits/#indexes) – Wernfried Domscheit Feb 28 '20 at 09:28
  • @WernfriedDomscheit I'm using AWS DocumentDB, as I wrote in the question. Also added a sample doc. – blockhead Feb 28 '20 at 09:31
  • @Valijon I saw that answer but didn't see how it applied to my specific question. – blockhead Feb 28 '20 at 09:32
  • The linked question seems to be exactly the same problem as yours. Where do you see the difference? – Wernfried Domscheit Feb 28 '20 at 09:34
  • To me it sounds like it's talking about a single text field – blockhead Feb 28 '20 at 09:53
  • > To index a field that holds an array value, MongoDB creates an index key for each element in the array. – blockhead Feb 28 '20 at 09:54
  • Shouldn't that mean each element in the array has to be under the limit? – blockhead Feb 28 '20 at 09:55
  • Try to run this: `db.collection.aggregate([{$unwind:"$employees"},{$group:{_id:"$employees", size:{$first:{"$strLenBytes":"$employees"}}}},{$sort:{size:-1}}])` – Valijon Feb 28 '20 at 10:19
  • The top entry is 36 – blockhead Feb 28 '20 at 10:25
  • Sorry, execute this one: `db.collection.aggregate([{$unwind:"$employees"},{$group:{_id:"$_id", size:{$sum:{"$strLenBytes":"$employees"}}}},{$sort:{size:-1}}])` – Valijon Feb 28 '20 at 10:49
  • Top entry is an array of 180507 entries...which I guess means that there are 180507 entries in that array? I will check that out since that should not be the case – blockhead Feb 28 '20 at 11:07
  • No. It's the total number of bytes across all items in the array, for instance: `["a", "b"] = 2`, `["aa", "bb"] = 4`, etc. Can you please share that document so we can check whether something is wrong with it? – Valijon Feb 28 '20 at 11:16
  • it's definitely the amount in the array (I checked). I have a bug in my code, but I see somebody else just posted something which directly answers my question (in a way that I did not hope) – blockhead Feb 28 '20 at 14:39
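
For readers without access to the collection, the per-document measurement that the second pipeline is described as producing (the summed byte length of the strings in `employees`) can be reproduced in plain JavaScript. This is only a minimal sketch under stated assumptions: it runs in Node.js rather than against DocumentDB, the sample documents and the function names `arrayByteSize` and `sizesByDocument` are hypothetical, and the 2048-byte threshold is the figure quoted in the answers to this question.

```javascript
// Plain-JavaScript sketch (Node.js) of the measurement described in the comments:
// the summed UTF-8 byte length of the strings in each document's array.
// The sample documents and function names are hypothetical.
const ARRAY_INDEX_LIMIT_BYTES = 2048; // limit quoted in the answers

// Rough equivalent of {$sum: {$strLenBytes: "$employees"}} for one document.
function arrayByteSize(strings) {
  return strings.reduce((total, s) => total + Buffer.byteLength(s, "utf8"), 0);
}

// One {id, size} pair per document, largest first (like {$sort: {size: -1}}).
function sizesByDocument(docs) {
  return docs
    .map((doc) => ({ id: doc.id, size: arrayByteSize(doc.employees) }))
    .sort((a, b) => b.size - a.size);
}

const sampleDocs = [
  { id: "small", employees: ["a", "b"] },
  { id: "medium", employees: ["aa", "bb"] },
  // 100 placeholder strings of 36 characters each, standing in for UUIDs.
  { id: "large", employees: Array.from({ length: 100 }, () => "x".repeat(36)) },
];

for (const { id, size } of sizesByDocument(sampleDocs)) {
  const verdict = size > ARRAY_INDEX_LIMIT_BYTES ? "over the limit" : "within the limit";
  console.log(`${id}: ${size} bytes (${verdict})`);
}
```

The two small sample arrays match the `["a", "b"] = 2` and `["aa", "bb"] = 4` examples given in the comments.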

2 Answers


We just updated this functionality: you can now create an index on arrays greater than 2048 bytes and create a compound multi-key index with multiple keys in the same array.

https://aws.amazon.com/about-aws/whats-new/2020/04/amazon-documentdb-adds-improved-multi-key-indexing-capabilities/

Joseph Idziorek

Array Indexing

Amazon DocumentDB indexes an array as a single entry. Arrays larger than 2048 bytes cannot currently be indexed.

https://docs.aws.amazon.com/documentdb/latest/developerguide/functional-differences.html#functional-differences.array-indexing

If `employees` has more than 56 entries of 36 bytes each, you cannot create the index.
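
The figure of 56 appears to come from dividing the 2048-byte limit by the 36-byte entries reported in the comments. A minimal sketch of that arithmetic, assuming every entry is a 36-byte UUID string and that the engine adds no per-element overhead (an assumption; the quoted documentation does not say either way). The function name `maxIndexableEntries` is hypothetical.

```javascript
// Sketch of the arithmetic behind the 56-entry figure.
// Assumption: each array entry is a 36-byte UUID string with no extra overhead.
const LIMIT_BYTES = 2048;      // array size limit quoted from the DocumentDB docs
const UUID_STRING_BYTES = 36;  // largest entry size reported in the comments

// Largest whole number of fixed-size entries that fits under the limit.
function maxIndexableEntries(limitBytes, entryBytes) {
  return Math.floor(limitBytes / entryBytes);
}

const maxEntries = maxIndexableEntries(LIMIT_BYTES, UUID_STRING_BYTES);
console.log(maxEntries);                           // 56
console.log(maxEntries * UUID_STRING_BYTES);       // 2016, still under 2048
console.log((maxEntries + 1) * UUID_STRING_BYTES); // 2052, over 2048
```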

Valijon
  • Creating an index on an array larger than 2048 bytes has been added: https://aws.amazon.com/about-aws/whats-new/2020/04/amazon-documentdb-adds-improved-multi-key-indexing-capabilities/ – Joseph Idziorek Apr 25 '20 at 13:20