2

I have a document structure like follows:

{
    "_id": ...,
    "name": "Document name",
    "properties": {
        "prop1": "something",
        "2ndprop": "other_prop",
        "other3": ["tag1", "tag2"],
    }
}

I can't know the actual field names in properties subdocument (they are given by the application user), so I can't create indexes like properties.prop1. Neither can I know the structure of the field values, they can be single value, embedded document or array.

Is there any practical way to do performant queries to the collection with this kind of schema design?

One option that came to my mind is to add a new field to the document, index it and set used field names per document into this field.

{
    "_id": ...,
    "name": "Document name",
    "properties": {
        "prop1": "something",
        "2ndprop": "other_prop",
        "other3": ["tag1", "tag2"],
    },
    "property_fields": ["prop1", "2ndprop", "other3"]
}

Now I could first run query against property_fields field and after that let MongoDB scan through the found documents to see whether properties.prop1 contains the required value. This is definitely slower, but could be viable.

Ville Mattila
  • 1,343
  • 3
  • 15
  • 28

1 Answers1

2

One way of dealing with this is to use schema like below.

{
    "name" : "Document name",
    "properties" : [
        {
            "k" : "prop1",
            "v" : "something"
        },
        {
            "k" : "2ndprop",
            "v" : "other_prop"
        },
        {
            "k" : "other3",
            "v" : "tag1"
        },
        {
            "k" : "other3",
            "v" : "tag2"
        }
    ]
}

Then you can index "properties.k" and "properties.v" for example like this:

db.foo.ensureIndex({"properties.k": 1, "properties.v": 1})
zero323
  • 322,348
  • 103
  • 959
  • 935
  • This solution looks very promiainf. Is there a reason why other3 was in two different subdocuments instead ["tag2", "tag3"]? – Ville Mattila Nov 26 '13 at 17:08
  • Because it is easier to work with consistent schema. {k: "other3", v: ["tag1", "tag2"]} should work as well although it will be harder to perform some operations like retrieving individual values using aggregation pipeline. – zero323 Nov 26 '13 at 20:27
  • Eh - why did my "promising" change to "promiainf" in my original comment? Autocomplete, I like it. ;) Thanks for your further comment, this is definitely a great answer to my question. – Ville Mattila Nov 26 '13 at 21:19
  • One more question came to my mind about this answer. If I do `db.foo.ensureIndex({"properties.k": 1, "properties.v": 1})`, what happens with the query `{"properties.k":"prop1", "properties.v":"tag2"}`. This matches the document, even tag2 is not value of field `prop1`. I may need to explicitly check the document contents from search results? – Ville Mattila Dec 19 '13 at 06:48
  • Well, it seems that `$elemMatch` is what is needed here. Seems that MongoDB is able to use the mentioned index structure in that case too. – Ville Mattila Dec 19 '13 at 07:02