Removing properties completely from documents is not a trivial thing. The basics are that the server itself has not had any way of doing this prior to MongoDB 3.4 and the introduction of $replaceRoot
, which essentially allows an expression to be returned as the document context.
Even with that addition it's somewhat impractical to do so without further features of $objectToArray
and $arrayToObject
as introduced in MongoDB 3.4.4. But to run through the cases.
Working with a quick sample
{ "_id" : ObjectId("59adff0aad465e105d91374c"), "a" : 1 }
{ "_id" : ObjectId("59adff0aad465e105d91374d"), "a" : {} }
Conditionally return root object
db.junk.aggregate([
{ "$replaceRoot": {
"newRoot": {
"$cond": {
"if": { "$ne": [ "$a", {} ] },
"then": "$$ROOT",
"else": { "_id": "$_id" }
}
}
}}
])
That's a pretty simple principle and can in fact be applied to any nested property to remove it's sub-keys but would require various levels of nesting $cond
or even $switch
to apply possible conditions. The $replaceRoot
of course is needed for "top level" removal since it's the only way to conditionally express top level keys to return.
So whilst you can in theory use $cond
or $switch
to decide what to return, it's generally cumbersome and you would want something more flexible.
Filter the Empty Objects
db.junk.aggregate([
{ "$replaceRoot": {
"newRoot": {
"$arrayToObject": {
"$filter": {
"input": { "$objectToArray": "$$ROOT" },
"cond": { "$ne": [ "$$this.v", {} ] }
}
}
}
}}
])
This is where $objectToArray
and $arrayToObject
come into use. Instead of writing out the conditions for every possibly key we just convert the object contents into an "array" and apply $filter
on the array entries to decide what to keep.
The $objectToArray
translates any object into an array of documents representing each property as "k"
for the name of the key and "v"
for the value from that property. Since these are now accessible as "values", then you can use methods like $filter
to inspect the each array entry and discard the unwanted ones.
Finally $arrayToObject
takes the "filtered" content and translates those "k"
and "v"
values back into property names and values as a resulting object. In this way, the "filter" conditions removes any properties from the result object that did not meet the criteria.
A Return to $cond
db.junk.aggregate([
{ "$project": {
"a": { "$cond": [{ "$eq": [ "$a", {} ] }, "$$REMOVE", "$a" ] }
}}
])
MongoDB 3.6 introduces a new player with the $$REMOVE
constant. This is a new feature that can be applied with $cond
in order to decide whether or not to show the property at all. So that is another approach when of course the release is available.
In all those above cases the "a"
property is not returned when the value is the empty object that we wanted to test for removal.
{ "_id" : ObjectId("59adff0aad465e105d91374c"), "a" : 1 }
{ "_id" : ObjectId("59adff0aad465e105d91374d") }
More Complex Structures
Your specific ask here is for data containing nested properties. So continuing on from the outlined approaches we can work with demonstrating how that is done.
First some sample data:
{ "_id" : ObjectId("59ae03bdad465e105d913750"), "a" : 1, "info" : { "type" : 1, "qty" : 2, "detailed" : { "desc" : "this thing" } } }
{ "_id" : ObjectId("59ae03bdad465e105d913751"), "a" : 2, "info" : { "type" : 2, "qty" : 3, "detailed" : { "desc" : { } } } }
{ "_id" : ObjectId("59ae03bdad465e105d913752"), "a" : 3, "info" : { "type" : 3, "qty" : { }, "detailed" : { "desc" : { } } } }
{ "_id" : ObjectId("59ae03bdad465e105d913753"), "a" : 4, "info" : { "type" : { }, "qty" : { }, "detailed" : { "desc" : { } } } }
Applying the filter method
db.junk.aggregate([
{ "$replaceRoot": {
"newRoot": {
"$arrayToObject": {
"$filter": {
"input": {
"$concatArrays": [
{ "$filter": {
"input": { "$objectToArray": "$$ROOT" },
"cond": { "$ne": [ "$$this.k", "info" ] }
}},
[
{
"k": "info",
"v": {
"$arrayToObject": {
"$filter": {
"input": { "$objectToArray": "$info" },
"cond": {
"$not": {
"$or": [
{ "$eq": [ "$$this.v", {} ] },
{ "$eq": [ "$$this.v.desc", {} ] }
]
}
}
}
}
}
}
]
]
},
"cond": { "$ne": [ "$$this.v", {} ] }
}
}
}
}}
])
This needs more complex handling because of the nested levels. In the main case here you need to look at the "info"
key here independently and remove any sub-properties that do not qualify first. Since you need to return "something", we basically then need to remove the "info"
key itself when all of it's inner properties are removed. This is the reason for the nested filter operations on each set of results.
Applying $cond with $$REMOVE
Where available this would at first seem a more logical choice, so it helps to look at this from the most simplified form first:
db.junk.aggregate([
{ "$addFields": {
"info.type": {
"$cond": [
{ "$eq": [ "$info.type", {} ] },
"$$REMOVE",
"$info.type"
]
},
"info.qty": {
"$cond": [
{ "$eq": [ "$info.qty", {} ] },
"$$REMOVE",
"$info.qty"
]
},
"info.detailed.desc": {
"$cond": [
{ "$eq": [ "$info.detailed.desc", {} ] },
"$$REMOVE",
"$info.detailed.desc"
]
}
}}
])
But then you need to look at the output this actually produces:
/* 1 */
{
"_id" : ObjectId("59ae03bdad465e105d913750"),
"a" : 1.0,
"info" : {
"type" : 1.0,
"qty" : 2.0,
"detailed" : {
"desc" : "this thing"
}
}
}
/* 2 */
{
"_id" : ObjectId("59ae03bdad465e105d913751"),
"a" : 2.0,
"info" : {
"type" : 2.0,
"qty" : 3.0,
"detailed" : {}
}
}
/* 3 */
{
"_id" : ObjectId("59ae03bdad465e105d913752"),
"a" : 3.0,
"info" : {
"type" : 3.0,
"detailed" : {}
}
}
/* 4 */
{
"_id" : ObjectId("59ae03bdad465e105d913753"),
"a" : 4.0,
"info" : {
"detailed" : {}
}
}
Whilst the other keys are removed the "info.detailed"
still stays around because there is nothing that actually tests at this level. In fact you simply cannot express this in simple terms, so the only way to work around this is to evaluate the object as an expression and then apply additional filtering an conditions on each level of output to see where the empty objects still reside, and remove them:
db.junk.aggregate([
{ "$addFields": {
"info": {
"$let": {
"vars": {
"info": {
"$arrayToObject": {
"$filter": {
"input": {
"$objectToArray": {
"type": { "$cond": [ { "$eq": [ "$info.type", {} ] },"$$REMOVE", "$info.type" ] },
"qty": { "$cond": [ { "$eq": [ "$info.qty", {} ] },"$$REMOVE", "$info.qty" ] },
"detailed": {
"desc": { "$cond": [ { "$eq": [ "$info.detailed.desc", {} ] },"$$REMOVE", "$info.detailed.desc" ] }
}
}
},
"cond": { "$ne": [ "$$this.v", {} ] }
}
}
}
},
"in": { "$cond": [ { "$eq": [ "$$info", {} ] }, "$$REMOVE", "$$info" ] }
}
}
}}
])
That approach as with the plain $filter
method actually removes "all" empty objects from the results:
/* 1 */
{
"_id" : ObjectId("59ae03bdad465e105d913750"),
"a" : 1.0,
"info" : {
"type" : 1.0,
"qty" : 2.0,
"detailed" : {
"desc" : "this thing"
}
}
}
/* 2 */
{
"_id" : ObjectId("59ae03bdad465e105d913751"),
"a" : 2.0,
"info" : {
"type" : 2.0,
"qty" : 3.0
}
}
/* 3 */
{
"_id" : ObjectId("59ae03bdad465e105d913752"),
"a" : 3.0,
"info" : {
"type" : 3.0
}
}
/* 4 */
{
"_id" : ObjectId("59ae03bdad465e105d913753"),
"a" : 4.0
}
Doing it all in Code
So everything here really depends on latest features or indeed "coming features" to be available in the MongoDB version you are using. Where these are not available the alternate approach is to simply remove the empty objects from the results returned by the cursor.
It's often the most sane thing to do, and really is all you require unless the aggregation pipeline needs to continue past the point where the fields are being removed. Even then, you probably should be logically working around that and leave the final results to cursor processing.
As JavaScript for the shell you can use the following approach, and the principles essentially stay the same no matter which actual language implementation:
db.junk.find().map( d => {
let info = Object.keys(d.info)
.map( k => ({ k, v: d.info[k] }))
.filter(e => !(
typeof e.v === 'object' &&
( Object.keys(e.v).length === 0 || Object.keys(e.v.desc).length === 0 )
))
.reduce((acc,curr) => Object.assign(acc,{ [curr.k]: curr.v }),{});
delete d.info;
return Object.assign(d,(Object.keys(info).length !== 0) ? { info } : {})
})
Which is pretty much the native language way of stating the same as the examples above being that where one of the expected properties contains an empty object, remove that property from the output completely.