0

I have a db which contains documents, i.e.:

{ id: "first_document", document: [
{page: 1, content: "text"},
{page: 2, content: "text"},
{page: 3, content: "text"}]}, 

{ id: "second_document", document: [
{page: 1, content: "text"}, 
{page: 2, content: "text"}]}

and so on, with multiple documents of various sizes.

I need to find the average and the total number of pages across all documents.

I have done a few tries with different pipelines but can't seem to find the solution. Example of a basic test I did:

[{ "$unwind": "$document" }, 
{ "$group":{"_id": None,"average_pages": {"$avg" : "$document.page"}}}]

but this of course makes the average (and the sum) across all entries.

How do I make the average and sum only over the maximum page number? (in the example provided, the avg and the sum only between 3 and 2).

Expected output for this example:

avg = 2.5

sum = 5

FMarazzi
  • 583
  • 1
  • 5
  • 14

1 Answers1

0

Try with this

collection.aggregate([
{
 "$unwind": {
           path:"$document" ,
           preserveNullAndEmptyArrays:true
     } 
},
{
$group:{
    _id:"$_id",
    totalPage:{$sum:1},
    average:{$avg:"$document.page"}

    }
}
])
Ashwanth Madhav
  • 1,084
  • 1
  • 9
  • 21