12

The idea is to return a kind of row number to a mongodb aggregate command/ pipeline. Similar to what we've in an RDBM.

It should be a unique number, not important if it matches exactly to a row/number.

For a query like:

[ { $match: { "author" : { $ne: 1 } } }, { $limit: 1000000 } ]

I'd like to return:

{ "rownum" : 0, "title" : "The Banquet", "author" : "Dante", "copies" : 2 }
{ "rownum" : 1, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 }
{ "rownum" : 2, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
{ "rownum" : 3, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 }
{ "rownum" : 4, "title" : "Iliad", "author" : "Homer", "copies" : 10 }

Is it possible to generate this rownum in mongodb?

Xavier Guihot
  • 54,987
  • 21
  • 291
  • 190
ic3
  • 7,917
  • 14
  • 67
  • 115
  • 1
    No it is not possible. You possibly would be better off explaining *"why you think you need this"*. It's typically used in SQL implemations for windowed paged results, such as when items are sorted. There are likely other options if you rather explained your use case to solve. – Blakes Seven Feb 03 '16 at 10:35
  • It's an improvement - using string for id is really a bad idea when data is massive, > 100mio, on a BI tool that takes data from a MongoDB datasource. No really workaround unless... – ic3 Feb 03 '16 at 10:39
  • Adding numbers to rows in MongoDB ( which you cannot do anyway ) would mean passing though all results/data ( presumably before selecting a "page" ) and assigning one at a time. So in no way would is be possibly an improvement due to the way the architecture does things. I was giving you the option. 1. Accept the "No it cannot be done". 2. Explain your use case and possibly be given an alternate approach that is better than what you have been able to think of so far. As I see it, one is a dead end while the other might just go somewhere. – Blakes Seven Feb 03 '16 at 10:43

3 Answers3

11

Not sure about the performance in big queries, but this is at least an option.

You can add your results to an array by grouping/pushing and then unwind with includeArrayIndex like this:

[
  {$match: {author: {$ne: 1}}},
  {$limit: 10000},
  {$group: {
    _id: 1,
    book: {$push: {title: '$title', author: '$author', copies: '$copies'}}
  }},
  {$unwind: {path: '$book', includeArrayIndex: 'rownum'}},
  {$project: {
    author: '$book.author',
    title: '$book.title',
    copies: '$book.copies',
    rownum: 1
  }}
]

Now, if your database contains a big amount of records, and you intend to paginate, you can use the $skip stage and then $limit 10 or 20 or whatever you want to display per page, and just add the number from the $skip stage to your rownum and you'll get the real position without having to push all your results to enumerate them.

zag2art
  • 4,869
  • 1
  • 29
  • 39
Marc Enriquez
  • 146
  • 1
  • 3
  • 3
    That looks terribly inefficient, but it's what I needed and answers the question. – villasv May 05 '17 at 00:45
  • 1
    Careful, https://docs.mongodb.com/manual/reference/operator/aggregation/group/#pipe._S_group - $group does not order its output documents. – barrypicker Oct 22 '19 at 18:41
5

Starting in Mongo 5, it's a perfect use case for the new $setWindowFields aggregation operator and its $documentNumber operation:

// { x: "a" }
// { x: "b" }
// { x: "c" }
// { x: "d" }
db.collection.aggregate([
  { $setWindowFields: {
    sortBy: { _id: 1 },
    output: { rowNumber: { $documentNumber: {} } }
  }}
])
// { x: "a", rowNumber: 1 }
// { x: "b", rowNumber: 2 }
// { x: "c", rowNumber: 3 }
// { x: "d", rowNumber: 4 }

$setWindowFields allows us to work for each document with the knowledge of previous or following documents. Here we just need the information of the place of the document in the whole collection (or aggregation intermediate result), as provided by $documentNumber.

Note that we sort by _id because the sortBy parameter is required, but really, since you don't care about the ordering of your rows, it could be anything you'd like.

Xavier Guihot
  • 54,987
  • 21
  • 291
  • 190
3

Another way would be to keep track of row_number using "$function"

[{ $match: { "author" : { $ne: 1 } }}  , { $limit: 1000000 },
{
    $set: {
      "rownum": {
        "$function": {
          "body": "function() {try {row_number+= 1;} catch (e) {row_number= 0;}return row_number;}",
          "args": [],
          "lang": "js"
        }
      }
    }
  }]

I am not sure if this can mess up something though!

Saurav Shah
  • 81
  • 1
  • 7