Let's say we have the following data model for a hypothetical forum:

// Post
{
    "_id": 1,
    "type": "post",
    "text": "",
    "timestamp": 1,
}

// Reply
{
    "_id": 2,
    "post_id": 1,
    "type": "reply",
    "text": "",
    "timestamp": 2,
}

  1. All replies are flat (there are no replies to replies; every reply is to a post).
  2. The stream of past posts and replies is unbounded.

Ideally, I want to find the most recent threads without any replies.


So far I have these map/reduce functions:

map: function(doc) {
    if (doc.type == "post") {
        emit(doc._id, 0);
    }
    if (doc.type == "reply") {
        emit(doc.post_id, 1);
    }
},
reduce: function(keys, vals, rereduce) {
    return sum(vals);
}

If I run this and group by key, it gives me a list of all threads, where value is 0 for unreplied ones. So far, so good.
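
To make the grouped output concrete, here is a minimal plain-JavaScript simulation of the view above. It does not talk to CouchDB; the sample documents and the helper names `mapDoc` and `groupedReduce` are made up for illustration, and the grouping only imitates what `group=true` would return.

```javascript
// Hypothetical sample data: post 1 has one reply, post 3 has none.
const docs = [
  { _id: 1, type: "post", text: "", timestamp: 1 },
  { _id: 2, post_id: 1, type: "reply", text: "", timestamp: 2 },
  { _id: 3, type: "post", text: "", timestamp: 3 },
];

// Same logic as the map function, but collecting rows instead of calling emit().
function mapDoc(doc) {
  const rows = [];
  if (doc.type === "post") rows.push([doc._id, 0]);
  if (doc.type === "reply") rows.push([doc.post_id, 1]);
  return rows;
}

// Imitates a grouped reduce: sum the values for each distinct key,
// then return the rows ordered by key.
function groupedReduce(allDocs) {
  const totals = new Map();
  for (const doc of allDocs) {
    for (const [key, value] of mapDoc(doc)) {
      totals.set(key, (totals.get(key) || 0) + value);
    }
  }
  return [...totals]
    .map(([key, value]) => ({ key, value }))
    .sort((a, b) => a.key - b.key);
}

const grouped = groupedReduce(docs);
console.log(grouped); // post 1 has value 1, post 3 has value 0
```

The rows come back ordered by post ID, not by timestamp, and the zero-valued rows are mixed in with the rest, which is exactly the problem described next.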

But,

  1. given that the stream is theoretically unbounded, I cannot sort or filter it in the application or in CouchDB's list/filter functions, because they only see the returned (and already truncated) dataset;
  2. changing the key or group level destroys the grouping I want; the post ID has to be the group key.

Question: How do I find the N most recent threads with no replies? In other words, how do I sort the reduced view by the timestamp of the post?

Easier question: How do I find out whether there are any threads with no replies at all (a boolean answer)? This implies filtering the reduced view so that only zero-valued rows are left.

Alex B

1 Answer

I think the easier approach is to add an extra field, reply_count, to each post.
It defaults to zero, and whenever a reply to that post is saved
you increment it (reply_count += 1).
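
One possible way to do the increment is a CouchDB update function, sketched below. The name `bumpReplyCount` is hypothetical, and this assumes the function is stored in a design document and called with the post's ID each time a reply is saved. Doing the increment in application code would work just as well.

```javascript
// Sketch of a CouchDB update function (the name is hypothetical).
// CouchDB passes in the existing document (or null if the ID is unknown)
// and the request object; the function returns [docToSave, response].
function bumpReplyCount(doc, req) {
  if (!doc || doc.type !== "post") {
    // Returning null as the first element means nothing is saved.
    return [null, "not a post"];
  }
  // Treat a missing field as zero so older posts still work.
  doc.reply_count = (doc.reply_count || 0) + 1;
  return [doc, "reply_count is now " + doc.reply_count];
}
```

Note that the reply document and the counter are two separate writes, so the application should be prepared to retry if the second one fails.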

Then, to search for posts with zero replies,
the map function can be as simple as:

function (doc) {
  if (doc.type == "post")
  {
    emit([doc.reply_count, doc.timestamp], null);
  }
}
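
To illustrate why this key works, here is a small plain-JavaScript simulation of the view's ordering. The sample posts and the helper `newestUnreplied` are invented for the example. It only imitates how CouchDB orders `[reply_count, timestamp]` keys (element by element) and how a descending read over the `reply_count = 0` range behaves.

```javascript
// Hypothetical sample posts carrying the proposed reply_count field.
const posts = [
  { _id: 1, type: "post", reply_count: 2, timestamp: 1 },
  { _id: 3, type: "post", reply_count: 0, timestamp: 3 },
  { _id: 5, type: "post", reply_count: 0, timestamp: 7 },
  { _id: 6, type: "post", reply_count: 1, timestamp: 9 },
];

// Imitates the view: one row per post, keyed by [reply_count, timestamp].
const rows = posts.map((p) => ({ id: p._id, key: [p.reply_count, p.timestamp] }));

// Array keys of numbers are compared element by element.
rows.sort((a, b) => a.key[0] - b.key[0] || a.key[1] - b.key[1]);

// Imitates a descending read over the reply_count = 0 range:
// unreplied posts only, newest first, at most n of them.
function newestUnreplied(n) {
  return rows
    .filter((r) => r.key[0] === 0)
    .reverse()
    .slice(0, n)
    .map((r) => r.id);
}

console.log(newestUnreplied(2)); // IDs of the two newest unreplied posts
```

Because all unreplied posts share the key prefix 0, they sit next to each other in the index, already sorted by timestamp, so no reduce step is needed.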

Then query the view with:

descending=true
startkey=[0,9999999999]
endkey=[0,0]
include_docs=true
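
As a rough sketch, the query string could be assembled like this. The helper `unrepliedQuery` is hypothetical, and `limit` is my addition to get only the N most recent posts (with `limit=1`, an empty result answers the boolean question). The keys have to be JSON-encoded and URL-escaped.

```javascript
// Builds the query string for the view above.
// `n` caps the number of rows via CouchDB's `limit` parameter.
function unrepliedQuery(n) {
  const params = {
    descending: true,
    // With descending=true the range is walked from the high key down.
    startkey: JSON.stringify([0, 9999999999]),
    endkey: JSON.stringify([0, 0]),
    include_docs: true,
    limit: n,
  };
  return Object.entries(params)
    .map(([name, value]) => name + "=" + encodeURIComponent(value))
    .join("&");
}

console.log(unrepliedQuery(10));
```

The result is appended after `?` to the view URL, for example something like `/forum/_design/posts/_view/unreplied` (database, design document and view names here are placeholders). If you would rather not hard-code a maximum timestamp, `[0, {}]` should also work as the upper key, since objects sort after numbers in CouchDB's collation.
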
ajreal