Let's say we have the following data model for a hypothetical forum:
// Post
{
  "_id": 1,
  "type": "post",
  "text": "",
  "timestamp": 1
}
// Reply
{
  "_id": 2,
  "post_id": 1,
  "type": "reply",
  "text": "",
  "timestamp": 2
}
- All replies are flat (there are no replies to replies, all replies are to a post)
- The stream of past posts and replies is unbounded
Ideally, I want to find the most recent threads without any replies.
So far I have these map/reduce functions:
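map: function(doc) {
  // Every post contributes 0 under its own ID.
  if (doc.type === "post") {
    emit(doc._id, 0);
  }
  // Every reply contributes 1 under the ID of the post it belongs to.
  if (doc.type === "reply") {
    emit(doc.post_id, 1);
  }
},
reduce: function(keys, vals, rereduce) {
  // Summing is safe for both reduce and rereduce passes.
  return sum(vals);
}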
If I query this view grouped by key (group=true), I get one row per thread, where the value is the number of replies, so it is 0 for threads without replies. So far, so good.
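To make the grouped output concrete, here is a minimal sketch that simulates the view in plain JavaScript outside CouchDB. The sample documents, the local emit() and sum() stand-ins, and the groupedView() helper are assumptions made for illustration only; they are not CouchDB's view server.

```javascript
// Hypothetical sample documents following the data model above.
const docs = [
  { _id: 1, type: "post", text: "", timestamp: 1 },
  { _id: 2, post_id: 1, type: "reply", text: "", timestamp: 2 },
  { _id: 3, type: "post", text: "", timestamp: 3 },
];

// Local stand-ins for CouchDB's emit() and sum() (assumptions, not the real view server).
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}
function sum(vals) {
  return vals.reduce((a, b) => a + b, 0);
}

// The same map and reduce functions as in the view.
function map(doc) {
  if (doc.type === "post") {
    emit(doc._id, 0);
  }
  if (doc.type === "reply") {
    emit(doc.post_id, 1);
  }
}
function reduce(keys, vals, rereduce) {
  return sum(vals);
}

// Hypothetical helper: roughly what querying with group=true returns,
// i.e. one reduced row per distinct key, ordered by key.
function groupedView(documents) {
  rows.length = 0;
  documents.forEach(map);
  const groups = new Map();
  for (const { key, value } of rows) {
    if (!groups.has(key)) {
      groups.set(key, []);
    }
    groups.get(key).push(value);
  }
  return [...groups.entries()]
    .sort(([a], [b]) => a - b)
    .map(([key, vals]) => ({ key, value: reduce(null, vals, false) }));
}

const result = groupedView(docs);
console.log(result);
// [ { key: 1, value: 1 }, { key: 3, value: 0 } ]
```

Post 3 is the unreplied thread here (value 0), but the rows come back ordered by post ID and carry no timestamp, which is exactly the problem described next.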
But,
- given that the stream is theoretically unbounded, I cannot sort or filter it in the application or CouchDB's list/filter functions, because they apply to the returned (and already truncated) dataset;
- changing the key or the group level destroys the grouping I want: the post ID has to be the group key.
Question: How do I find the N most recent threads with no replies? In other words, how do I sort the reduced view by the timestamp of the post?
Easier question: How do I find out whether there are any threads with no replies at all (a boolean answer)? This implies filtering the reduced view so that only zero-valued rows are left.