1

I've got 3 types of documents in my db:

{
param: "a",
timestamp: "t"
} (Type 1)

{
param: "b",
partof: "a"
} (Type 2)

{
param: "b",
timestamp: "x"
} (Type 3)

(I can't alter the layout...;-( )

Type 1 defines a start timestamp, it's like the start event. A Type 1 is connected to several Type 3 docs by Type 2 documents.

I want to get the latest Type 3 (highest timestamp) and the corresponding type 1 document.

How may I organize my Map/Reduce?

Ladislav Mrnka
  • 360,892
  • 59
  • 660
  • 670
philipp
  • 93
  • 8
  • I'm having a very hard time wrapping my mind around your data structure, could you include some more realistic test data? I would really like to help, but I don't really understand enough to be much good. – Dominic Barnes Mar 30 '11 at 14:11

2 Answers2

0

Easy. For highly relational data, use a relational database.

JasonSmith
  • 72,674
  • 22
  • 123
  • 149
  • Nice answer. Isn't it possible to do this with CouchDB? I know that it isn't optimal, but I want to learn ;-) – philipp Mar 31 '11 at 11:35
  • Well, I'm having a better day today so I'll make my best CouchDB shot in a separate answer. For learning purposes, that's great! But still, when I say "I cannot change the data structure," that is a red-flag reminding me to consider a schema and SQL database. – JasonSmith Apr 01 '11 at 04:14
0

As user jhs stated before me, your data is relational, and if you can't change it, then you might want to reconsider using CouchDB.

By relational we mean that each "type 1" or "type 3" document in your data "knows" only about itself, and "type 2" documents hold the knowledge about the relation between documents of the other types. With CouchDB, you can only index by fields in the documents themselves, and going one level deeper when querying using includedocs=true. Thus, what you asked for cannot be achieved with a single CouchDB query, because some of the desired data is two levels away from the requested document.

Here is a two-query solution:

{
    "views": {
        "param-by-timestamp": {
            "map": "function(doc) { if (doc.timestamp) emit(doc.timestamp, [doc.timestamp, doc.param]); }",
            "reduce": "function(keys, values) { return values.reduce(function(p, c) { return c[0] > p[0] ? c : p }) }"
        },      
        "partof-by-param": {
            "map": "function(doc) { if (doc.partof) emit(doc.param, doc.partof); }"
        }       
    }   
}

You query it first with param-by-timestamp?reduce=true to get the latest timestamp in value[0] and its corresponding param in value[1], and then query again with partof-by-param?key="<what you got in previous query>". If you need to fetch the full documents together with the timestamp and param, then you will have to play with includedocs=true and provide with the correct _doc values.

Amir
  • 631
  • 6
  • 17