1

I am working on a messaging system using node.js + cradle and couchdb.

When a user pulls a list of their messages, I need to pull the online status of the user that sent them the message. The online status is stored in the user document for each registered user, and the message info is stored in a separate document.

Here is the only way I can manage to do what I need, but it's hugely inefficient:

privatemessages/all key = username of the message recipient

db.view('privatemessages/all', {"key":username}, function (err, res) {
    res.forEach(function (rowA) {
        db.view('users/all', {"key":rowA.username}, function (err, res) {
            res.forEach(function (row) {
                result.push({onlinestatus:row.onlinestatus, messagedata: rowA});
            });
        });
    });

    response.end(JSON.stringify(result));
});

Can someone tell me the correct way of doing this?

Thank you

yojimbo87
Brian

3 Answers

1

I think your system could use an in-memory hashmap such as memcached. Each user status entry would expire after a time limit. The mapping would be [user -> lasttimeseen].

If the hashmap contains the user, then the user is online. On certain actions, refresh the lasttimeseen.

Then instead of pinging the whole world each time, just query the map itself and return the result.
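
A minimal sketch of that idea, using a plain in-process Map instead of memcached so it stays self-contained. The names `touch`, `isOnline`, and `ONLINE_TTL_MS`, as well as the five-minute timeout, are assumptions made for illustration; with memcached the expiry would be handled by the cache itself.

```javascript
// Hypothetical presence map: username -> time the user was last seen (ms).
var ONLINE_TTL_MS = 5 * 60 * 1000; // assumed timeout, tune as needed
var lastSeen = new Map();

// Call this on any action that proves the user is active.
// "now" is optional and only exists to make the sketch easy to test.
function touch(username, now) {
    lastSeen.set(username, now === undefined ? Date.now() : now);
}

// A user is online if an unexpired entry exists in the map.
function isOnline(username, now) {
    if (!lastSeen.has(username)) {
        return false;
    }
    var current = now === undefined ? Date.now() : now;
    if (current - lastSeen.get(username) > ONLINE_TTL_MS) {
        // the entry has expired, so drop it
        lastSeen.delete(username);
        return false;
    }
    return true;
}
```

The message list can then be decorated with `isOnline(sender)` without touching the database for status at all.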

Nicolas Modrzyk
1

Your code could return an empty result because you send the response before the user statuses have been fetched from the DB. Another problem is that if I received multiple messages from the same user, the lookup for his status would be duplicated. Below is a function which first fetches the messages from the DB, grouping them by sender to avoid duplicate lookups, and then gets the statuses.

function getMessages(username, callback) {
    // "buffer" for the senders of the messages, keyed by sender name
    var users = {};

    // get all the messages whose recipient is "username"
    db.view('privatemessages/all', {"key":username}, function (errA, resA) {
        if (errA) {
            return callback(errA);
        }

        // for each message
        resA.forEach(function (rowA) {
            var user = users[rowA.username];
            // if the sender doesn't exist yet, add him to the users list with the
            // current message; otherwise add the message to the existing entry
            if (!user) {
                users[rowA.username] = {
                    // I guess this is the name of the sender
                    name: rowA.username,
                    // his current status will be filled in later
                    status: "",
                    // in this case I may only need the content, so there is probably
                    // no need to insert the whole message into the array
                    messages: [rowA]
                };
            } else {
                user.messages.push(rowA);
            }
        });

        // now I have all the senders with their messages
        // and need to get their statuses
        var names = Object.keys(users);
        // number of status lookups still pending - used to decide when to call
        // the callback, because the lookups are asynchronous
        var pending = names.length;
        var failed = false;

        // no messages means there is nothing to look up
        if (pending === 0) {
            return callback(null, users);
        }

        // forEach gives every lookup its own "name"; a shared loop variable
        // would already point at the last user when the callbacks fire
        names.forEach(function (name) {
            // assuming that user documents have keys based on their names
            db.get(name, function (err, doc) {
                if (failed) {
                    return;
                }
                if (err) {
                    failed = true;
                    return callback(err);
                }
                // assign user status (field name as used in the question)
                users[name].status = doc.onlinestatus;
                // when the status of the last user has arrived, it's time to
                // execute the callback and return the results
                if (--pending === 0) {
                    callback(null, users);
                }
            });
        });
    });
}

...

getMessages(username, function(err, result) {
    if (err) {
        response.statusCode = 500;
        return response.end();
    }
    response.end(JSON.stringify(result));
});

Although CouchDB is a great document database, you should be careful with frequent updates of existing documents, because it creates an entirely new document version after each update (this is because of its MVCC model, which is used to achieve high availability and data durability). The consequence of this behavior is higher disk space consumption (more data/updates, more disk space needed), so you should watch it and run database compaction accordingly.

yojimbo87
  • Thanks for the response, this cleared up a lot for me. Is there any way to have couchdb produce a document that includes the user data in the original call, instead of pulling the data for each unique user? – Brian Aug 11 '11 at 19:36
  • I think it could be possible to include some of the user data into your message view (maybe either with reduce or [view collation](http://wiki.apache.org/couchdb/View_collation)), but I'm not sure if those data would be updated if the original user document is changed. I would probably rather stay with standalone queries for each unique user. If you need speed you can combine CouchDB with Redis for example which is a very good combination of speed and reliable data persistence. – yojimbo87 Aug 11 '11 at 20:01
  • Thanks again, I think your method will work, it's a lot less hard on the database than I thought it would be. I do have one more problem though: it seems like the code above will complete the callback before it has grabbed the data, is this correct? – Brian Aug 12 '11 at 02:06
  • If you mean the code which I posted in this answer it should execute the callback after the data are loaded from DB. – yojimbo87 Aug 12 '11 at 07:15
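
On the question of getting the user data in the original call: one option that might fit is CouchDB's linked documents feature, where a map function emits a value containing an `_id`, and querying the view with `include_docs=true` then returns the document with that id instead of the one being mapped. The sketch below is only an illustration under assumptions: the view name, the `type`, `recipient`, `sender`, and `text` fields are hypothetical, user documents are assumed to have the username as their `_id`, and the `emit` stub stands in for the one CouchDB provides so the map function can be tried locally.

```javascript
// Rows collected by the stand-in emit(); CouchDB supplies the real emit().
var emitted = [];
function emit(key, value) {
    emitted.push({ key: key, value: value });
}

// Hypothetical map function for a view such as "privatemessages/with_sender".
// Assumes message docs look like { type: "message", recipient, sender, text }.
function mapMessageWithSender(doc) {
    if (doc.type === 'message') {
        // Emitting a value with an _id makes include_docs=true attach that
        // document (the sender's user doc) to the row, not the message doc.
        emit(doc.recipient, { _id: doc.sender, text: doc.text });
    }
}

// Local dry run with two sample documents.
mapMessageWithSender({ _id: 'm1', type: 'message', recipient: 'brian', sender: 'alice', text: 'hi' });
mapMessageWithSender({ _id: 'alice', type: 'user', onlinestatus: 'online' });
```

Because the linked document is looked up at query time, the status in each row should reflect the current user document rather than a copy stored in the view.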
0

I'm reminded of this presentation:

Databases Suck for Messaging

And its quote from Tim O'Reilly:

"On monday friendfeed polled flickr nearly 3 million times for 45000 users, only 6K of whom were logged in. Architectural mismatch."

As pointed out in the other answers, updates in CouchDB are expensive and should be avoided if possible, and there's probably no need for this data to be persistent. A cache or messaging system may solve your problem more elegantly and more efficiently.

MetaThis