I need to track user relationships in a social app, something like this:

UserA follows UserC, UserD, and UserE
UserZ follows UserC, UserD, and UserE
UserC follows UserA, UserD, and UserE

And so on.
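To make the shape of the data concrete, here is a minimal Python sketch of the relations listed above. It is only an illustration of the two lookups I care about (who a user follows, and who follows a user); the names `follows` and `build_followers` are made up for this example and are not tied to any database.

```python
from collections import defaultdict

# Who each user follows, copied from the example above.
follows = {
    "UserA": {"UserC", "UserD", "UserE"},
    "UserZ": {"UserC", "UserD", "UserE"},
    "UserC": {"UserA", "UserD", "UserE"},
}

def build_followers(follows):
    """Invert the 'follows' mapping into a 'followed by' mapping."""
    followers = defaultdict(set)
    for user, targets in follows.items():
        for target in targets:
            followers[target].add(user)
    return dict(followers)

followers = build_followers(follows)
print(sorted(followers["UserD"]))  # ['UserA', 'UserC', 'UserZ']
```

Whatever store I pick has to answer both directions of this lookup cheaply.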

First, I need a partition-tolerant database, so MySQL and its relatives are out of the running.

I looked at CouchDB, but it creates a new revision for each change, so if your doc looks like this:

{
  uuid: uuid,
  name: name,
  lastName: lastName,
  follows: [ uuid1, uuid2, uuid3 ]
}

you will also have these earlier revisions in the database:

(rev 1)
{
  uuid: uuid,
  name: name,
  lastName: lastName,
  follows: [ uuid1, uuid2 ]
}
(rev 2)
{
  uuid: uuid,
  name: name,
  lastName: lastName,
  follows: [ uuid1 ]
}

That takes a lot of space. I know you can free it with some manual action, but the problem doesn't disappear.

I looked at Cassandra, and so far it looks like a good solution: it allows insertions without the extra-space issue CouchDB has. I can create a keyspace, then a column, and then store the relations in it, something like this:

keyspace:{
  column:{
    ...
    uuidT:{ uuidA: timestamp, uuidB: timestamp, uuidZ: timestamp }
    uuidF:{ uuidA: timestamp, uuidB: timestamp, uuidZ: timestamp }
    uuidH:{ uuidA: timestamp, uuidB: timestamp, uuidZ: timestamp }
    ...
  }
}
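As a rough sketch of how I picture that layout behaving, here is a plain-Python stand-in (a dict of dicts, not a Cassandra client). I am assuming the row key is the user and each column name is a user they follow, with the timestamp as the value; the helper names `follow` and `unfollow` are hypothetical.

```python
import time

# Hypothetical in-memory stand-in for the layout above:
# row key = user's uuid, column name = followed user's uuid, value = timestamp.
relations = {}

def follow(relations, user, target, now=None):
    """Add one column to the user's row; no other data is rewritten."""
    row = relations.setdefault(user, {})
    row[target] = time.time() if now is None else now

def unfollow(relations, user, target):
    """Drop a single column from the user's row, if it exists."""
    relations.get(user, {}).pop(target, None)

follow(relations, "uuidT", "uuidA", now=1)
follow(relations, "uuidT", "uuidB", now=2)
unfollow(relations, "uuidT", "uuidA")
print(relations)  # {'uuidT': {'uuidB': 2}}
```

The point is that adding or removing one relation touches a single column instead of rewriting the whole document.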

But I'm wondering whether a graph database would be a better fit for this.

Edit:

After searching for an answer, I found this page, which helps you choose a DB: http://nosql.findthebest.com/

Delta

2 Answers

CouchDB is meant as an offline DB.

I would suggest looking into a graph DB; Neo4j comes to mind. I was introduced to it at Mozilla Labs in Toronto a few weeks ago, and the people there told me it was the least painful graph DB to get up and running (you can apt-get/brew it). You can make arbitrary relationships in it; however, it doesn't partition. If you want a DB that you can rely on and you want arbitrary relationships, maybe Titan is worth looking at.

PPPaul

FWIW, in CouchDB I always use arrays of objects rather than just arrays of IDs, e.g.:

{
  uuid: uuid,
  name: name,
  lastName: lastName,
  follows: [ { _id: uuid1 }, { _id: uuid2 }, { _id: uuid3 } ]
}

This is for two reasons:

  1. It lets you easily add other join data to that object if you want, e.g. { _id: uuid1, followed_on: "2011-10-22" }
  2. It fits in nicely with the include_docs=true option for grabbing the associated doc in your view queries.
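To illustrate the second point: when a view row's value is an object containing an _id, include_docs=true makes CouchDB return the document with that _id rather than the one that emitted the row. The sketch below only imitates that behaviour in plain Python so you can see the shape of the result; `map_follows` and `query_with_include_docs` are made-up names, and the rows are simplified (real rows also carry an `id` field).

```python
def map_follows(doc):
    """Yield (key, value) pairs, like a CouchDB map function calling emit()."""
    for entry in doc.get("follows", []):
        # Emitting {"_id": ...} as the value is what lets include_docs
        # pull in the followed user's document.
        yield doc["uuid"], {"_id": entry["_id"]}

def query_with_include_docs(docs_by_id, doc):
    """Imitate include_docs=true: attach the linked doc to each row."""
    return [
        {"key": key, "value": value, "doc": docs_by_id.get(value["_id"])}
        for key, value in map_follows(doc)
    ]

docs_by_id = {"uuid1": {"_id": "uuid1", "name": "Ann"}}
user = {"uuid": "uuid0", "follows": [{"_id": "uuid1", "followed_on": "2011-10-22"}]}

for row in query_with_include_docs(docs_by_id, user):
    print(row["key"], row["doc"]["name"])  # uuid0 Ann
```

The extra followed_on field rides along in the stored array without affecting the lookup.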

Update

Hey, check it out: you can limit the number of revisions kept in the DB.
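One way to do this, assuming the per-database `_revs_limit` endpoint is the mechanism meant here, is a PUT with the new limit as the request body. The sketch below only builds the request with Python's standard library and does not send it; the server URL, the database name `social`, and the limit of 10 are placeholders.

```python
import urllib.request

def revs_limit_request(base_url, db_name, limit):
    """Build (but do not send) a PUT /{db}/_revs_limit request."""
    url = f"{base_url}/{db_name}/_revs_limit"
    return urllib.request.Request(
        url,
        data=str(limit).encode("ascii"),  # the body is just the integer
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

# Placeholder server, database name, and limit.
req = revs_limit_request("http://localhost:5984", "social", 10)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would actually send it; you would need admin credentials on a real server.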

smathy