
I am looking for a strategy to implement a shared RethinkDB cluster between customers with data isolation.

I would like multiple customers to use a shared RethinkDB cluster, but I'm not sure how to enforce the separation of their data. The customers' requests are not to be trusted, but each customer should have full access to their own data through the RethinkDB API.

It looks like a classic case for multi-tenant databases (I'm not sure), but I couldn't find enough information about how to do that with RethinkDB anyway.

Another idea: maybe I could write a wrapper that prefixes every table name with a customerID, but would customers be able to bypass that? Is there a way to do something like r.db('customerID') that cannot be changed in the rest of the query? Does that depend on a particular driver?

How do I isolate the customers in RethinkDB?

Note: I'm planning to use https://github.com/apa512/clj-rethinkdb in case it matters, but I'd accept any answer using JavaScript as well.

nha

1 Answer

I think this is an open-ended question. RethinkDB authentication only requires a single auth key. To my knowledge, that key gives access to all of the data on the server, even across databases.

So, in your case, I think it's best to run multiple Docker containers to isolate the data. Each customer gets their own address to connect to, their own auth key, and of course their own data.

maybe I could also make a wrapper and prefix every table with a customerID, but would they be able to bypass that

Nothing prevents that; the auth key is shared across the entire cluster.

Is there a way to do something like r.db('customerID') that cannot be changed in the rest of the query

Not sure, but I don't think that's possible at this moment.

It looks like using a separate RethinkDB instance per customer, with a solution like Docker, is the way to go.

When you create a new customer, you start a RethinkDB Docker container and map the RethinkDB ports onto random available ports on the host. You then give that host:port URI string to the customer.
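The provisioning step above could be sketched roughly as follows. This is a minimal illustration, not a tested deployment script: the function names, the `rethinkdb-<customer>` container naming scheme, and the host name in the usage note are all hypothetical. It assumes the official `rethinkdb` Docker image and RethinkDB's default client driver port, 28015. The sketch only builds the `docker run` command and the URI; it does not start anything.

```python
import re
import socket

RETHINKDB_IMAGE = "rethinkdb"   # official image name on Docker Hub
CLIENT_DRIVER_PORT = 28015      # RethinkDB's default client driver port


def find_free_port():
    """Ask the OS for a TCP port that is unused right now.

    Note: the port could be taken again before Docker binds it,
    so a real script should retry on failure.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("", 0))  # port 0 lets the OS choose
        return s.getsockname()[1]


def docker_run_command(customer_id, host_port):
    """Build (but do not run) the docker command for one customer."""
    # Hypothetical rule: keep customer IDs safe to use in a container name.
    if not re.fullmatch(r"[A-Za-z0-9][A-Za-z0-9_.-]*", customer_id):
        raise ValueError("unsafe customer id: %r" % customer_id)
    return [
        "docker", "run", "-d",
        "--name", "rethinkdb-%s" % customer_id,
        "-p", "%d:%d" % (host_port, CLIENT_DRIVER_PORT),
        RETHINKDB_IMAGE,
    ]


def provision_plan(customer_id, host):
    """Return the command to run and the host:port string to hand out."""
    port = find_free_port()
    return {
        "command": docker_run_command(customer_id, port),
        "uri": "%s:%d" % (host, port),
    }
```

For example, `provision_plan("acme", "db.example.com")` returns a dict whose `command` could be passed to `subprocess.run` and whose `uri` is what the customer would connect to. Setting a separate auth key for each container is left out here.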

You may want to register an account on compose.io or redistogo.com and try to work out how they do it, since each of their customers has their own data. I think their situation is very similar to yours.

kureikain
  • Thanks for your input. I'm still not sure about multi-tenancy, maybe because I've never done or seen it and I'm not aware of the implications. And you propose Docker because RethinkDB doesn't have this built in, correct? – nha Aug 30 '15 at 10:34
  • IMHO, let's use Docker to isolate the data, unless RethinkDB gets a better permission system that allows more granular permissions, like MySQL's for example. Otherwise, within the same cluster, a user can try to brute-force or guess database and table names. When I looked at MongoLab, they returned an instance that looks like this: – kureikain Aug 30 '15 at 20:16
  • (Sorry, I accidentally posted the comment and cannot edit it.) Let's say that whenever you create a new customer, you launch a new RethinkDB Docker container, fetch its ID, map it to an available port on the host, and pass that to your customer. I looked at RedisToGo.com, a service that gives each customer a separate Redis instance. When I launched a new instance, I got back a URL like this: sole.redistogo.com:9747, with a different port each time. I think what you want to do is similar to RedisToGo. Also, RethinkDB authentication is inspired by Redis: https://github.com/rethinkdb/rethinkdb/issues/266 – kureikain Aug 30 '15 at 20:30
  • You made very good points, thank you. That basically shifts the problem from pure development to devops, I guess. It's been a long time since I played with Docker, and the orchestration has been moving fast. But that's for another question then :) – nha Aug 30 '15 at 20:41