
I am trying to create a Redis-based datastore in which an entity can be fetched by the value of any of several fields. The data would look something like this:

Person<Entity>
 Name
 Address
 Purchases<Another Entity>
 Reviews<list of another Entity>

The same structure will also exist in other entities, as there will be a many-to-many relationship between the different entities.

I am not considering traditional databases as I am looking for scalability and fault tolerance in such example. What I am creating is the following: a Hash of entity id mapped to each entity object, plus Sets containing the associations, say one for Person to Purchases and another for Purchases to Person, and so on, with one Set for each side of a many-to-many relationship.
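A minimal sketch of the layout described above, assuming key names such as `person:<id>` and `person:<id>:purchases` (these names are illustrative, not a convention from any library). To keep it runnable without a server, `TinyStore` is a hypothetical in-memory stand-in that mimics only the Hash and Set commands involved; it is not a real Redis client.

```python
from collections import defaultdict


class TinyStore:
    """In-memory stand-in for the few Redis commands used here (not a real client)."""

    def __init__(self):
        self.hashes = defaultdict(dict)  # key -> field/value mapping, like a Redis Hash
        self.sets = defaultdict(set)     # key -> members, like a Redis Set

    def hset(self, key, mapping):
        self.hashes[key].update(mapping)

    def hgetall(self, key):
        return dict(self.hashes.get(key, {}))

    def sadd(self, key, *members):
        self.sets[key].update(members)

    def smembers(self, key):
        return set(self.sets.get(key, set()))


def link(store, person_id, purchase_id):
    """Record one many-to-many association on both sides."""
    store.sadd(f"person:{person_id}:purchases", purchase_id)
    store.sadd(f"purchase:{purchase_id}:persons", person_id)


store = TinyStore()
store.hset("person:1", {"name": "Alice", "address": "12 Hill St"})
store.hset("purchase:7", {"item": "book"})
link(store, "1", "7")

purchases_of_1 = store.smembers("person:1:purchases")  # forward side of the relation
buyers_of_7 = store.smembers("purchase:7:persons")     # reverse side of the relation
```

Every association costs two writes, one per side, which is the overhead the design trades for being able to traverse the relationship from either entity.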

Since this design involves a lot of overhead, I suspect there is a flaw in keeping the data denormalized. As for choosing an in-memory store over a database, query response time is critical for me. I am looking for suggestions about my design, as I am implementing this example to learn how to handle big data challenges.

Sumit Bisht

1 Answer


I am looking for suggestions about my design as I am implementing this example to learn how to handle bigdata challenges.

On what basis do you believe your challenges are Big Data? How much data are we talking about? Ask yourself that question first, before discounting relational databases, which may well meet your needs.

I am not considering traditional databases as I am looking for scalability and fault tolerance in such example.

Redis and relational databases have the same scalability issue: neither scales well horizontally unless you implement or adopt a custom sharding technique. Redis Cluster is meant to address this, but at the time of writing it is a work in progress and not yet production ready. In the meantime you can use twemproxy, a proxy developed by Twitter that distributes keys across a cluster of Redis servers.
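To illustrate what "distributing keys across servers" means, here is a minimal sketch of hash-based sharding, assuming three made-up server addresses and a plain CRC32-modulo scheme. It is not twemproxy's implementation, only the simplest form of the idea.

```python
import zlib

# Hypothetical server addresses, for illustration only.
SERVERS = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]


def server_for(key, servers=SERVERS):
    """Pick a server by hashing the key; the same key always maps to the same server."""
    digest = zlib.crc32(key.encode("utf-8"))
    return servers[digest % len(servers)]


# Each key is routed independently, so related keys may land on different servers.
placement = {key: server_for(key) for key in ("person:1", "person:2", "purchase:7")}
```

A plain modulo scheme remaps most keys whenever the number of servers changes; consistent hashing is the usual way to limit that, at the cost of a more involved lookup.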

I am trying to create a redis based datastore with multiple fields that can be used to fetch the entity based on its value.

Redis is not designed to query based on values, period; read up on this and this to better understand why.
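In practice this means any lookup by value has to go through an index that you maintain yourself. A minimal sketch of that pattern follows, using plain Python dicts and sets as stand-ins for Redis Hashes and Sets; the function names and the choice to index on `name` are assumptions made for illustration.

```python
from collections import defaultdict

people = {}                    # stands in for Hashes keyed by entity id
name_index = defaultdict(set)  # stands in for Sets keyed by an indexed value


def save_person(person_id, name, address):
    """Store the entity and update the value-to-id index in the same step."""
    old = people.get(person_id)
    if old is not None:
        name_index[old["name"]].discard(person_id)  # drop the stale index entry
    people[person_id] = {"name": name, "address": address}
    name_index[name].add(person_id)


def find_by_name(name):
    """Look up ids through the index, then fetch each entity by key."""
    return [people[pid] for pid in sorted(name_index.get(name, set()))]


save_person("1", "Alice", "12 Hill St")
save_person("2", "Bob", "3 Lake Rd")
save_person("1", "Alicia", "12 Hill St")  # rename: the index entry must follow
```

The store never searches values; every write has to keep the index in step with the data, and every additional searchable field needs its own index.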

raffian
  • Thanks @raffian for clarifying things and for mentioning twemproxy. The initial data I had in mind was around 10 million records, which could easily grow. I was looking for the fastest lookup possible while maintaining a custom solution. – Sumit Bisht Jul 28 '13 at 13:17