
I am new to Riak and am trying to discover its potential. At the moment I am playing around with the N, R, and W settings to see their impact. What I am missing is a way to check which data is stored on which node/partition.

Is there a way to ask

  • a single node which data it stores?
  • every node which of them stores the specific data I want?

So if I could find out that, say, nodes 2 and 3 store the data I need, requesting that data would fail if I stopped nodes 2 and 3.

stb

1 Answer


I am not aware of any standard way to find out exactly which nodes a particular record is stored on (though I would not rule out that there is one). It is, however, not usually something one needs to be concerned about.

Riak spreads records across physical nodes using consistent hashing, in a way that is transparent to users, and redistributes data across the cluster (in order to maintain n_val copies) if nodes are added or removed in an orderly fashion.
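To make the idea concrete, here is a minimal sketch of how a key could be mapped to consecutive ring partitions and from there to nodes. It is only an illustration of the general technique, not Riak's actual implementation: the function names, the way bucket and key are joined before hashing, the partition count of 64, and the round-robin partition-to-node assignment are all assumptions, so the results will not match what a real cluster does.

```python
import hashlib

RING_SIZE = 2 ** 160  # size of the SHA-1 output space used as the ring


def preference_list(bucket, key, num_partitions=64, n_val=3):
    """Return the n_val consecutive partition indices for a bucket/key pair.

    Hypothetical helper: the real hashing input format is not reproduced here.
    """
    digest = hashlib.sha1(f"{bucket}/{key}".encode("utf-8")).digest()
    position = int.from_bytes(digest, "big")
    partition_size = RING_SIZE // num_partitions
    first = position // partition_size
    # Replicas go to the partition the key hashes into and the ones after it,
    # wrapping around at the end of the ring.
    return [(first + i) % num_partitions for i in range(n_val)]


def owners(partitions, nodes):
    """Map partitions to nodes, assuming a simple round-robin assignment."""
    return [nodes[p % len(nodes)] for p in partitions]


if __name__ == "__main__":
    parts = preference_list("users", "stb")
    print(parts, owners(parts, ["node1", "node2", "node3"]))
```

Under these assumptions, a small cluster can end up with two of the three partitions for a key on the same physical node, for example where the ring wraps around.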

If a node goes down unexpectedly, changes to data held on this node will be tracked by other nodes during the downtime and passed on to the node once it comes back up.

This means that as long as you define N, R and W according to the consistency requirements you have for the data, Riak will work to honor them even if nodes are added or removed.
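As a rough illustration of how N, R and W relate, the sketch below uses a simplified strict-quorum model. It deliberately ignores the way other nodes temporarily stand in for a failed node, so it is a reasoning aid rather than a description of how Riak decides whether a request succeeds. The function names are hypothetical.

```python
def quorums_overlap(n_val, r, w):
    """True if every read quorum shares at least one replica with every write quorum."""
    return r + w > n_val


def request_succeeds(replicas_responding, required):
    """Simplified model: a request succeeds once enough replicas have answered."""
    return replicas_responding >= required


if __name__ == "__main__":
    # N=3 with R=2 and W=2: reads and writes always share a replica.
    print(quorums_overlap(3, 2, 2))
    # One of three replicas unreachable, R=2: two answers are still enough.
    print(request_succeeds(2, 2))
```

In this model, lowering R or W makes requests tolerate more unavailable replicas, at the cost of the overlap between read and write quorums.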

In this way I believe Riak differs to some extent from databases that rely on sharding, where data is more closely tied to specific nodes and is not necessarily redistributed automatically to other nodes/shards.

Christian Dahlqvist
  • Thanks, in the end it is really not that important where Riak stores it. But it would be pretty strange if we couldn't find it out, wouldn't it? Regarding your next-to-last sentence: if I have 3 nodes and `N=3` and I stop one node, would I still be able to write? – stb Nov 14 '12 at 15:25
  • If you have 3 nodes in the cluster and remove one node in a controlled way through riak-admin, Riak will move the data on this node to the others, and there will effectively be 2 copies on a single physical node. If one of the 3 nodes instead unexpectedly goes down, the other two nodes will temporarily keep track of changes and send this to the failed node once it comes up. In both these cases you are able to continue reading and writing (possibly depending on R and W values). – Christian Dahlqvist Nov 14 '12 at 15:39
  • Just as an FYI we recommend 5 nodes as a minimum (given the default N=3) to guarantee that one node doesn't have more than one copy of a given piece of data. And no, there's no way to find out what data is stored on any given node. – Brian Roach Nov 14 '12 at 18:11
  • Thanks for the help. I'll go on without worrying about where Riak stores its stuff (even if that feels hard). – stb Nov 16 '12 at 20:50