
I am evaluating Riak KV 2.1.1 on a local desktop using the Java client and a slightly customised version of the sample code.

My concern is that it appears to take almost 920 bytes per KV.

That seems too steep. The data dir was 93 MB for 100k KVs and kept increasing linearly thereafter for every additional 100k store ops. Is that expected?

        // Riak Java client 2.x imports (omitted from the original snippet)
        import com.basho.riak.client.api.RiakClient;
        import com.basho.riak.client.api.commands.kv.StoreValue;
        import com.basho.riak.client.core.RiakCluster;
        import com.basho.riak.client.core.query.Location;
        import com.basho.riak.client.core.query.Namespace;
        import com.basho.riak.client.core.query.RiakObject;
        import com.basho.riak.client.core.util.BinaryValue;

        RiakCluster cluster = setUpCluster();
        RiakClient client = new RiakClient(cluster);
        System.out.println("Client object successfully created");
        Namespace quotesBucket = new Namespace("quotes2");
        long start = System.currentTimeMillis();
        for (int i = 0; i < 100000; i++) {
            RiakObject quoteObject = new RiakObject().setContentType("text/plain").setValue(BinaryValue.create("You're dangerous, Maverick"));
            Location quoteObjectLocation = new Location(quotesBucket, "Ice" + i);
            StoreValue storeOp = new StoreValue.Builder(quoteObject).withLocation(quoteObjectLocation).build();
            StoreValue.Response storeOpResp = client.execute(storeOp);
        }
Kedar Parikh
  • Are you testing on a single node and are you using the default n_val of 3? If so you have to take into account that each value is being written 3 times so each value is actually 920/3 bytes. Also consider that there is metadata being written to disk in addition to the value. – Craig Feb 09 '17 at 14:20
  • Yes, a single node with all default configurations. Thanks for pointing out the n_val of 3. So in principle, if I use 3 nodes, each would have 1/3 of the data footprint. Correct? – Kedar Parikh Feb 10 '17 at 09:52
  • If you only have 3 nodes, there is no guarantee that each copy will live on a unique Riak node. The recommendation with KV is at least 5 nodes in a cluster, because there is no mechanism that guarantees the three copies land on distinct nodes. With three nodes, you may find that 2 of the 3 copies of some objects live on the same node. – Craig Feb 10 '17 at 13:40
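A quick back-of-the-envelope check of that arithmetic (plain Java; the 93 MB and 100k figures come from the question, and n_val = 3 is the default replication factor):

```java
public class OverheadEstimate {
    public static void main(String[] args) {
        long dataDirBytes = 93L * 1024 * 1024; // 93 MB observed on disk
        long keys = 100_000;                   // number of KVs stored
        int nVal = 3;                          // default replication factor

        long bytesPerKey = dataDirBytes / keys;    // all replicas combined
        long bytesPerReplica = bytesPerKey / nVal; // single copy + its metadata

        System.out.println("bytes per key (all replicas): " + bytesPerKey);
        System.out.println("bytes per replica:            " + bytesPerReplica);
    }
}
```

So roughly 975 bytes per key on disk, but only about a third of that per stored copy.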

1 Answer


There was a thread on the riak-users mailing list a while back that discussed the overhead of the Riak object, estimating it at ~400 bytes per object. However, that estimate predates the current object format, so it is outdated. Here is a fresh look.

First, we need a local client:

(node1@127.0.0.1)1> {ok,C}=riak:local_client().
{ok,{riak_client,['node1@127.0.0.1',undefined]}}

Next, create a new Riak object with a 0-byte value:

(node1@127.0.0.1)2> Obj = riak_object:new(<<"size">>,<<"key">>,<<>>).
#r_object{bucket = <<"size">>,key = <<"key">>,
          contents = [#r_content{metadata = {dict,0,16,16,8,80,48,
                                                  {[],[],[],[],[],[],[],[],[],[],[],[],[],[],...},
                                                  {{[],[],[],[],[],[],[],[],[],[],[],[],...}}},
                                 value = <<>>}],
          vclock = [],
          updatemetadata = {dict,1,16,16,8,80,48,
                                 {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],...},
                                 {{[],[],[],[],[],[],[],[],[],[],[],[],[],...}}},
          updatevalue = undefined}

The object is actually stored in a reduced binary format:

(node1@127.0.0.1)3> byte_size(riak_object:to_binary(v1,Obj)).
36

That is 36 bytes of overhead for just the object, but it doesn't include metadata like the last-updated time or the version vector, so store it in Riak and check again:

(node1@127.0.0.1)4> C:put(Obj).
ok
(node1@127.0.0.1)5> {ok,Obj1} = C:get(<<"size">>,<<"key">>).
{ok, #r_object{bucket = <<"size">>,key = <<"key">>,
          contents = [#r_content{metadata = {dict,3,16,16,8,80,48,
                                                  {[],[],[],[],[],[],[],[],[],[],[],[],[],[],...},
                                                  {{[],[],[],[],[],[],[],[],[],[],[[...]],[...],...}}},
                                 value = <<>>}],
          vclock = [{<<204,153,66,25,119,94,124,200,0,0,156,65>>,
                     {3,63654324108}}],
          updatemetadata = {dict,1,16,16,8,80,48,
                                 {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],...},
                                 {{[],[],[],[],[],[],[],[],[],[],[],[],[],...}}},
          updatevalue = undefined}}
(node1@127.0.0.1)6> byte_size(riak_object:to_binary(v1,Obj1)).
110

Now it is 110 bytes overhead for an empty object with a single entry in the version vector. If a subsequent put of the object is coordinated by a different vnode, it will add another entry. I've selected the bucket and key names so that the local node is not a member of the preflist, so the second put has a fair probability of being coordinated by a different node.

(node1@127.0.0.1)7> C:put(Obj1).
ok
(node1@127.0.0.1)8> {ok,Obj2} = C:get(<<"size">>,<<"key">>).
{ok, #r_object{bucket = <<"size">>,key = <<"key">>,
          contents = [#r_content{metadata = {dict,3,16,16,8,80,48,
                                                  {[],[],[],[],[],[],[],[],[],[],[],[],[],[],...},
                                                  {{[],[],[],[],[],[],[],[],[],[],[[...]],[...],...}}},
                                 value = <<>>}],
          vclock = [{<<204,153,66,25,119,94,124,200,0,0,156,65>>,
                     {3,63654324108}},
                    {<<85,123,36,24,254,22,162,159,0,0,78,33>>,{1,63654324651}}],
          updatemetadata = {dict,1,16,16,8,80,48,
                                 {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],...},
                                 {{[],[],[],[],[],[],[],[],[],[],[],[],[],...}}},
          updatevalue = undefined}}
(node1@127.0.0.1)9> byte_size(riak_object:to_binary(v1,Obj2)).
141

That is another 31 bytes for the additional entry in the version vector.

These numbers don't include storing the actual bucket and key names with the value, or Bitcask storing them again in a hint file, so the actual space on disk would be roughly: 2 × (bucket name size + key name size) + value overhead + file structure overhead + checksum/hash size.
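Plugging the question's own values into that formula gives a rough per-replica sketch, not a definitive number. The 110-byte object overhead is the measurement above; the 30-byte file/checksum figure is a placeholder assumption, since the real Bitcask header and hint-file sizes depend on the backend:

```java
public class DiskEstimate {
    public static void main(String[] args) {
        int bucket = "quotes2".length();                    // bucket name from the question
        int key = "Ice50000".length();                      // a typical key from the loop
        int value = "You're dangerous, Maverick".length();  // 26-byte value
        int objectOverhead = 110;  // empty object + 1 vclock entry, measured above
        int fileOverhead = 30;     // ASSUMPTION: placeholder for file structure + checksum

        int perReplica = 2 * (bucket + key) + value + objectOverhead + fileOverhead;
        int perKey = perReplica * 3;  // default n_val = 3

        System.out.println("estimated bytes per replica: " + perReplica);
        System.out.println("estimated bytes per key:     " + perKey);
    }
}
```

This deliberately rough estimate lands below the ~920 bytes per key observed in the question; any remaining gap would come from the real file structure and hint-file overhead that the placeholder only guesses at.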

If you're using Bitcask, there is a calculator in the documentation that will help you estimate disk and memory requirements: http://docs.basho.com/riak/kv/2.2.0/setup/planning/bitcask-capacity-calc/

If you use eLevelDB, you have the option of Snappy compression, which could reduce the size on disk.

Joe