9

I need to set up data storage that can hold petabytes of files. Most of them are small JSON, image, and CSV files, but some are binary files of around 100 MB.

I am looking into distributed storage that is masterless and has no single point of failure.

I found Riak and GlusterFS.

Has anyone here used both of them?

I know that their interfaces (DB/Map) are very different, but it seems to me that both rely on hashing and similar distributed techniques. Will they have similar performance, consistency, and availability?
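For readers unfamiliar with the hashing idea mentioned here, it can be illustrated with a toy consistent-hash ring. This is only a sketch of the general technique, not the actual placement logic of Riak or GlusterFS; the node names, the virtual-node count, and the replica count of 3 are all illustrative assumptions.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to an integer position on the ring using SHA-1."""
    return int(hashlib.sha1(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy consistent-hash ring (illustrative only, no real system's API)."""

    def __init__(self, nodes, vnodes=64):
        # Each node claims several points ("virtual nodes") on the ring
        # so that keys spread more evenly across nodes.
        self._points = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._positions = [position for position, _ in self._points]

    def owners(self, key: str, replicas: int = 3):
        """Walk clockwise from the key's position and collect distinct nodes."""
        start = bisect.bisect(self._positions, _hash(key))
        found = []
        for offset in range(len(self._points)):
            node = self._points[(start + offset) % len(self._points)][1]
            if node not in found:
                found.append(node)
            if len(found) == replicas:
                break
        return found


# Hypothetical node names; any client can compute the same answer
# without asking a central master.
ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
print(ring.owners("images/cat.png"))
```

Because placement is a pure function of the key and the node list, no master has to be consulted to locate data. That is the property the two systems have in common, even though their performance and consistency behaviour can still differ.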

Eric Fong
  • I have no experience with Riak, but GlusterFS performance will probably leave you wanting more. GlusterFS needs RDMA links between all servers and clients for high performance, because it performs a lot of synchronized operations. Performance aside, GlusterFS is a pretty good system. It is especially slow at listing directories that contain many files. – Mikko Rantalainen Sep 20 '19 at 07:19

3 Answers

4

We are running a 17-node Riak cluster (24 GB RAM and 2 TB disk per node) with the Bitcask backend, storing around 1 billion 3 KB objects. This setup performs well but is very resource intensive. We are considering moving from Riak to GlusterFS because performance is not that important for us. Using LevelDB as the backend might also ease our concerns.

At the moment, Riak's self-healing properties seem stronger and its configuration seems a bit easier. In your case, I'd be more comfortable storing the 100 MB files on GlusterFS.
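As a rough sanity check on the figures in the paragraph above, the arithmetic can be written out. The replica count of 3 and the use of decimal units are assumptions, not figures from this cluster, and the RAM figure is only an upper bound obtained by pretending all memory is available for the key index (Bitcask is documented as keeping its key index in memory, which is why RAM matters here).

```python
# Figures taken from the paragraph above.
NODES = 17
DISK_PER_NODE_TB = 2
RAM_PER_NODE_GB = 24
OBJECTS = 1_000_000_000
OBJECT_SIZE_KB = 3

# Assumptions (not stated in the answer): 3 replicas per object,
# decimal units (1 TB = 10**9 KB, 1 GB = 10**9 bytes).
REPLICAS = 3

raw_tb = OBJECTS * OBJECT_SIZE_KB / 10**9   # payload without replication
replicated_tb = raw_tb * REPLICAS           # payload with replicas
cluster_disk_tb = NODES * DISK_PER_NODE_TB  # total raw disk
cluster_ram_gb = NODES * RAM_PER_NODE_GB    # total RAM

# Upper bound on RAM per stored key copy if every byte of RAM were
# spent on the in-memory key index (in practice far less is available).
ram_budget_per_key_copy_bytes = cluster_ram_gb * 10**9 / (OBJECTS * REPLICAS)

print(f"payload: {raw_tb} TB raw, {replicated_tb} TB replicated")
print(f"cluster: {cluster_disk_tb} TB disk, {cluster_ram_gb} GB RAM")
print(f"RAM budget per key copy: {ram_budget_per_key_copy_bytes} bytes")
```

Under these assumptions disk is not the tight resource (about 9 TB of replicated payload on 34 TB of disk), while the memory budget per key is small. This is consistent with the suggestion that a backend such as LevelDB, which does not need every key in RAM, might ease the pressure.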

Reza S
harm
0

The choice depends mostly on your requirements. Generally, I'd recommend Riak if you do not need a real filesystem (with mount points, ACL management, and so on) and are just going to use or serve files programmatically, and GlusterFS otherwise.
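To make the distinction concrete, here is a minimal sketch of the two access styles. The host name, bucket, and mount point are hypothetical. The URL layout follows Riak's HTTP interface (`/buckets/<bucket>/keys/<key>`, default port 8098), but treat the exact port and path as assumptions to check against your own installation. The request is only built here, not sent.

```python
import urllib.parse
import urllib.request
from pathlib import Path


def build_riak_put(host, bucket, key, data, content_type):
    """Build (but do not send) an HTTP PUT that stores one object in Riak."""
    url = (
        f"http://{host}:8098/buckets/{urllib.parse.quote(bucket, safe='')}"
        f"/keys/{urllib.parse.quote(key, safe='')}"
    )
    return urllib.request.Request(
        url, data=data, method="PUT", headers={"Content-Type": content_type}
    )


def write_to_mount(mount_point, relative_path, data):
    """Write a file under a mounted volume (for example a GlusterFS mount)
    using ordinary filesystem calls."""
    target = Path(mount_point) / relative_path
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target


# Hypothetical host and bucket; sending would be urllib.request.urlopen(request).
request = build_riak_put(
    "riak.example.internal", "reports", "2013-01.csv", b"a,b\n", "text/csv"
)
print(request.get_method(), request.full_url)
```

With the key/value style, every consumer goes through an API and there are no directories, permissions, or mount points to manage. With the mounted style, existing tools that expect paths work unchanged.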

Ivan Blinkov
0

Plain open-source Riak would not be the right choice for storing larger files like the 100 MB files you mention.

What you really should use in that case is the newly announced RiakCS (http://basho.com/products/riakcs/) from Basho.

Spyplane