
We've got a cluster of ScyllaDB hosts (it's a Cassandra-type database) running on i3 instances in Amazon, with the /var/lib/scylla/ directory mounted on a single NVMe drive. I'm wondering whether there is any I/O performance gain to be expected from replacing this single drive with two (or more) NVMe drives configured as RAID 0. In other words, would striping give us a noticeable performance boost on this type of drive?
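For concreteness, the kind of configuration I have in mind is a plain mdadm stripe under the Scylla data directory. This is only a sketch, assuming two ephemeral NVMe devices at /dev/nvme0n1 and /dev/nvme1n1 (the device names are illustrative and will vary per instance type):

    # Sketch only: build a two-drive RAID 0 (striped) array and mount it as the Scylla data dir.
    # Assumes /dev/nvme0n1 and /dev/nvme1n1 are the ephemeral NVMe devices and are safe to wipe.
    mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
    mkfs.xfs /dev/md0                # XFS is the filesystem Scylla recommends
    mount /dev/md0 /var/lib/scylla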

Michael Martinez
  • Just to throw in a second option: since it is typically hard to get enough parallel commands onto NVMe drives (and striping won't help here), another option is to run two nodes on one machine, each writing to its own disk. I don't know about Scylla, but it does help with Ceph. – eckes Apr 26 '17 at 02:27
  • Thanks for that suggestion, I'll run it by the team. We are currently CPU-bound, so we wouldn't be able to do that until we mitigate that bottleneck (Scylla binds a thread to each CPU and dedicates specific keys to each core). – Michael Martinez Apr 26 '17 at 13:27
  • I found a comparison of NVMe RAID 0 versus single disks on a gaming PC: http://www.eteknix.com/year-nvme-raid-0-real-world-setup/6/. Summary: RAID 0 is noticeably better for sequential r/w; worse to somewhat better for random r/w (depending on which benchmark tool was used); and substantially worse for read access times. – Michael Martinez Apr 26 '17 at 13:59
  • @MichaelMartinez did you test it yourself? – Horaciux Sep 07 '18 at 20:35
  • @Horaciux Yes I did. I tested RAID 0 and RAID 5 configurations. RAID 0 was noticeably faster. RAID 5 did not show any performance improvement, and suffered drastic performance deterioration while the array was rebuilding, so we decided RAID 5 was not suitable for our production environment. RAID 0 would be suitable to replace an ephemeral NVMe drive, but unsuitable for replacing an NVMe-based EBS volume (you would lose all your data if a single drive failed). – Michael Martinez Sep 09 '18 at 09:50

2 Answers


Maybe. You'll want to benchmark it yourself to find out.

If there are no other bottlenecks, then more disks means more IOPS. But that's a big if. Only by testing it yourself will you find out what the next bottleneck is in that configuration.
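As a starting point, a short fio run against the mounted data directory gives you directly comparable numbers for the single-drive and striped layouts. A minimal sketch, assuming fio is installed and the target is mounted at /var/lib/scylla; the 4k random 70/30 mix and queue depths are illustrative, not tuned for Scylla:

    # Sketch: 4k random read/write mix with direct I/O, run identically against each layout.
    fio --name=randrw --directory=/var/lib/scylla --rw=randrw --rwmixread=70 \
        --bs=4k --ioengine=libaio --iodepth=32 --numjobs=4 --direct=1 \
        --size=4G --runtime=120 --time_based --group_reporting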

I don't really see the point of RAID 0. Zero is the amount of data you will recover from that pair if one of the drives fails. For scale-out databases, you might as well add another instance in a simpler single-drive configuration.

John Mahowald
  • We aren't concerned about data recovery from a single host; we can lose a host without losing data from the cluster. The point of RAID 0 is faster writes, because they can be parallelized. That was the expected gain in the old days, when your disk had a spindle that wasted time seeking. I don't know whether that still holds with NVMe drives. – Michael Martinez Apr 26 '17 at 13:33

Even without spindles, multiple drives responsible for different data stripes can increase your performance, and, Scylla being what it is (an insanely fast database), you will want to squeeze every bit of performance you can out of those instances.
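If you do stripe, it's worth a quick sanity check that both devices are actually active members and what the chunk (stripe) size is, since that is what spreads I/O across the drives. A small sketch, assuming an mdadm array at /dev/md0 (the device name is an assumption):

    # Sketch: confirm the array is RAID 0 with both NVMe members active, and note the chunk size.
    cat /proc/mdstat
    mdadm --detail /dev/md0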

dyasny
  • Yes, see my own comment (the final one) on my question. Faster, but unsuitable for EBS-backed NVMe. – Michael Martinez Nov 20 '19 at 15:07
  • EBS is a generally bad idea for fast databases. i3 and the XXd instances are the better choice. The downside, of course, is that those fast NVMes are ephemeral, but with a replicated DB this shouldn't be too much of a concern. Are you still running Scylla? – dyasny Nov 20 '19 at 15:45
  • We had a 12-node Scylla cluster. I don't recall whether I ended up with ephemeral NVMe or EBS-backed NVMe on those, but we also had other clustered stuff running on EBS-backed NVMe and didn't have any performance issues. I did some benchmark comparisons between the two types of storage, but I don't recall the results. – Michael Martinez Nov 21 '19 at 01:06
  • EBS is not horrible, just not comparable to the ephemeral NVMes – dyasny Nov 21 '19 at 19:19
  • Correct. – Michael Martinez Nov 26 '19 at 14:40