I am setting up a Ceph cluster. The client is asking for it to be done in virtual machines, one hypervisor/VM per server. Given my previous (minor) experience with virtual machines, I wonder if this will be a problem (hypervisors abstracting hardware and using their built-in drivers, etc.). Are my concerns justified? Will there likely be performance penalties when putting a hypervisor between the hardware and a single virtual machine running an I/O-intensive (disk and disk controller) and network-intensive application such as Ceph?
-
Please provide reasons for the downvotes - perhaps I can phrase the question better. – Turtle V. Rabbit Nov 17 '16 at 21:12
-
Because it's all opinion-based. But I'll attempt to answer it because I've done a lot with Ceph before. – hookenz Nov 17 '16 at 21:14
-
@Matt - forgive my ignorance - is it opinion-based whether running Ceph (or a similar product) in a VM would affect its performance? It seems like that's something that could be quantified. – Turtle V. Rabbit Nov 17 '16 at 21:15
-
It's something that could be quantified with data, as is the intrinsic nature of quantification. Until then, you're making wild assumptions. Whether this works depends on your hypervisor, network configuration, hardware configuration, and external workload. The thing to do would be to test this solution in a staging environment to see if it will work for you. In general, there's nothing inherently wrong with virtualizing this workload, but it depends on what you want out of it and how you're going to use it within your infrastructure. You need data to make these decisions that we don't have. – Spooler Nov 17 '16 at 21:20
-
@SmallLoanOf1M Thank you. So it can no longer be said (as we used to say ~15 years ago) that putting an I/O-intensive system in a VM, above a hypervisor, kills its performance in general? – Turtle V. Rabbit Nov 17 '16 at 21:21
-
Hypervisor I/O has come a long way since then. Use paravirtualized disks/controllers whenever you can, as emulated controllers are typically slow. Also think about why you're virtualizing this workload and what benefits you're trying to get out of virtualizing it. Live migration is probably not going to happen reasonably, as you would be storing data with this VM. However, it would make it a lot easier to deploy quickly and automatically if it were virtualized. You'll have to weigh the benefits against the data you see in testing. – Spooler Nov 17 '16 at 21:25
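(To illustrate the paravirtualized-vs-emulated point above: a minimal libvirt disk stanza might look like the sketch below. The device paths and names are placeholders, not from the thread.)

```xml
<!-- paravirtualized (virtio) disk: the guest talks to a virtio driver
     instead of QEMU emulating a full IDE/SATA controller -->
<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native'/>
  <source dev='/dev/sdb'/>           <!-- placeholder host device -->
  <target dev='vdb' bus='virtio'/>   <!-- bus='ide' would be the slower emulated path -->
</disk>
```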
1 Answer
By the sounds of it... your client doesn't understand Ceph's requirements.
How many Virtual Machine hosts do they have?
Short answer:
Yes, you could, but it's not recommended.
Long answer:
Yes, it will work. But performance could be negatively impacted. And when I say negatively, I mean potentially really negatively.
Please read the Hardware Recommendations.
Details:
Ceph really expects to write its data to dedicated disks controlled by storage nodes (OSDs). Putting another virtualization layer on top of that could really hurt performance, especially when an OSD has to share its disk with other VMs. But if it's for testing among a few team members, why not?
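If you must virtualize the OSDs, the closest you can get to dedicated disks is passing whole block devices through to the VM rather than carving disk images out of a shared datastore. A rough sketch with libvirt (the domain and device names below are placeholders):

```
# hand the entire physical disk /dev/sdc to the VM as a virtio device,
# so the virtualized OSD owns the spindle instead of sharing a datastore
virsh attach-disk ceph-osd-vm /dev/sdc vdc --targetbus virtio --persistent
```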
But for production: PLEASE DO NOT DO IT.
In addition to the OSDs, you need a minimum of 3 monitors. Ideally, these should run on completely separate machines. Because if they are not, they are not resilient to the host losing power or otherwise going down, are they? The monitors form a quorum, and if a majority of them is lost you will lose access to the Ceph cluster.
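For reference, monitor membership is just a few lines of /etc/ceph/ceph.conf. A minimal sketch (the names, addresses, and fsid are placeholders), with each mon on its own physical machine so quorum survives a host failure:

```ini
[global]
fsid = <your-cluster-uuid>               # placeholder
mon initial members = mon1, mon2, mon3   # three mons: quorum survives one failure
mon host = 10.0.0.1, 10.0.0.2, 10.0.0.3  # placeholder addresses, one per physical box
```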
Another reason: Ceph is I/O intensive during reads/writes to the OSDs. The more OSDs you have on the same physical host, the more congested that host's network interface will become, which is why you need to spread your load across many OSDs and many servers.
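Related to spreading the load: if you do stack several OSDs per physical box, make sure CRUSH still treats the host as the failure domain, so one dead server can't hold every replica of an object. A sketch of the relevant ceph.conf knob (this is the default on most releases; shown here as an assumption worth verifying against your version's docs):

```ini
[global]
osd crush chooseleaf type = 1   # 1 = host: place each replica on a different host
```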
Now that said, I have virtualized the monitors and metadata servers on Xen before. But the virtual machine host I had was a beast, I gave Ceph plenty of resources, and I had big network bandwidth available on all of those hosts - in fact, 56Gbit FDR InfiniBand. So I can't really say what it would have been like running constrained; I never saw it that way.
I have also spun it up on Amazon EC2, but again with a higher-spec setup for testing. It ran OK-ish there, but we could see it being hurt by the virtualization. That earlier testing was to see whether we could get faster performance than Amazon's provisioned IOPS for less cost. We didn't go with it, but it was an interesting test.
Regarding paravirtual drivers etc.: yes, they help, but SR-IOV is a better option in my opinion.
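(For the unfamiliar: SR-IOV lets the NIC expose virtual functions that are handed straight to the guest as PCI devices, bypassing the hypervisor's software switch. A rough sketch, assuming an SR-IOV-capable NIC and the IOMMU enabled; the interface name and VF count are placeholders:)

```
# create 4 virtual functions on the physical NIC (placeholder name)
echo 4 > /sys/class/net/enp3s0f0/device/sriov_numvfs

# the VFs now show up as PCI devices that can be assigned to VMs
lspci | grep -i "virtual function"
```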
Summary:
- It's not recommended, even though it works.
- OSDs really shouldn't be virtualized.
It'll be interesting to see if Sage Weil comments on here. If I'm not mistaken, he has done this before.

-
Well, yes, but working SR-IOV network drivers are tricky. We can't get Mellanox CX4 to work inside a VM reliably :((( – BaronSamedi1958 Dec 03 '16 at 08:38
-
I did, under Xen 4.1.2. Perhaps I'll get around to publishing my recipe some time. It wasn't straightforward. – hookenz Dec 18 '16 at 19:25
-
Since I wrote this, I have spent a bit of time working with Ceph under Docker: still accessing the hardware natively, but spinning up the software inside a container. There were many valid reasons to do so. I think the containers now allow you to test Ceph locally with a file-based OSD disk. That arrangement wouldn't be useful for production, but for testing it's very useful. https://github.com/ceph/ceph-container – hookenz Feb 13 '20 at 19:57
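(A rough sketch of that repo's single-node demo mode, for anyone who wants to try the containerized route; the IP and network below are placeholder assumptions - check the repo's README for the current invocation:)

```
docker run -d --net=host \
  -v /etc/ceph:/etc/ceph \
  -v /var/lib/ceph:/var/lib/ceph \
  -e MON_IP=192.168.0.10 \
  -e CEPH_PUBLIC_NETWORK=192.168.0.0/24 \
  ceph/daemon demo
```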
-
This is true! Most of the hardware storage appliances these days (NetApp, Dell EMC & Pure) run virtualized or containerized services instead of old-school bare metal. – BaronSamedi1958 Feb 18 '20 at 12:00
-
@Matt what about a Ceph cluster on VMs, but with dedicated physical OSD devices? – WestFarmer May 16 '20 at 08:22
-
@westfarmer - I don't really see the point of that kind of configuration. But if you can come up with a good reason, try it. – hookenz May 17 '20 at 01:11