
I have a Ceph cluster with 500 TB of capacity, and I want to create a cache tier for it.

I have a 20 TB SSD. Is 20 TB sufficient to cache 500 TB? What is the best way to calculate the required size?

0xF2
yasin lachini

2 Answers


Also please consider the following:

It is important to understand the type of workload (I/O) you will receive. Usually, when many customers share the cluster, the cache fills up quickly and does not help as much.

  • What is your suggestion for building the best cluster? – yasin lachini Jun 11 '20 at 06:26
  • Well, in my opinion (and some people will throw bricks at me), I would split the cluster as follows: 1. SSD disks and SATA disks should be on separate hardware and in separate Ceph clusters: a Ceph SSD cluster and a Ceph SATA cluster. 2. Use a separate RBD pool for each type of client (RBD1 for database, RBD2 for pre-prod database, RBD3 for other customers), if you use RBD; this lets you set up replication for each pool as you like. 3. Replication is best done at rack level. You can replicate at server level, but I think rack is best (assuming the servers are in the same DC). – Bogdan Adrian Velica Jun 11 '20 at 21:34

Traditionally, we recommend one SSD cache drive for every 5 to 7 HDDs. Today, however, SSDs are typically not used as a cache tier; they cache at the BlueStore layer, as a WAL device.

Depending on the use case, the capacity of the BlueStore block.db can be 4% of the total capacity (block, CephFS) or less (object store).
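To make the two rules of thumb above concrete, here is a minimal sizing sketch. It only encodes the figures given in this answer (one SSD per 5 to 7 HDDs, block.db at up to 4% of capacity). The function names and the 50-HDD example are hypothetical, and the question does not say whether 500 TB is raw or usable capacity, so treat the numbers as rough estimates.

```python
import math
from typing import Tuple


def block_db_capacity_tb(total_capacity_tb: float, db_fraction: float = 0.04) -> float:
    """Estimate block.db capacity in TB as a fraction of total capacity.

    db_fraction defaults to the 4% figure quoted in this answer; use a
    smaller value if your use case needs less.
    """
    return total_capacity_tb * db_fraction


def ssd_count_range(hdd_count: int) -> Tuple[int, int]:
    """Return (fewest, most) SSDs under the one-SSD-per-5-to-7-HDDs rule."""
    fewest = math.ceil(hdd_count / 7)  # one SSD serving 7 HDDs
    most = math.ceil(hdd_count / 5)    # one SSD serving 5 HDDs
    return fewest, most


if __name__ == "__main__":
    # 500 TB is the capacity stated in the question.
    print(f"block.db at 4%: {block_db_capacity_tb(500):.1f} TB")

    # Hypothetical layout: 50 HDDs (the question does not give a drive count).
    fewest, most = ssd_count_range(50)
    print(f"SSDs for 50 HDDs: {fewest} to {most}")
```

At the 4% figure, 500 TB works out to about 20 TB of block.db, which is the amount of SSD mentioned in the question.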

Especially for a small Ceph cluster (less than 1 PB of raw storage), it makes sense to consider all-SSD storage, particularly for the block use case, since SSD and HDD prices are not that far apart for medium-capacity drives.

0xF2
  • Thanks. I created a data.cache pool and use it as a cache tier for the data pool of my object storage. Is this good? Do you mean that I should use my SSD for the journal? Which one has better performance: a journal or a cache tier pool? I thought a cache tier had better performance, but my impression from your answer is that the journal is the better use. – yasin lachini Apr 20 '20 at 21:09
  • Can you please point out a reference for your claim? Thank you. – itsafire Apr 23 '20 at 12:34
  • @itsafire https://docs.ceph.com/docs/nautilus/rados/configuration/bluestore-config-ref/ – 0xF2 Apr 25 '20 at 05:55
  • @yasinlachini the WAL device is the best use of SSD caches. – 0xF2 Apr 25 '20 at 06:00
  • https://ceph-users.ceph.narkive.com/roHS8cpX/ssds-for-journals-vs-ssds-for-a-cache-tier-which-is-better – yasin lachini Apr 25 '20 at 16:43
  • @itsafire It says: "However for reads (of hot data) I would expect a SSD Cache Tier to be faster/better. That's because, in the case of having journals on SSDs, even if data is in the journal, it's always read from the (slow) backing disk anyway, right? But with a SSD cache tier, if the data is hot, it would be read from the (fast) SSD." – yasin lachini Apr 25 '20 at 16:45
  • From @yasinlachini's link: cache tiers (currently) only work well if all your hot data fits into them, in which case you'd be even better off with a dedicated SSD pool for that data. So (1) set up BlueStore with WAL cache devices on SSD first, and (2) if you still have SSD capacity left, create a dedicated SSD pool. – 0xF2 Apr 26 '20 at 03:25