I have to implement a system where a tenant can store multiple key-value stores. one key-value store can have a million records, and there will be multiple columns in one store
[Edited] I have to store tabular data (list with multiple columns) like Excel where column headers will be unique and have no defined schema. This will be a kind of static data (eventually updated). We will provide a UI to handle those updates. Every tenant would like to store multiple table structured data which they have to refer it in different applications and the contract will be JSON only.
For Example, an Organization/Tenant wants to store their Employees List/ Country-State List, and there are some custom lists that are customized for the product and this data is in millions.
A simple solution is to use SQL but here schema is not defined, this is a user-defined schema, and though I have handled this in SQL, there are some performance issues, so I want to choose a NoSQL DB that suits better for this requirement.
Design Constraints:
- Get API latency should be minimum.
- We can simply assume the Pareto rule, 80:20 80% read calls and 20% write so it is a read-heavy application
- Users can update one of the records/one columns
- Users can do queries based on some column value, we need to implement indexes on multiple columns.
- It's schema-less so we can simply assume it is NoSql, SQL also supports JSON but it is very hard to update a single row, and we can not define indexes on dynamic columns.
- I want to segregate key-values stores per tenant, no list will be shared between tenants.
One Key Value Store :
Another key value store example: https://datahub.io/core/country-list
I am thinking of Cassandra or any wide-column database, we can also think of a document database (Mongo DB), every collection can be a key-value store or Amazon Dynamo database
Cassandra: allows you to partition data by partition key and in my use case I may want to get data by different columns in Cassandra we have to query all partitions which will be expensive.