Need advice on datastore selection

Question

Requirements

Must Have

Horizontally Scalable.
Fast Sorting on a secondary index.
Atomic update on a group of documents (or simulated atomic update through versioning at table level). It's very important that a group of documents (selected by a filter) is seen by the end user as updated together.
Should be easy to maintain a lot of tables. Each table will store a category of items, and each category has a separate schema.
Should be easy to add a composite index. The filtering criteria can change at any time (queries on filters are not pre-defined). Even better would be a datastore that allows fast filtering on all possible combinations of columns (i.e. comes by default with all possible composite indexes). Filters could be equality or range queries.

Optional

For the atomic update on a group of documents mentioned above, we will usually update only two or three columns. It would be great if the datastore supports partial document updates without the need to reindex the whole document.

Not Required

High Availability
Strong consistency (Eventual consistency works)
High write throughput or low write latency

Query patterns

{
  "item_id": "1234",
  "brand": "adidas",
  "average_price": 123,
  "rate_of_sale": 123,
  "visual information": {
    "img_url": "http://imgsdsd",
    "color": "red"
  }
}

Get all adidas brand items with average_price between 100 and 200, and sort the filtered set by rate_of_sale.
Update rate_of_sale for all items the next day based on a CSV. It should be an atomic update, or it should create a new table, copy the data over with the new rate_of_sale values, delete the older table, and make the application point to the new table.

score 1 · Accepted Answer · answered Oct 01 '18 at 14:37

Since you want horizontal scalability, a transactional store like MySQL doesn't work.

Since you want composite indices, key-value stores like Redis and Aerospike, and extended key-value stores like HBase and Cassandra, can be eliminated.

If you have many composite indices, MongoDB is inefficient.

Elasticsearch or Solr supports all the use cases except atomic bulk update, although that can be solved by using aliases if you are updating the entire index (build a new index, then switch the alias to it).
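As a rough sketch of how this could look in Elasticsearch, the snippet below only builds the JSON request bodies, so it runs without a cluster. It assumes `brand` is mapped as a keyword field, that descending sort order is wanted (the question doesn't say), and the index and alias names (`items`, `items_v1`, `items_v2`) are made up for illustration. The first body would go to the search endpoint of the alias, the second to `POST /_aliases`, where the actions in one request are applied atomically.

```python
import json


def build_search_body(brand, min_price, max_price, sort_field="rate_of_sale"):
    """Filter by brand and price range, then sort by a numeric field."""
    return {
        "query": {
            "bool": {
                # Filter clauses skip relevance scoring.
                "filter": [
                    # Assumes `brand` is a keyword (not analyzed) field.
                    {"term": {"brand": brand}},
                    {"range": {"average_price": {"gte": min_price, "lte": max_price}}},
                ]
            }
        },
        # Descending order is an assumption; the question does not specify it.
        "sort": [{sort_field: {"order": "desc"}}],
    }


def build_alias_swap(alias, old_index, new_index):
    """Body for POST /_aliases that repoints `alias` in a single request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # Hypothetical names: the application always queries the alias "items".
    print(json.dumps(build_search_body("adidas", 100, 200), indent=2))
    print(json.dumps(build_alias_swap("items", "items_v1", "items_v2"), indent=2))
```

The daily flow would then be: bulk-load the CSV-updated documents into a fresh index, send the alias swap, and delete the old index afterwards. Readers querying the alias see either the old set or the new set, never a mix.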

Solr is generally efficient at updating the same document multiple times.

You can also consider using MySQL with application-level sharding if the number of composite indices is small.
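A minimal sketch of what application-level sharding involves, assuming rows are routed by a hash of `item_id` and that the shard names below are hypothetical. Because a filter on brand and price does not include the shard key, the application would have to send the same query to every shard and merge the sorted results itself, which is shown here on in-memory rows.

```python
import hashlib
import heapq
import itertools

# Hypothetical shard names; in practice each would map to a MySQL connection.
SHARDS = ["items_shard_0", "items_shard_1", "items_shard_2", "items_shard_3"]


def shard_for(item_id, shards=SHARDS):
    """Pick a shard from a stable hash of the item id."""
    digest = hashlib.sha1(item_id.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


def merge_sorted(per_shard_rows, key="rate_of_sale", limit=10):
    """Merge per-shard results that are each already sorted descending by `key`."""
    merged = heapq.merge(*per_shard_rows, key=lambda row: row[key], reverse=True)
    return list(itertools.islice(merged, limit))


if __name__ == "__main__":
    print(shard_for("1234"))
    shard_a = [{"item_id": "1", "rate_of_sale": 9}, {"item_id": "2", "rate_of_sale": 3}]
    shard_b = [{"item_id": "3", "rate_of_sale": 7}]
    print(merge_sorted([shard_a, shard_b], limit=2))
```

Note that a plain modulo scheme like this forces most rows to move when the number of shards changes, so this is only a starting point.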

https://db-engines.com/en/ranking is a good site to compare datastores.
