I've been struggling with the best way to structure my table. It's intended to hold many, many GBs of data (I haven't been given a more detailed estimate). The table will store claims data (example here), with resourceType as the partition key and id as the sort key (although these could potentially be changed). The end user should be able to search by a number of attributes (institution, provider, payee, etc., totaling ~15).

I've been toying with combining global and local secondary indexes to achieve this functionality on the backend. What would be the best way to structure the table so that a user can search the data by one or more of these attributes, in essentially any combination?

Funsaized

1 Answer

If you use resourceType as a partition key you are essentially throwing away the horizontal scaling features that DynamoDB provides out of the box.

The reason to partition your data is to distribute it across many nodes, so that you can scale without incurring a performance penalty.
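To make the point concrete, here is a minimal Python sketch of how a hash-partitioned store spreads items by partition key. DynamoDB's internal hash function is not public, so `partition_for` (MD5 modulo a fixed partition count) and the count of 8 are assumptions made purely for illustration. The item shape mirrors the question's `resourceType` and `id` attributes.

```python
import hashlib
from collections import Counter


def partition_for(key: str, num_partitions: int) -> int:
    """Map a partition key to a partition index.

    Assumption: MD5 modulo N stands in for DynamoDB's real (non-public) hash.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions


def distribution(keys, num_partitions: int = 8) -> Counter:
    """Count how many items land on each partition."""
    return Counter(partition_for(k, num_partitions) for k in keys)


# Hypothetical claim items: every item shares the same resourceType.
claims = [{"resourceType": "Claim", "id": f"claim-{i}"} for i in range(10_000)]

# Keyed on resourceType, every item hashes to the same partition.
by_type = distribution(c["resourceType"] for c in claims)

# Keyed on id, the items spread across the available partitions.
by_id = distribution(c["id"] for c in claims)

print("partitions used with resourceType key:", len(by_type))
print("partitions used with id key:", len(by_id))
```

With a single-valued key like `resourceType`, all 10,000 items share one partition no matter how many partitions exist, which is the scaling problem described above.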

It sounds like you're looking to put all claim documents into a single partition so you can do "searches" by arbitrary attributes.

You might be better off combining your DynamoDB table with something like Elasticsearch for quick, arbitrary search capabilities.
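As a rough illustration of why a search engine suits "any combination of attributes", the sketch below builds an Elasticsearch query DSL body from whatever criteria the user supplies. The function name `build_claim_query`, the `SEARCHABLE_FIELDS` set, and the field names are hypothetical, and the sketch assumes each field is mapped as a `keyword` so that exact-match `term` filters apply. It only constructs the request body; sending it to a cluster is left out.

```python
import json

# Hypothetical subset of the ~15 searchable attributes from the question.
SEARCHABLE_FIELDS = {"institution", "provider", "payee"}


def build_claim_query(criteria: dict) -> dict:
    """Build a query body that ANDs together any supplied attribute filters."""
    unknown = set(criteria) - SEARCHABLE_FIELDS
    if unknown:
        raise ValueError(f"unsupported search fields: {sorted(unknown)}")
    if not criteria:
        # No criteria supplied: match every document.
        return {"query": {"match_all": {}}}
    # One exact-match term filter per attribute; sorted for stable output.
    filters = [{"term": {field: value}} for field, value in sorted(criteria.items())]
    return {"query": {"bool": {"filter": filters}}}


body = build_claim_query({"provider": "prov-42", "institution": "inst-7"})
print(json.dumps(body, indent=2))
```

Because every attribute is just another filter clause, adding a sixteenth searchable field does not require a new index per combination, unlike a design built on DynamoDB secondary indexes.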

Keep in mind that a single DynamoDB partition can hold only approximately 10 GB of data and is limited to about 3,000 read capacity units per second or 1,000 write capacity units per second, or any mix of the two that satisfies reads + 3 * writes <= 3000.
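The arithmetic behind those limits can be sketched in a few lines of Python. The constants come from the figures quoted above; the function names and the example workloads are made up for illustration, and `min_partitions` is only a rough lower bound, not a description of how DynamoDB actually allocates partitions.

```python
import math

MAX_PARTITION_GB = 10        # approximate per-partition storage limit quoted above
MAX_PARTITION_UNITS = 3000   # budget in the rule: reads + 3 * writes <= 3000


def throughput_units(reads_per_sec: float, writes_per_sec: float) -> float:
    """Combine reads and writes into one figure; a write costs three reads."""
    return reads_per_sec + 3 * writes_per_sec


def fits_one_partition(reads_per_sec: float, writes_per_sec: float, size_gb: float) -> bool:
    """Check whether a workload stays within a single partition's limits."""
    return (size_gb <= MAX_PARTITION_GB
            and throughput_units(reads_per_sec, writes_per_sec) <= MAX_PARTITION_UNITS)


def min_partitions(reads_per_sec: float, writes_per_sec: float, size_gb: float) -> int:
    """Rough lower bound on partitions needed, assuming perfectly even key spread."""
    by_size = math.ceil(size_gb / MAX_PARTITION_GB)
    by_throughput = math.ceil(throughput_units(reads_per_sec, writes_per_sec) / MAX_PARTITION_UNITS)
    return max(1, by_size, by_throughput)


# Hypothetical workload: 250 GB of claims, 1,500 reads/s and 500 writes/s.
print(fits_one_partition(1500, 500, 250))
print(min_partitions(1500, 500, 250))
```

For "many, many GBs" of claims, storage alone pushes the table well past one partition, so a key that maps every item to the same partition value cannot work at that scale.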

Finally, you might consider storing your claim documents directly in Elasticsearch.

Mike Dinescu
  • +1 for the DynamoDB/Elasticsearch suggestion. Also, if you are looking for a simpler solution, you might try Aurora. It has a limit of 64 TB of data, which might be enough for you. – Tofig Hasanov Jul 10 '17 at 21:25
  • OK, I was thinking something similar... what would be the pros/cons of using Elasticsearch vs. something like AWS CloudSearch? – Funsaized Jul 11 '17 at 12:39
  • I'm not sure of all the comparison details, but for what it's worth, Elasticsearch enjoys a lot more popularity these days, and AWS offers a managed Elasticsearch service as well. – Mike Dinescu Jul 11 '17 at 15:11