
I am working on an application that will run on an EC2 web server. The app will use two databases: a small one that manages the site/app and holds all related data, and a large one dedicated to storing and retrieving float values.

Even after reading through some of the AWS docs, I am still confused about the best approach. Should the large database be stored on an attached EBS volume, or should I use SimpleDB, RDS or S3?

How would one store a database in S3? I have read that SimpleDB is a great solution for simple databases, which this one will be (no relationships, and each table has only an id column and a value column). However, SimpleDB is apparently not a great solution for large databases. One suggested approach is to store metadata in SimpleDB and the main data in S3, but I don't really understand how that works in the context of a database.

RDS sounds like overkill, as my database does not have relationships etc., while EBS can only be attached to one instance, isn't scalable (I don't think), and I think is costly for large amounts of data compared to S3.

I would like a little explanation to fill the obvious gaps in my knowledge, but the main goal is to find the best solution for my needs, with my needs primarily being cheap storage and quick data retrieval.

Wenfang Du
Dan
  • "RDS sounds overkill as my database does not have relationships etc": maybe you should expand more on this. What exactly is your "database" if it doesn't have relationships, etc.? Is it just a bunch of flat files? I would also suggest ignoring SimpleDB and looking at the newer DynamoDB service instead. S3 will probably be a horrible place to store your database unless you just want archive/backup storage, because you can't randomly access the content of files in S3. – Mark B Sep 30 '17 at 13:00
  • Hi Mark, thanks for getting back to me. I am looking for a solution for an operational, in-use db, so it sounds like S3 is not the way forward. The database will eventually consist of thousands of tables, each with only an id column and a float data column. I will only create more tables of this structure, add data to tables and select data from tables, never joining or anything. The database as a whole will store millions of these float values and will be queried frequently. My main concerns are cost and speed. Thanks again in advance. – Dan Sep 30 '17 at 13:07
  • Perhaps I don't even need a database. Would flat files be cheaper and faster? – Dan Sep 30 '17 at 13:08
  • You need to define what "fast" is to you, and what "cheap" is to you. You usually can't have the "fastest" and the "cheapest" at the same time. In other words you will have to pay for speed. That being said, I think DynamoDB is the correct solution for you given your requirements. – Mark B Sep 30 '17 at 13:39
  • I understand, but as I am only planning at this point, I have no metrics to go by. I think Dynamo sounds like the best approach. – Dan Sep 30 '17 at 13:45

1 Answer


Based on your requirements, AWS DynamoDB is the best fit for your scenario.

  • DynamoDB is a key-value and document database built for high scale and fully managed by AWS.
  • It fits well for querying indexable data that doesn't have relationships or complex transactions.
  • If a single record is larger than DynamoDB's 400 KB item size limit, you can store the data in AWS S3 and keep the metadata in DynamoDB for querying.
  • Since it's replicated across multiple facilities, DynamoDB provides high availability.
  • It also supports auto scaling to handle larger loads.
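As a rough illustration of the points above, here is a minimal Python sketch of one possible layout. It assumes the thousands of id/value tables are folded into a single DynamoDB table, with the original table name as the partition key and the id as the sort key. The table name `float_values`, the bucket name `example-float-archive`, the attribute names and the helper functions are all hypothetical and do not come from the question. The functions only build request parameters in the shape the boto3 low-level client expects, so nothing here contacts AWS.

```python
# DynamoDB's maximum item size is 400 KB.
MAX_ITEM_BYTES = 400 * 1024

# Hypothetical names, chosen only for this sketch.
TABLE_NAME = "float_values"
BUCKET_NAME = "example-float-archive"


def build_put_request(series, value_id, value):
    """Build put_item parameters for one float value.

    `series` stands in for one of the original per-series tables.
    """
    return {
        "TableName": TABLE_NAME,
        "Item": {
            "series": {"S": series},            # partition key
            "value_id": {"N": str(value_id)},   # sort key
            # The low-level API sends numbers as strings.
            "value": {"N": repr(float(value))},
        },
    }


def build_query_request(series, first_id, last_id):
    """Build query parameters for a range of ids within one series."""
    return {
        "TableName": TABLE_NAME,
        # Placeholders avoid clashes with DynamoDB reserved words.
        "KeyConditionExpression": "#series = :s AND #id BETWEEN :lo AND :hi",
        "ExpressionAttributeNames": {"#series": "series", "#id": "value_id"},
        "ExpressionAttributeValues": {
            ":s": {"S": series},
            ":lo": {"N": str(first_id)},
            ":hi": {"N": str(last_id)},
        },
    }


def needs_offload(payload):
    """Rough check: is this payload too large to store inline in an item?"""
    return len(payload) > MAX_ITEM_BYTES


def build_blob_requests(series, blob_id, payload):
    """Build an S3 upload plus a DynamoDB item that points at the object."""
    object_key = f"{series}/{blob_id}.bin"
    s3_request = {"Bucket": BUCKET_NAME, "Key": object_key, "Body": payload}
    pointer_request = {
        "TableName": TABLE_NAME,
        "Item": {
            "series": {"S": series},
            "value_id": {"N": str(blob_id)},
            "s3_key": {"S": object_key},
            "size_bytes": {"N": str(len(payload))},
        },
    }
    return s3_request, pointer_request
```

With boto3 installed and credentials configured, these dictionaries could be passed on as `boto3.client("dynamodb").put_item(**build_put_request("sensor_42", 7, 3.25))`, `boto3.client("dynamodb").query(**build_query_request("sensor_42", 1, 100))` and `boto3.client("s3").put_object(**s3_request)`. Whether a single table with a composite key suits the real workload depends on access patterns that the question does not specify.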
Ashan
  • Hi Ashan, thanks for this, it sounds promising. I presume this is the fastest and cheapest solution for my needs then? Are you able to provide me a link to the perfect article in order for me to learn how to implement this? – Dan Sep 30 '17 at 13:13
  • How do you plan to use DynamoDB: in a serverless project with API Gateway and Lambda, or from your own server code on a web server? What is your language preference? Are there any other constraints, such as compliance or regulatory requirements? – Ashan Sep 30 '17 at 13:38
  • I would access the data from a PHP application on a Linux web server on EC2. No other constraints. – Dan Sep 30 '17 at 13:42
  • You can use the PHP SDK's DynamoDbClient: http://docs.aws.amazon.com/aws-sdk-php/v3/api/class-Aws.DynamoDb.DynamoDbClient.html . You will also need a policy that allows the EC2 instance's role to access DynamoDB. A few examples can be found online, e.g. http://takeshiyako.blogspot.com/2015/10/php-with-amazon-dynamodb.html – Ashan Sep 30 '17 at 13:51
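
The role policy mentioned in the last comment could look roughly like the following. This is a sketch in Python (not PHP) that only builds and prints an IAM policy document. The table name `float_values`, the region and the account ID `123456789012` are placeholders, and the three actions listed are an assumption about what the application would need.

```python
import json


def build_dynamodb_policy(table_arn):
    """Return an IAM policy document allowing basic access to one table."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumed minimal set of actions: read, write and query items.
                "Action": [
                    "dynamodb:GetItem",
                    "dynamodb:PutItem",
                    "dynamodb:Query",
                ],
                "Resource": table_arn,
            }
        ],
    }


# Placeholder ARN: replace region, account ID and table name with real values.
TABLE_ARN = "arn:aws:dynamodb:us-east-1:123456789012:table/float_values"

print(json.dumps(build_dynamodb_policy(TABLE_ARN), indent=2))
```

The printed JSON would be attached to the IAM role assigned to the EC2 instance, so that the SDK on that instance can obtain credentials from the role instead of from keys stored in the application.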