0

I currently have a maintenance log, tracking jobs performed across various machines (date/running time/new parts etc). Currently, this is backed by a relational database, which automatically increments an id number for each job. I would like to move this database to DynamoDB, but I am unsure about how to structure the database, whilst keeping unique primary keys for each item.

My initial thought would be to have machine as the partition key, as the main query would be "show me all jobs for machine x", but then what do I use for the sort key? I have ruled out date, as a sort key, as there are often multiple jobs on the same machine each day. Could I get away with using the epoch timestamp of the time that the log is added to the database, even though that is not important data? Or maybe generate some kind of unique id?

LC1983
  • 240
  • 4
  • 12
  • 1
    The words `log` and `DynmaoDB` don't go well together. Why do you want to move to DynamoDB? Do you need fast LOOKUPs on the data? I would consider using Kinesis to S3 and then using Athena to query the logs. – Guy Oct 10 '22 at 22:17
  • 1
    Is that one query your only query? The answer in schema design depends on the access patterns. – hunterhacker Oct 10 '22 at 23:05
  • @Guy - thanks, the main motivator is that I want something serverless in the cloud. It is to power a web frontend. It is also a learning experience, as I have never used DynamoDB/NoSQL before. – LC1983 Oct 11 '22 at 06:38
  • @hunterhacker - thank you, I would also like to use queries like "show me all new parts fitted to machine X" or "when was part Y last served on machine Z". – LC1983 Oct 11 '22 at 06:41
  • 1
    So then a timestamp sort key is important data because you’re asking for “new” parts plus time-oriented queries. Why not go with that design? Anything you can’t do? – hunterhacker Oct 11 '22 at 08:14
  • 1
    If you want serverless in the cloud, I would still recommend using Athena and not DynamoDB for this use case. – Guy Oct 12 '22 at 04:59

1 Answers1

0

The PK and SK must form a unique pair. In your case, the SK has to make it unique if the search pattern is only "Get all logs for this machine".

Just put something like a timestamp (if that is unique enough) in as an SK, and then retrieve the data by only using the PK. However as mentioned in the comments, perhaps DynamoDB is not the best place to store logs

Borislav Stoilov
  • 3,247
  • 2
  • 21
  • 46