I want to create an arabic dictionary database in DynamoDB. The database is primarily read-heavy, with only occasional writes to fix errors. I have 2 types of records, roots and nouns. The primary issue is that a single word can be pronounced in different ways, and can be interpreted as a root or noun, so it needs to have multiple entries. Ex:
root : ارج arija a (araj, اريج arīj) to be fragrant
noun1: ارج araj fragrance, sweet smell
noun2: ارج arij fragrant, sweet-smelling
I want to be able to (at a reasonable cost) support the following queries
- select all roots where root=?
- select all nouns where noun=?
- select all nouns where root=?
I'm thinking of implementing it in a single "Words" table like this: { PK int wordID (arbitrarily defined by dynamodb) string word string root bool isRoot JSON definition } Primary Key - wordID GSI1 - root, with isRoot as the sort ID GSI2 - word, with isRoot as the sort ID
Thus, query 1 and 3 could be handled by performing a query on GSI1 while specifying a sort key value of 1 and 0 respectively, and query 2 could be handled by performing a query on GSI2 with a sort key value of 0.
Is this the best way to design a DB for this type of application? I just thought it might not be optimal that I'm almost never going to do a query with my PK, as I'm only using the GSIs that I've defined.
I haven't implemented it yet but I'm considering using another type of DB on AWS if DynamoDB is not optimal for this situation where I am primarily querying for many rows at a time. The database size is approx 8k roots and 18k nouns