0

I am trying to implement in LIKE and COMMENT system like Instagram and Facebook in DynamoDB

I have 3 tables, user, photo and photo_likes

user table and photo table have user_id and photo_id keys.

and for photo_likes table, I have photo_id as a key and liked_by column where I store user_id as list.

So, If the photo_id 1 is liked by user_id 10, 35, I store it as:

| photo_id | liked_by |
|     1    |  {10,35} |

I am really confused if it is the right way to do it? or should I just insert a new row whenever their is a new like.

| photo_id | user_id  |
|     1    |    10    |
|     1    |    35    |
Neil Lunn
  • 148,042
  • 36
  • 346
  • 317

1 Answers1

1

I will provide the advantage and disadvantage of the above approaches. You may need to choose the correct approach based on your Query Access Patterns required for your application.

Approach 1:

| photo_id | liked_by |
|     1    |  {10,35} |

Advantage & Suggestion:-

  • My suggestion would be to store the liked_by as NS or SS rather than List. So that it will not have duplicates.
  • All the liked_by are present in the same item. This will help during retrieval and showing the results on GUI if required

Disadvantage:-

  • Can't create index (in case if required) on non-scalar data types (SS, NS or List)
  • Be aware of the maximum size (400 KB) of the item in DynamoDB. This may impact the data model if number of likes increase to unexpected values which could impact the size of the item

Approach 2:

| photo_id | user_id  |
|     1    |    10    |
|     1    |    35    |

You can define photo_id as partition key and user_id as sort key.

Advantage:-

  • You can create index on user_id if required
  • You can sort the data by user_id (if defined as sort key)
  • No need to worry about 400 KB item size

Disadvantage:-

  • Counting the likes should be done at client side by iterating the values. In approach 1, you can use some array.length to get the likes count
  • The number of items in the query result set would be high which could be difficult to deal with. This point depends on whether you have an use case to get the list of photos and its likes count or any similar sceanrio
notionquest
  • 37,595
  • 6
  • 111
  • 105
  • what about create a third table with the counting like, I mean when you give a like to a item then your insert that like into the like table and this table triggers a stream to update the third table count_likes, so for retrieval purpose it would be faster, isn't it ? – Juorder Gonzalez Jul 26 '20 at 00:39
  • Multiple tables is an anti-pattern in NoSQL. You should ideally have 1 table per application. That could cut your read cost by half or even more if indexes are done smartly. Also, you'll be able to support 10+ access patterns with only 1-2 global indexes. – Solo Sep 24 '20 at 21:45
  • And also, approach 2 doesn't have to be so expensive to count votes, just maintain a meta item in table with total like count. Could be done in application layer or with Lambda and DDB stream if it doesn't have to be 100% accurate, otherwise use transactions (people tend to make too big deal out of eventual consistency which is usually <1ms in case of DDB and accuracy - error margin would probably be <0.1%, unless you have some huge issues with application server). – Solo Sep 24 '20 at 21:48