1

I'm trying to build an Instant Messaging functionality in my app as part a bigger project.

  • Chats can have more than 2 participants (group chats)
  • If participant A delete a message, it still should be visible to participant B (that's why I used the Message Participants table)
  • Same applies to Conversation.
  • By same logic, if all participants delete the conversation/message, it should be erased from DB.

l

Questions :

  1. I'm afraid that this schema is too cumbersome, meaning that the queries will be too slow once the app gets certain traffic mark (1k active users ? I'm guessing)

  2. Message Participants will have multiple records for each message - one for each participants in the chat. Instant Messaging means it will involve those writes with very tight timings. Wouldn't that be a problem?

  3. Should I add a layer of Redis DB, to manage a chat's active session's messaging? it will store the recent messages, and actively sync the PostgreSQL db with those messages (perhaps with Async transactions functionality that postgresql has?)

UPDATED schema :

enter image description here

  • I would also gladly hear ideas for having a "read" status functionality. I'm assuming it's much more complex with Group chats, so at least offering that for 1:1 chats would be nice.
BVtp
  • 2,308
  • 2
  • 29
  • 68

1 Answers1

0

I am a little confused by your diagram. Shouldn't the Conversation Participants be linked to the Conversations instead of the Message? The FKs look all right, just the lines appear wrong.

I wouldn't be worried about performance yet. The Premature Optimization Anti-Pattern warns us not to give up a clean design for performance reasons until we know whether we are going to have a performance problem. You anticipate 1000 users - that's not much for a modern information system. Even if they are all active at the same time and enter a message every 10 seconds, this will just mean 100 transactions per second, which is nothing to be afraid of. Of course, I don't know the platform on which you are going to run this. But it should be an easy task to set up those tables and write a simple test program that inserts those records as fast as possible.

Your second question makes me wonder how "instant" you expect your message passing to be. Shall all viewers of a message receive each keystroke of the text within a millisecond? Or do they just need to see each message appear right after it was posted? Anyway, the limiting factor for user responsiveness will probably be the network, not the database.

Maybe this is not mainly a database design issue. Let's assume you will have a tremendous rate of postings and viewings. But not all conversations will be busy all the time. If the need arises - but not earlier - it might be necessary to hold the currently busy conversations in memory and using the database just as a backup for future times when they aren't busy any more.

Concerning your additional comments:

100k users: This is a topic not for this forum, but concerning business development of a startup. Many founders of startup companies imagine huge masses of users being attracted to their site, while in reality most startups just fail or only reach very few. So beware of investments (in money, but also in design and implementation effort) that will only pay in the highly improbable case that your company will be the next Whatsapp.

In case you don't really anticipate such masses of users but just want to imagine this as a programming exercise, you still have a difficult task. You won't have the platform to simulate the traffic, so there is no way to make measurements on where you actually have a performance problem to solve. That's one of the reasons for the Premature Optimization warning: Unless you know positively where you have a bottleneck, you - and all of us - will be just guessing and probably make the wrong decisions.

Marking a message as read is easy: Introduce a boolean attribute read at Message Participants, and set it to true as soon as, well, the user has read the message. It's up to your business requirements in which cases and to whom you show this.

TAM
  • 1,731
  • 13
  • 18
  • thank you for the help! Yes, a small oversight, you're totally right. I've added the updated schema to the post. (I also added a FK from Attachment to Users). Now, regarding the 1000 users - I merely used that as a guess for when I would hit performance issues. I'm actually aiming to build something for around 100k users. It sounds ambitious a bit, but I'm also using this as a project to learn how to structure a quality database. I definitely understand that there's not point in something like Facebook scale, but 100k should be reasonable – BVtp Oct 01 '19 at 16:04
  • regarding the IM - your second option is what I meant by 'instant' : they just need to see each message appear right after it was posted. It should be like Whatsapp in that sense (a "they're typing" option could be nice, but not essential. Unless you could suggest a way to implement it? ). And I also want to add the ability to mark the messages as 'read' to the other side (only in 1:1 chat, because in group chats I assume it would be more problematic). Could you please suggest a way for this? – BVtp Oct 01 '19 at 16:05
  • Also, regarding my consideration of adding Redis, I read somewhere that Postgresql made the message sending "lag" a bit to other side, a few seconds.. it's problematic for an IM. that's why I'm thinking of adding a Redis layer, it would store the latest X messages (100 perhaps?) and then the fetching would be quicker even when you just logged in to the socket, and not only while you're actively in it. But I might be wrong, so I'll gladly hear your opinion. Btw, the backend platform would be nodejs. My other concern was if a lot of users send a lot of messages,messages table would become huge. – BVtp Oct 01 '19 at 16:10