We are a small startup building a chat service, and we are having a lot of trouble figuring out how to store the chat history. We are also unsure how necessary Tornado is in our scenario.
We have a Django app running on Heroku, and we are not yet sure whether we should implement a separate Tornado app that forwards messages to Pusher. Right now it is the Django app that receives messages from clients and forwards them to Pusher channels.
Our initial architecture currently looks like the picture below:
We are using PostgreSQL to store user profiles, info about the chat rooms, and so on, and we don't know what the best approach is for storing the chat messages. Can we use PostgreSQL for this as well? Is it possible to use Redis with persistence to store the whole chat history (even if the app grows a lot in number of users), or would it be best to use Redis for the most recent messages and PostgreSQL for the rest of the history? We are also curious about other NoSQL solutions like CouchDB, HBase, etc., which seem to be part of the architecture of big apps like HipChat or LINE, but it seems that not many projects use them at the very beginning, and the support for them on Heroku is not the same. Should we look at them if we are planning for big growth?
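To make the "Redis for recent messages, PostgreSQL for everything" option concrete, this is roughly what we have in mind. It is only a sketch under assumptions: `sqlite3` stands in for PostgreSQL and a bounded `deque` stands in for a capped Redis list, so it runs without any services. `ChatHistory`, `RECENT_LIMIT`, and the `message` table layout are hypothetical names, not something we have built.

```python
import sqlite3
from collections import deque

RECENT_LIMIT = 3  # hypothetical cap on cached messages per room


class ChatHistory:
    """Write-through store: every message goes to the durable store,
    and the newest few per room are also kept in a bounded cache."""

    def __init__(self, db):
        self.db = db  # stand-in for the PostgreSQL connection
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS message ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "room TEXT NOT NULL, sender TEXT NOT NULL, body TEXT NOT NULL)"
        )
        self.recent = {}  # stand-in for Redis: room -> deque, newest last

    def add(self, room, sender, body):
        # Durable write first, so the cache never holds unsaved messages.
        with self.db:
            self.db.execute(
                "INSERT INTO message (room, sender, body) VALUES (?, ?, ?)",
                (room, sender, body),
            )
        cache = self.recent.setdefault(room, deque(maxlen=RECENT_LIMIT))
        cache.append((sender, body))  # oldest entry is dropped when full

    def latest(self, room):
        # Served from the cache only; no database query.
        return list(self.recent.get(room, ()))

    def full_history(self, room):
        # Served from the durable store, oldest message first.
        rows = self.db.execute(
            "SELECT sender, body FROM message WHERE room = ? ORDER BY id",
            (room,),
        )
        return rows.fetchall()


history = ChatHistory(sqlite3.connect(":memory:"))
for i in range(5):
    history.add("lobby", "alice", f"msg {i}")
```

After these five writes, `history.latest("lobby")` holds only the last three messages, while `history.full_history("lobby")` returns all five. What we cannot judge is whether this split is worth the extra moving part at our size, or whether PostgreSQL alone would be enough for a long time.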
That's the first part of our headaches. The other part is how important it is to use Tornado for the messaging part of our app if we are already using Pusher and, if we do use it, what a possible approach to combining both apps could be. If the Tornado app receives and stores the messages, how can we access these messages from the Django model layer to perform searches and so on? Can the Tornado app store the messages in a database that is shared with the Django app?
Related questions:
What are the advantages (or needs) of using tornado with Pusher for a Django application?
https://stackoverflow.com/questions/22170823/sync-web-server-vs-async-web-server-on-python
And finally: how can Celery help? Can we stick with Django and use Celery to queue the messages so they are delivered to Pusher asynchronously?
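The handoff we imagine looks roughly like the sketch below. It is not Celery code: a standard-library `queue.Queue` and a single thread stand in for the Celery broker and worker, and `publish` is a hypothetical placeholder for the real Pusher client call. The channel and event names are also made up for illustration.

```python
import queue
import threading

outbox = queue.Queue()  # stand-in for the Celery broker
delivered = []          # records what Pusher would have received


def publish(channel, event, payload):
    # Hypothetical placeholder for the real Pusher client call.
    delivered.append((channel, event, payload))


def worker():
    # Stand-in for a Celery worker: deliver queued jobs one at a time.
    while True:
        job = outbox.get()
        try:
            if job is None:  # sentinel: stop the worker
                return
            publish(*job)
        finally:
            outbox.task_done()


def handle_incoming(room, text):
    # What the Django view would do: enqueue the message and return
    # immediately, without waiting for delivery.
    outbox.put((f"room-{room}", "new-message", {"text": text}))


thread = threading.Thread(target=worker, daemon=True)
thread.start()

handle_incoming("lobby", "hello")
handle_incoming("lobby", "world")

outbox.join()     # wait until everything queued so far is delivered
outbox.put(None)  # ask the worker to stop
thread.join()
```

With one worker the messages reach `publish` in the order they were queued. We don't know whether that ordering still holds once several Celery workers are running, which is part of what we are asking.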
We would be really thankful if you could shed some light on this. We have researched quite a lot this week and still have nothing really clear! It would be nice to know whether we could start with the simplest setup and progress from there: storing chat history in our PostgreSQL database with Django, then moving to Redis as a cache for the most recent messages, and then maybe integrating Celery and so on. Or should we go ahead and implement a Tornado app to handle everything related to messaging from now on?