
We have 4 applications: A, B, C, and D. Each application scrapes different social-network data from a different source, and each has its own database.

Application A scrapes, for example, Instagram accounts, posts, and stories from external source X. Application B scrapes the history of Instagram account follower and following counts from external source Y. Application C scrapes Instagram account audience data (e.g. gender statistics such as male vs. female, age statistics, country statistics, etc.) from external source Z. Application D scrapes TikTok data from external source W.

Our data analytics team has to produce different kinds of analyses:

  • e.g. a table with Instagram post engagement (likes + post / total number of followers for that month) for specific Instagram accounts.
  • e.g. Instagram account development: total number of followers per month, total number of posts per month, average post engagement per month, etc.
  • e.g. account follower insights: we analyze only a sample of an Instagram account's followers, e.g. 5,000 out of 1,000,000. We look at which accounts our followers follow besides us and report the top 10 followings.
  • a lot of other similar reports
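
To make the "top 10 followings" report more concrete, here is a minimal Python sketch of the counting logic. It is only an illustration under assumptions: the function name `top_followings`, the input shape (a mapping from each sampled follower to the accounts they follow), and the sample account names are all hypothetical and not taken from our real schema.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def top_followings(
    followings_by_follower: Dict[str, Iterable[str]],
    own_account: str,
    n: int = 10,
) -> List[Tuple[str, int]]:
    """Return the n accounts most often followed by the sampled followers.

    Hypothetical helper: the input maps a follower id to the accounts
    that follower follows. Our own account is excluded ("besides us").
    """
    counts: Counter = Counter()
    for followed in followings_by_follower.values():
        # De-duplicate so each follower counts at most once per account.
        unique = set(followed)
        unique.discard(own_account)
        counts.update(unique)
    # Sort by count (descending), then by name so ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Tiny made-up sample standing in for the 5,000 sampled followers.
sample = {
    "follower_1": ["our_brand", "nike", "adidas"],
    "follower_2": ["our_brand", "nike"],
    "follower_3": ["our_brand", "nike", "puma", "adidas"],
}
result = top_followings(sample, own_account="our_brand", n=2)
print(result)
```

In practice this would be a GROUP BY / aggregation query inside whichever store we pick; the sketch only shows the shape of the computation we need the new infrastructure to run at scale.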

Right now we have 3 TB of data in our OLTP Postgres DB, and it no longer works for us: we run really heavy queries against it for reporting and BI. We want to move the social-network data to a data warehouse or to OpenSearch.

We are on AWS and want to use either Redshift or OpenSearch for our data analysis. We don't need real-time processing. Which is the better solution for us, Redshift or OpenSearch? Any ideas are welcome.

I expect to end up with infrastructure that lets the data analytics team run heavy queries for reporting and BI.
