I want to build an app where I can enter any keyword, and the backend will crawl related tweets from Twitter and return a sentiment analysis as the percentage of negative, neutral and positive tweets. For example, if I enter the keyword 'pepsi', the app will output something like this: tweets related to pepsi contain 10% negative sentiment, 10% neutral sentiment and 80% positive sentiment.
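To make the desired output concrete, a minimal sketch of the aggregation step could look like this. It assumes some classifier has already assigned one of three labels to each crawled tweet; the function name `sentiment_breakdown` and the example labels are hypothetical, used only for illustration.

```python
from collections import Counter

LABELS = ("negative", "neutral", "positive")

def sentiment_breakdown(predicted_labels):
    """Return the percentage of each sentiment label, rounded to one decimal."""
    counts = Counter(predicted_labels)
    # Only count labels we know about, so stray values do not skew the total.
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        return {label: 0.0 for label in LABELS}
    return {label: round(100.0 * counts[label] / total, 1) for label in LABELS}

# Hypothetical classifier output for ten tweets matching 'pepsi'.
example = ["positive"] * 8 + ["negative"] + ["neutral"]
breakdown = sentiment_breakdown(example)
print(breakdown)  # {'negative': 10.0, 'neutral': 10.0, 'positive': 80.0}
```

The hard part is producing the per-tweet labels, not this summary step.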
So the problem is how to train a machine learning model that I can use in the backend to do this kind of sentiment analysis on a wide variety of topics. The main idea involved here is transfer learning, where we train one model on a large amount of labeled data and use it as a starting point for training on other data. Transfer learning has limits in NLP, mostly because knowledge learned on one task is often not broad enough to carry over to other tasks. For example, I pretrained a neural network to do sentiment analysis on tweets about airlines, with a prediction accuracy of over 70%. However, when I use the same model to do sentiment analysis on tweets about pepsi, I get only around 30% prediction accuracy.
I did some research and noticed that Google's Universal Sentence Encoder is quite popular. However, I realized that it is a way of converting input text into a feature vector, not a universal sentiment model: I would still need to train a classifier on top of the embeddings. Can anyone point me in the right direction? Thanks a lot in advance!
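To illustrate the distinction between an embedding and a classifier, here is a minimal sketch under stated assumptions. The `embed` function is a toy stand-in (word counts over a tiny hypothetical vocabulary), not a real sentence encoder, and the nearest-centroid classifier and training sentences are also made up for illustration. The point is only that the embedding produces features, and a separate model trained on labeled data turns those features into a sentiment label.

```python
import math

# Hypothetical vocabulary; a real sentence encoder would return a dense vector instead.
VOCAB = ["love", "great", "hate", "awful", "okay", "fine"]

def embed(text):
    """Toy stand-in for a sentence encoder: maps text to a fixed-length vector."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def train(labeled_texts):
    """Nearest-centroid classifier: one mean embedding per label."""
    by_label = {}
    for text, label in labeled_texts:
        by_label.setdefault(label, []).append(embed(text))
    return {label: centroid(vecs) for label, vecs in by_label.items()}

def predict(model, text):
    """Return the label whose centroid is closest to the text's embedding."""
    vec = embed(text)
    return min(model, key=lambda label: math.dist(vec, model[label]))

# Made-up labeled examples from one domain (airlines).
train_data = [
    ("love this airline", "positive"),
    ("great flight", "positive"),
    ("hate the delays", "negative"),
    ("awful seats", "negative"),
    ("flight was okay", "neutral"),
    ("fine i guess", "neutral"),
]
model = train(train_data)

# Applying the same classifier to a different topic.
prediction = predict(model, "i love pepsi")
print(prediction)
```

In this toy case the classifier carries over to the new topic only because the sentiment words overlap; with domain-specific vocabulary it would not, which is the problem described above.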