I am working on a thesis that uses Bilingual LDA (Bi-LDA) and a version of K-Means modified for runtime to perform sentiment analysis (with Multinomial Naive Bayes) on Filipino and English COVID-19 tweets.
I ran Bi-LDA using https://github.com/1991wzc/python-LDA-and-BiLDA/blob/master/BiLDA.py on the preprocessed (tokenized and lemmatized) tweets. It produced text files containing the theta values, the phi values, the topic of each word, and a wordmap (screenshot of the files attached): ![Bi-LDA outputs](https://i.stack.imgur.com/7On0l.png) However, I cannot work out how to apply K-Means to these Bi-LDA output files, because I do not know what the next step should be.
I am also sharing the Google Colab notebook (.ipynb) so you can see what the K-Means step needs as input: https://colab.research.google.com/drive/1FE4WkG-cEe1SPmFm49Z6ovA7oREg17VT?usp=sharing
Thank you so much. Even a little help would make a big difference for me and my group.
Before running the Bi-LDA algorithm, I set its K (the number of topics) to six (6), the same value as K in my K-Means, so that K-Means only needs six (6) values once each topic is clustered.
I do not really know what to expect, since I do not know which values should be fed into a K-Means algorithm, but I expect that once K-Means is done I can move on to the statistical treatments.
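To make the question concrete, here is a minimal sketch of what I imagine the K-Means step might look like. It rests on assumptions I have not confirmed: that the theta file holds one row per tweet with whitespace-separated topic proportions, and that those rows are the right input for clustering. The helper names `parse_theta` and `kmeans` are hypothetical, and this is plain Lloyd's K-Means, not my runtime-modified version.

```python
import random


def parse_theta(lines):
    """Parse theta rows into lists of floats.

    Assumed layout (not confirmed against BiLDA.py): one document per line,
    topic proportions separated by whitespace.
    """
    return [[float(v) for v in line.split()] for line in lines if line.strip()]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's K-Means; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Start from k distinct rows chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each row goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

With the real output, I would read the lines of the theta file, pass them to `parse_theta`, and call `kmeans(rows, k=6)` so that K matches the six topics. Is this the right way to use the Bi-LDA files, or should a different file (for example phi) be clustered instead?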