
I'm pretty new to sentiment analysis with CNNs, so I've been looking at code on GitHub to get a better understanding. I found an online repository and tried to see how it would work with my own dataset. However, every time I restart my Jupyter notebook I get completely different results.

The code I used is based on:

https://github.com/tthustla/twitter_sentiment_analysis_part11/blob/master/Capstone_part11.ipynb

  • If you are also re-executing every cell when you restart your notebook, this is normal: the embeddings and the CNN model are both retrained, and both are randomly initialized, so retraining them yields different results each time. The results should not be that different between runs, though. What do you mean by "completely different results"? – ygorg Apr 01 '21 at 15:40
  • Well, the accuracy, F1 score, precision, and recall might be 2-4% different each time –  Apr 01 '21 at 16:59
  • That level of jitter between results is not surprising. Some metaparameter tweaks might tighten the range – such as more training epochs – but as both the Word2Vec & CNN steps integrate randomization, no 2 runs should be expected to be deterministically identical. Separately: that notebook's loop that calls Word2Vec.train() many times, & manages `alpha` itself, is an antipattern. See: https://stackoverflow.com/a/62801053/130288 - applies to Word2Vec as well as Doc2Vec. It's not likely a major source of your jitter, but it's still unnecessary, overcomplicated, & error-prone. – gojomo Apr 01 '21 at 18:40
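
To illustrate the point made in the comments above, here is a minimal sketch of why randomly initialized training gives run-to-run jitter, and how seeding plus reporting a mean and standard deviation over several runs makes results easier to compare. `toy_training_run` and `summarize` are hypothetical names, and the "accuracy" is a made-up number derived from random weights, not a result from the linked notebook. In a real pipeline you would presumably also seed NumPy and TensorFlow, and pass `seed` and `workers=1` to gensim's `Word2Vec`; even then, exact reproducibility is not guaranteed.

```python
import random
import statistics


def toy_training_run(seed=None):
    """Hypothetical stand-in for one train/evaluate cycle.

    Returns a fake "accuracy" that depends only on randomly
    initialized weights, to mimic run-to-run jitter.
    """
    # seed=None draws the seed from OS entropy, like an unseeded notebook run.
    rng = random.Random(seed)
    weights = [rng.uniform(-1.0, 1.0) for _ in range(100)]
    return 0.80 + 0.02 * statistics.mean(weights)


def summarize(scores):
    """Return (mean, sample standard deviation) of a list of scores."""
    return statistics.mean(scores), statistics.stdev(scores)


# Same seed: identical result on every run.
assert toy_training_run(seed=1) == toy_training_run(seed=1)

# Different seeds: results vary, so report the spread instead of one number.
scores = [toy_training_run(seed=s) for s in range(5)]
mean_score, std_score = summarize(scores)
print(f"accuracy: {mean_score:.4f} +/- {std_score:.4f} over {len(scores)} runs")

# In a real Keras/gensim pipeline (assumption, not taken from the notebook),
# the analogous steps would be calls such as:
#   numpy.random.seed(seed)
#   tensorflow.random.set_seed(seed)          # TensorFlow 2.x
#   Word2Vec(..., seed=seed, workers=1)       # plus PYTHONHASHSEED set before startup
```

Reporting the mean and spread over a handful of seeds is usually more informative than trying to force a single run to be bit-for-bit repeatable.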

0 Answers