My thesis defense is coming up next week, and I wanted to get your take on an issue I'm currently facing. One of my thesis contributions is "Adapting RoBERTa to the task of rumor detection on Twitter".
I want to explain to the jury how RoBERTa can adjust its weights based on the dataset that I fine-tune it on.
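For context, here's roughly what one fine-tuning step looks like in my setup. This is a simplified illustration, not my actual training code: the tweet and its label are made up, I use the public `roberta-base` checkpoint as the starting point, and the label convention (1 = rumor) is just an assumption for the example.

```python
# Simplified sketch of a single fine-tuning step: the tweet, label, and
# label convention (1 = rumor) are made up for illustration.
import torch
from transformers import AutoTokenizer, RobertaForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

batch = tokenizer("Breaking: celebrity X arrested, sources say", return_tensors="pt")
labels = torch.tensor([1])  # 1 = rumor (assumed convention)

outputs = model(**batch, labels=labels)  # cross-entropy loss over the two classes
outputs.loss.backward()                  # gradients flow back through every layer
optimizer.step()                         # all weights are nudged toward the label
```

Repeating this over the whole labeled dataset is what I mean by RoBERTa "adjusting its weights" to the task.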
In simple terms, I fed RoBERTa a variety of datasets for the task of rumor detection on Twitter, altering the class distribution in each dataset to see how it influences the embeddings that RoBERTa produces. I evaluated the quality of the embeddings by feeding them to a set of classifiers (Random Forest, Decision Tree, SVM) and measuring how they perform. I used standard metrics (precision, recall, and F1-score), focusing on each model's performance in recognizing the rumor class.

I was considering explaining it this way: RoBERTa takes in a tweet with its label (rumor/non-rumor), then weighs the words and their impact on the class in question, so words that occur often in a class are the ones potentially correlated with it. But I feel that's watering it down too much, almost an insult to the intricacy of RoBERTa's inner workings.

So, for those of you with much more knowledge and expertise than me: would you please indulge my request and enlighten me on how one can explain the details of fine-tuning a pre-trained language model on a downstream task?
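In case it helps to see the evaluation side too, here's a simplified sketch of my embedding-extraction-plus-classifier pipeline. Again, names are placeholders: `roberta-base` stands in for my fine-tuned checkpoint, `train_texts`/`train_labels`/`test_texts`/`test_labels` stand in for my data, and I assume labels 0 = non-rumor, 1 = rumor.

```python
# Sketch of the evaluation pipeline: extract tweet embeddings from the
# (fine-tuned) RoBERTa encoder, then train/evaluate a downstream classifier.
# "roberta-base" and the train_*/test_* variables are placeholders.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.svm import SVC
from sklearn.metrics import classification_report

tokenizer = AutoTokenizer.from_pretrained("roberta-base")  # placeholder for my checkpoint
model = AutoModel.from_pretrained("roberta-base")
model.eval()

def embed(texts):
    # Use the first token's (<s>, RoBERTa's CLS equivalent) hidden state
    # as a fixed-size embedding of the whole tweet.
    with torch.no_grad():
        batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
        return model(**batch).last_hidden_state[:, 0, :].numpy()

X_train, X_test = embed(train_texts), embed(test_texts)  # placeholder data

clf = SVC().fit(X_train, train_labels)
print(classification_report(test_labels, clf.predict(X_test),
                            target_names=["non-rumor", "rumor"]))  # assumes 0 = non-rumor
```

The same `embed` output gets reused for the Random Forest and Decision Tree runs; only the classifier changes.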