I'm doing sentiment analysis of Spanish tweets.
After reviewing some of the recent literature, I've seen that there has been a recent effort to train a RoBERTa model exclusively on Spanish text (roberta-base-bne). It appears to outperform BETO, the previous state-of-the-art model for Spanish language modeling.
The RoBERTa model has been trained on a variety of tasks, but text classification is not among them. I want to take this RoBERTa model and fine-tune it for text classification, more specifically sentiment analysis.
I've done all the preprocessing and created the dataset objects (see the sketch below), and I want to train the model natively with TensorFlow.
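For reference, the dataset objects were created along these lines (the tweets and labels here are just placeholders for my real data):

import tensorflow as tf
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("BSC-TeMU/roberta-base-bne")

train_texts = ["me encanta este producto", "no me gustó nada"]  # placeholder tweets
train_labels = [1, 0]                                           # placeholder sentiment labels

# tokenize and wrap in a tf.data.Dataset, following the usual transformers pattern
train_encodings = tokenizer(train_texts, truncation=True, padding=True)
train_dataset = tf.data.Dataset.from_tensor_slices((dict(train_encodings), train_labels))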
Code
# Training with native TensorFlow
import tensorflow as tf
from transformers import TFRobertaForSequenceClassification

# from_pt=True converts the PyTorch weights; it may be needed if the repo
# does not ship TensorFlow weights
model = TFRobertaForSequenceClassification.from_pretrained("BSC-TeMU/roberta-base-bne", from_pt=True)
optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5)
model.compile(optimizer=optimizer, loss=model.compute_loss)  # can also use any Keras loss fn
model.fit(train_dataset.shuffle(1000).batch(16), epochs=3)  # dataset is already batched, so no batch_size arg
Question
My question is regarding TFRobertaForSequenceClassification:

Is it correct to use this class, even though it isn't mentioned in the model card, instead of the AutoModelForMaskedLM that the model card specifies?
By simply using TFRobertaForSequenceClassification, can we assume that the pretrained knowledge is automatically carried over and applied to the new task, namely text classification?
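For what it's worth, one way I tried to check this is to inspect the loaded model (num_labels=2 and from_pt=True are assumptions on my part, since neither appears in the model card):

from transformers import TFRobertaForSequenceClassification

model = TFRobertaForSequenceClassification.from_pretrained(
    "BSC-TeMU/roberta-base-bne", from_pt=True, num_labels=2
)
# summary() should list the pretrained "roberta" encoder plus a new,
# randomly initialized "classifier" head that fine-tuning still has to train
model.summary()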