Questions tagged [roberta-language-model]
64 questions
0
votes
1 answer
Token indices sequence length warning while using pretrained Roberta model for sentiment analysis
I am presently using a pretrained Roberta model to identify the sentiment scores and categories for my dataset. I am truncating the length to 512 but I still get the warning. What is going wrong here? I am using the following code to achieve…

Django0602
- 797
- 7
- 26
0
votes
1 answer
What is the overall process for fine-tuning a RoBERTa model to add domain-specific knowledge?
Are adding domain tokens to the tokenizer and fine-tuning both essential?
a. Is it the right process to add domain tokens to the tokenizer before fine-tuning the model?
b. If I just add domain tokens without fine-tuning, could it improve in…

theorem
- 1
0
votes
0 answers
Roberta on local CPU tensor mismatch at non-singleton dimension 1
I downloaded the https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment model to my local PC.
When I pull the model from the website it works perfectly fine, but it gives me a tensor mismatch error when I run it locally.
`self.MODEL =…

zangel
- 1
- 1
0
votes
0 answers
Text classification with RoBERTa word embeddings fed to a CNN, accuracy not improving
I am learning about CNNs and RoBERTa word embeddings, and I am building a sentiment analysis model with 3 labels: -1 for negative, 0 for neutral, 1 for positive. I already have word embeddings from RoBERTa, but when I process them with a CNN, the accuracy just stays at 30-34%. How…
0
votes
0 answers
Using a BertTokenizer when training a RobertaForMaskedLM (Hugging Face)
I want to train a RobertaForMaskedLM model from scratch. But I need a character-level tokenizer, and I have already found one that is perfect for me. So I am wondering,
can I re-use it?
This is my RobertaForMaskedLM config:
{"architectures":…

Chiara
- 372
- 5
- 17
0
votes
0 answers
Is there a way to tell which language a token is from?
I'm using XLM-R from Hugging Face, and I need to do some token filtering. Is there a way to tell if a token is from a specific language?
For example, tokens with IDs 50-500 are English tokens, and those with IDs 800-1200 are Arabic.
I think I can use another…

Faisal Hejary
- 3
- 1
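For the token-language question above, one rough heuristic is to look at the Unicode script of a token's characters rather than its ID. The sketch below is only an illustration under stated assumptions: `guess_script` is a hypothetical helper, not part of any tokenizer API, and it assumes tokens are plain strings that may carry the SentencePiece word-boundary marker `▁`. A script is not a language (Latin covers English, French, and many others), so this can only narrow candidates down.

```python
import unicodedata
from collections import Counter


def guess_script(token: str) -> str:
    """Return the most common Unicode script prefix among a token's letters.

    Hypothetical helper for illustration: it reads the first word of each
    character's Unicode name (e.g. "LATIN", "ARABIC") as a script label.
    """
    counts = Counter()
    for ch in token:
        # Skip digits, punctuation, and the SentencePiece marker "▁".
        if not ch.isalpha():
            continue
        name = unicodedata.name(ch, "")
        if name:
            counts[name.split()[0]] += 1
    if not counts:
        return "UNKNOWN"
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    for tok in ["▁hello", "▁مرحبا", "123"]:
        print(repr(tok), guess_script(tok))
```

To filter a vocabulary, the same function could be applied to every token string the tokenizer exposes, keeping only the IDs whose script matches the one of interest.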
0
votes
1 answer
xlm-roberta tokenizer sticks all words together
I am trying to use an xlm-roberta model that I have fine-tuned for token classification, but no matter what I do, the output always has all the tokens stuck together, like:
[{'entity_group': 'LABEL_0',
'score': 0.4824247,
'word':…

chancar
- 3
- 2
0
votes
0 answers
Pre-trained Language Models: Parameters, data, method?
I am doing research on pre-trained LMs, specifically the following:
BERT
ALBERT
RoBERTa
XLNet
DistilBERT
BigBird
ConvBERT
I am looking for information to compare these LMs like: number of parameters, layers, data on which they were…

Othman El houfi
- 53
- 3
- 9
0
votes
0 answers
ValueError: `Checkpoint` was expecting model to be a trackable object (an object derived from `Trackable`), got RobertaForSequenceClassification
I am training a text classification model using RoBERTa. (https://huggingface.co/siebert/sentiment-roberta-large-english)
I use Google Colab to run the code.
I am facing a ValueError. I have tried googling this error and I have not gotten a…
0
votes
0 answers
How to fine-tune AllenNLP's RoBERTa textual entailment model on custom data?
I'm working on a project where I need to fine-tune the pair-classification-roberta-snli model offered by AllenNLP. I have prepared my custom dataset in the SNLI format but couldn't find a way to retrain the model. Currently, I am following…

Sujan Dutta
- 1
- 1
0
votes
1 answer
Plot Confusion Matrix from Roberta Model
I wrote text classification code with two classes using a RoBERTa model, and now I want to draw the confusion matrix.
How do I go about plotting a confusion matrix based on a RoBERTa model?
RobertaTokenizer =…

fateme shamshiri
- 69
- 7
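For the confusion-matrix question above, the counting step does not depend on RoBERTa at all once predictions exist. The following is a minimal sketch, assuming the true and predicted class labels are already available as plain Python lists; `confusion_matrix` and `format_matrix` are illustrative names, not the asker's code, and the plotting step itself is left out in favour of a text rendering.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Build a matrix where rows are true labels and columns are predictions."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def format_matrix(matrix, labels):
    """Render the matrix as aligned text, one row per true label."""
    lines = ["true/pred " + " ".join(f"{label:>5}" for label in labels)]
    for label, row in zip(labels, matrix):
        lines.append(f"{label:>9} " + " ".join(f"{n:>5}" for n in row))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical labels for a two-class problem.
    y_true = [0, 0, 1, 1, 1]
    y_pred = [0, 1, 1, 1, 0]
    matrix = confusion_matrix(y_true, y_pred, labels=[0, 1])
    print(format_matrix(matrix, labels=[0, 1]))
```

The resulting nested list can be handed to any heatmap-style plotting routine; libraries such as scikit-learn also ship their own confusion-matrix utilities if a dependency is acceptable.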
0
votes
1 answer
Training a RoBERTa model on the IMDb movie reviews dataset gives this error
def convert_data_to_examples(train, test, review, sentiment):
train_InputExamples = train.apply(lambda x: InputExample(guid=None, # Globally unique ID for bookkeeping, unused in this case
…

noman khan
- 1
- 1
0
votes
1 answer
Why is my TensorFlow RoBERTa model unable to train/fine-tune?
We are trying to finetune / train our RoBERTa model on our own train data. The project is exactly the same as the SemEval-2020 task B on choosing the right reason out of 3 on why a sentence is against common sense. For the past two days we have been…

Sam V
- 479
- 1
- 4
- 11
0
votes
1 answer
In what program/interface do I run the following code?
For a project on machine learning/NLP I am looking at some code from GitHub on RoBERTa.
I wanted to see if I could get the same results and then modify the program to fit my own data.
However, I have no idea how, where, or with what program to run…

Sam V
- 479
- 1
- 4
- 11
0
votes
2 answers
API to serve a RoBERTa ClassificationModel with FastAPI
I have trained a transformer model using simpletransformers on Colab and downloaded the serialized model, but I have a few issues using it to make inferences.
Loading the model in Jupyter works, but using it with FastAPI gives an…

TheoAlpha
- 1
- 1