Questions tagged [roberta-language-model]
64 questions
1
vote
1 answer
Fine-tuned MLM-based RoBERTa not improving performance
We have lots of domain-specific data (200M+ data points, each document having ~100 to ~500 words) and we wanted to have a domain-specific LM.
We took some sample data points (2M+) & fine-tuned RoBERTa-base (using HF-Transformer) using the Mask…

Kalsi
- 579
- 5
- 13
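For reference, the setup described in the excerpt above (masked-language-model fine-tuning of RoBERTa-base with Hugging Face Transformers) corresponds roughly to the sketch below. The corpus file name, sequence length, and training hyperparameters are placeholders, not the asker's actual values.

```python
# Minimal domain-adaptive MLM fine-tuning sketch (file name and hyperparameters are illustrative)
from transformers import (RobertaTokenizerFast, RobertaForMaskedLM,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base")

# Assumes one document per line in domain_corpus.txt (hypothetical file)
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

# Dynamic masking of 15% of tokens, the standard RoBERTa MLM objective
collator = DataCollatorForLanguageModeling(tokenizer, mlm=True, mlm_probability=0.15)

args = TrainingArguments(output_dir="roberta-domain-mlm",
                         per_device_train_batch_size=16,
                         num_train_epochs=1)
Trainer(model=model, args=args, train_dataset=dataset,
        data_collator=collator).train()
```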
1
vote
0 answers
PyTorch CUDA Out Of Memory error when running multiple passes of inference
The issue
I am trying to run inference using a sentence-transformers model on all rows of the scientific_papers/pubmed dataset.
After 177 iterations of the attached code, I get the following error:
torch.cuda.OutOfMemoryError: CUDA out of memory.…

Aldan Creo
- 686
- 1
- 5
- 14
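A pattern that often avoids this kind of creeping OOM during repeated inference passes is to disable autograd and move each batch of results off the GPU immediately. The sketch below assumes a sentence-transformers model; the model name and batch size are illustrative, not taken from the question.

```python
# Sketch: inference loop that avoids two common OOM causes
# (model name and batch size are placeholders)
import torch
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2", device="cuda")

def embed(texts, batch_size=32):
    all_embeddings = []
    with torch.no_grad():                      # no autograd graph kept between passes
        for i in range(0, len(texts), batch_size):
            emb = model.encode(texts[i:i + batch_size], convert_to_tensor=True)
            all_embeddings.append(emb.cpu())   # move results off the GPU right away
    return torch.cat(all_embeddings)
```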
1
vote
0 answers
Using TPU on the Huggingface Pipeline throws PyTorch error
I used to run this script using GPUs on GCP, but I am now trying to implement it using TPUs. As far as I know, TPUs should now work with the transformers pipeline.
However, trying to set the device parameter throws RuntimeError:…

DarknessPlusPlus
- 543
- 1
- 5
- 18
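One workaround that is sometimes suggested is to build the pipeline on CPU and then move its model to the XLA device by hand, instead of passing device= at construction time. This is only a sketch under the assumption that torch_xla is installed; whether it works depends on the transformers and torch_xla versions.

```python
# Sketch: moving the pipeline's underlying model to an XLA (TPU) device manually
import torch_xla.core.xla_model as xm
from transformers import pipeline

device = xm.xla_device()                     # the TPU core visible to this process
pipe = pipeline("sentiment-analysis")        # build on CPU first, no device= argument
pipe.model.to(device)                        # then move the model to the TPU
pipe.device = device                         # keep the pipeline's bookkeeping consistent

print(pipe("TPUs and pipelines can be made to cooperate."))
```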
1
vote
0 answers
RoBERTa tokenizer issue for certain characters
I am using the RobertaTokenizerFast to tokenize some sentences and align them with annotations. I noticed an issue with some characters
from transformers import BatchEncoding, RobertaTokenizerFast
from tokenizers import Encoding
tokenizer =…

Paschalis
- 191
- 10
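For alignment problems like this, the fast tokenizer's offset mapping is usually the relevant tool, since it maps each token back to a character span in the original string (special tokens get the (0, 0) span). The sentence below is a made-up example, not the asker's data.

```python
# Sketch: aligning character-level annotations with RobertaTokenizerFast offsets
from transformers import RobertaTokenizerFast

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")

text = "Café prices rose in August"
enc = tokenizer(text, return_offsets_mapping=True, add_special_tokens=True)

for token, (start, end) in zip(enc.tokens(), enc["offset_mapping"]):
    # Special tokens map to the empty (0, 0) span; real tokens map back to the source text
    print(f"{token!r:>12} -> {text[start:end]!r}")
```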
1
vote
1 answer
BERT high test accuracy but bad predictions on new data
I have a very interesting problem. I use the xlm-roberta model for multilabel text classification and I use 1s and 0s for the labels. I get 5 months of text data from the database, do a train/validation/test split and get 86% accuracy on the test data.…

İhsan Dağ
- 29
- 1
- 5
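As context for the multilabel setup mentioned in the excerpt, a minimal configuration with an XLM-R checkpoint looks roughly like this; the number of labels and the example input are invented.

```python
# Sketch: multi-label setup with XLM-R, using float multi-hot labels
# (label count and example text are illustrative)
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base",
    num_labels=5,
    problem_type="multi_label_classification",  # switches the loss to BCEWithLogitsLoss
)

inputs = tokenizer("example headline", return_tensors="pt")
labels = torch.tensor([[1., 0., 1., 0., 0.]])   # multi-hot floats, not class indices
loss = model(**inputs, labels=labels).loss
```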
1
vote
0 answers
How to choose the right parameters for the transformers LineByLineTextDataset
From this website explaining the Roberta parameters, I understood that the
max_position_embeddings should be a power of 2.
Then from this GitHub issue, I understood that we should add 2 to the max_position_embeddings value while setting the…

Kyv
- 615
- 6
- 26
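A sketch of how the two values are usually kept consistent when training a RoBERTa-style model from scratch: the block_size of 512 and the file name are illustrative, and the +2 follows the convention the excerpt refers to.

```python
# Sketch: keeping max_position_embeddings in sync with the dataset's block_size
from transformers import RobertaConfig, RobertaTokenizerFast, LineByLineTextDataset

block_size = 512                              # illustrative maximum sequence length
config = RobertaConfig(
    vocab_size=50265,
    max_position_embeddings=block_size + 2,   # RoBERTa reserves 2 extra positions
)

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
dataset = LineByLineTextDataset(
    tokenizer=tokenizer,
    file_path="corpus.txt",                   # hypothetical one-sentence-per-line file
    block_size=block_size,
)
```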
1
vote
0 answers
Roberta with multi-class and OneHotEncoding
I have a dataset for fake news, which has 4 different classes: true, false, partially true, other.
Currently my code uses LabelEncoding to these labels but I would like to switch to OneHot Encoding.
So now I am trying to turn these labels into OneHot…

Kristian Zhelyazkov
- 15
- 4
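A minimal way to go from the four string labels to one-hot vectors, assuming PyTorch and scikit-learn are available; the label order and the example batch are illustrative.

```python
# Sketch: turning the four string labels into one-hot vectors
import torch
from sklearn.preprocessing import LabelEncoder

labels = ["true", "false", "partially true", "other"]
encoder = LabelEncoder().fit(labels)

y = encoder.transform(["false", "true", "other"])        # integer ids, here [0, 3, 1]
y_onehot = torch.nn.functional.one_hot(
    torch.tensor(y), num_classes=len(encoder.classes_)
).float()                                                # shape (3, 4), floats for BCE-style losses
```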
1
vote
1 answer
Load pytorch model with correct args from files
Having followed Chris McCormick's tutorial for creating a BERT Fake News Detector (link here), at the end he saves the PyTorch model using the following code:
output_dir = './model_save/'
if not os.path.exists(output_dir):
…

You_Donut
- 155
- 8
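Assuming the tutorial's save_pretrained() calls were used, reloading the model with the correct arguments is a matter of pointing from_pretrained() at the same directory, since the saved config.json carries the architecture arguments:

```python
# Sketch: reloading a model saved with save_pretrained() into the tutorial's output_dir
from transformers import BertForSequenceClassification, BertTokenizer

output_dir = './model_save/'
model = BertForSequenceClassification.from_pretrained(output_dir)   # reads config.json + weights
tokenizer = BertTokenizer.from_pretrained(output_dir)               # reads the saved vocab
model.eval()                                                        # inference mode
```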
1
vote
1 answer
BERTweet throws an error when input is relatively long
I am using Hugging Face's BERTweet implementation (https://huggingface.co/docs/transformers/model_doc/bertweet); I want to encode some tweets and forward them for further processing (predictions). The problem is that when I try to encode a…

Petar
- 11
- 1
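BERTweet was pre-trained with a 128-token limit, so inputs longer than that have to be truncated (or split) before encoding. A minimal sketch, with an artificial over-long tweet:

```python
# Sketch: truncating tweets to BERTweet's maximum length before encoding
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base", use_fast=False)
model = AutoModel.from_pretrained("vinai/bertweet-base")

tweet = "a very long tweet " * 50                      # deliberately exceeds the limit
inputs = tokenizer(tweet, truncation=True, max_length=128, return_tensors="pt")
with torch.no_grad():
    features = model(**inputs).last_hidden_state       # per-token encodings
```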
1
vote
0 answers
How to present in a fairly simple manner how RoBERTa acquires new knowledge about a downstream task
My thesis defense is coming up next week, and I wanted to have your take on an issue I'm currently facing. One of my thesis contributions is "Adapting RoBERTa to the task of rumor detection on Twitter"
I want to explain to the jury how RoBERTa can…

Hamda Slimi
- 45
- 4
1
vote
0 answers
Want to fine-tune pretrained RoBERTa, from Huggingface, on my own data for text summarization
I am a beginner in this. Please help me in getting a solution.
I have used the RobertaTokenizerFast to tokenize the text and summary (max_token_length 200 and 50 respectively).
The plan is to use RoBERTa as the first layer. Then condense its output…

rana
- 69
- 1
- 8
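Since RoBERTa on its own is an encoder, one common way to get a summarization model out of it is to pair two RoBERTa checkpoints in an encoder-decoder wrapper and fine-tune that on (text, summary) pairs. The sketch below uses the 200/50 token budget mentioned in the excerpt; everything else (the example texts, warm-starting both sides from roberta-base) is an assumption, and the training loop is omitted.

```python
# Sketch: wrapping RoBERTa in an encoder-decoder for summarization
from transformers import EncoderDecoderModel, RobertaTokenizerFast

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = EncoderDecoderModel.from_encoder_decoder_pretrained("roberta-base", "roberta-base")

# The decoder needs to know which ids start/pad generated sequences
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

article = "some long domain text ..."
summary = "a short reference summary"
inputs = tokenizer(article, truncation=True, max_length=200, return_tensors="pt")
labels = tokenizer(summary, truncation=True, max_length=50, return_tensors="pt").input_ids

loss = model(**inputs, labels=labels).loss   # fine-tune by minimising this
```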
1
vote
2 answers
_batch_encode_plus() got an unexpected keyword argument 'return_attention_masks'
I am studying the RoBERTa model to detect emotions in tweets.
On Google colab. Following this Noteboook file from Kaggle - https://www.kaggle.com/ishivinal/tweet-emotions-analysis-using-lstm-glove-roberta?scriptVersionId=38608295
Code snippet:
def…

Shubhasmita Roy
- 39
- 1
- 7
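The keyword argument was renamed in later transformers releases, so the notebook's return_attention_masks (plural) needs to become return_attention_mask (singular). A small sketch with made-up inputs:

```python
# Sketch: the keyword is singular in current transformers releases
from transformers import RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
enc = tokenizer.batch_encode_plus(
    ["i love this", "this is awful"],
    return_attention_mask=True,     # was return_attention_masks in the old notebook
    padding="max_length",
    max_length=64,
    truncation=True,
)
```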
1
vote
1 answer
Why does BPE encoding trained on English and applied to Bengali not return unknown tokens?
I use the roberta-base tokenizer, tokenizer = RobertaTokenizerFast.from_pretrained('roberta-base', add_prefix_space=True), trained on English data, to tokenize Bengali just to see how it behaves. When I try to encode a Bengali character…

Soumya
- 87
- 1
- 2
- 15
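This is expected with RoBERTa's byte-level BPE: any string, Bengali included, decomposes into raw bytes, and every byte has an entry in the vocabulary, so the <unk> token is never produced. A quick check (the Bengali word is just an example):

```python
# Sketch: byte-level BPE falls back to raw bytes, so no character is ever "unknown"
from transformers import RobertaTokenizerFast

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base", add_prefix_space=True)

tokens = tokenizer.tokenize("আমি")          # a Bengali word
print(tokens)                               # byte-level fragments, not <unk>
print(tokenizer.unk_token in tokens)        # False: every byte has a vocab entry
```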
1
vote
1 answer
Using a roberta model, cannot call model.compile or model.summary
Using a roberta model for sentiment analysis, I cannot call model.compile or model.summary:
from transformers import RobertaTokenizer, RobertaForSequenceClassification
from transformers import BertConfig
tokenizer =…

Sherouk Adel
- 13
- 2
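.compile() and .summary() are Keras methods, so they only exist on the TensorFlow variants of the transformers models; RobertaForSequenceClassification is a PyTorch module. A sketch of the TF equivalent, with an illustrative optimizer, loss, and label count:

```python
# Sketch: using the TF variant so Keras methods like compile()/summary() are available
import tensorflow as tf
from transformers import RobertaTokenizer, TFRobertaForSequenceClassification

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = TFRobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
model.summary()
```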
1
vote
0 answers
How to use a fine-tuned BERT model for other tasks?
I fine-tuned a BERT (or RoBERTa) model for sequence classification. Can I fine-tune the same model for a different task (QA or sentiment analysis)?

Nilou
- 145
- 2
- 10
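One approach, sketched under the assumption that the fine-tuned classifier was saved with save_pretrained(): load the checkpoint into a different task-specific class, which reuses the shared encoder weights and initialises a fresh head that then needs fine-tuning on the new task. The checkpoint path is a placeholder.

```python
# Sketch: reusing a fine-tuned checkpoint's encoder under a different task head
# ("./my-seqcls-checkpoint" is a placeholder for the saved classification model)
from transformers import AutoModelForQuestionAnswering

# The shared encoder weights are loaded; the QA head is freshly initialised,
# so the model still needs fine-tuning on QA data before it is useful.
qa_model = AutoModelForQuestionAnswering.from_pretrained("./my-seqcls-checkpoint")
```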