Questions tagged [roberta-language-model]

64 questions
1
vote
1 answer

IndexError: index out of range in self when trying to fine-tune a Roberta model after adding special tokens

I am trying to fine-tune a Roberta model after adding some special tokens to its tokenizer: special_tokens_dict = {'additional_special_tokens': ['[Tok1]','[Tok2]']} tokenizer.add_special_tokens(special_tokens_dict) I get this error when I…
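A common cause of this error is that the model's embedding matrix is not resized after new tokens are added, so the new token ids fall outside the embedding table. A minimal sketch of that fix, assuming a standard transformers fine-tuning setup (the model class here is illustrative):

```python
from transformers import RobertaTokenizer, RobertaForSequenceClassification

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained("roberta-base")

special_tokens_dict = {"additional_special_tokens": ["[Tok1]", "[Tok2]"]}
tokenizer.add_special_tokens(special_tokens_dict)

# Grow the embedding table to cover the newly added token ids;
# without this, those ids trigger "IndexError: index out of range in self".
model.resize_token_embeddings(len(tokenizer))
```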
1
vote
2 answers

Error when loading torch.hub.load('pytorch/fairseq', 'roberta.large.mnli') on AWS EC2

I'm trying to run some code using Torch (and the Roberta language model) on an EC2 instance on AWS. The compilation seems to fail; does anyone have a pointer to a fix? Confirm that Torch is correctly installed: import torch a = torch.rand(5,3) print…
Ahmet • 802 • 1 • 5 • 18
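The excerpt's verification snippet, reconstructed as a runnable sketch (the hub entry point is the one named in the title):

```python
import torch

# Confirm that Torch is correctly installed before pulling the hub model.
a = torch.rand(5, 3)
print(a)

# Download RoBERTa-large fine-tuned on MNLI from the fairseq hub entry.
roberta = torch.hub.load('pytorch/fairseq', 'roberta.large.mnli')
roberta.eval()  # disable dropout for evaluation
```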
1
vote
1 answer

What is the right input / shape for training a pretrained RoBERTa?

Right now I am trying to train/fine-tune a pretrained RoBERTa model with a multiple-choice head, but I am having difficulty finding the right input shape so my model is able to train/fine-tune. The dataframe I have right now looks like this: With the 3 options…
Sam V • 479 • 1 • 4 • 11
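A minimal sketch of the expected input shape, assuming RobertaForMultipleChoice is the head in use; the prompt and option texts are hypothetical:

```python
import torch
from transformers import RobertaTokenizer, RobertaForMultipleChoice

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForMultipleChoice.from_pretrained("roberta-base")

prompt = "The weather today is"
options = ["sunny", "a banana", "seventeen"]

# Pair the prompt with each option; padding keeps the three sequences aligned.
enc = tokenizer([prompt] * len(options), options,
                return_tensors="pt", padding=True)

# unsqueeze(0) adds the batch dimension: (batch, num_choices, seq_len).
outputs = model(input_ids=enc["input_ids"].unsqueeze(0),
                attention_mask=enc["attention_mask"].unsqueeze(0),
                labels=torch.tensor([0]))
print(outputs.logits.shape)  # (1, 3): one score per option
```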
1
vote
1 answer

How to give a list of integers as input in a TensorFlow Dataset?

We are trying to fine-tune/train a pretrained RoBERTa model using TensorFlow. For this we have to create a tf.data.Dataset from our dataframe. The dataframe looks like this: Where the three options are encoded strings, and the answer is an integer…
Sam V • 479 • 1 • 4 • 11
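A minimal sketch, assuming the token-id lists have already been padded to equal length; the arrays below are hypothetical stand-ins for the dataframe columns:

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-ins: three encoded options per example, one integer answer.
input_ids = np.array([[[1, 5, 9], [1, 6, 9], [1, 7, 9]],
                      [[1, 2, 9], [1, 3, 9], [1, 4, 9]]])  # (examples, 3 options, seq_len)
answers = np.array([0, 2])

# from_tensor_slices slices along the first dimension, so each dataset element
# is one (options, answer) pair; batching restores the batch dimension.
dataset = tf.data.Dataset.from_tensor_slices((input_ids, answers)).batch(2)
for x, y in dataset:
    print(x.shape, y.shape)  # (2, 3, 3) (2,)
```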
1
vote
0 answers

Using WordPiece tokenization with RoBERTa

As far as I understand, the RoBERTa model implemented in the huggingface library uses a BPE tokenizer. Here is the link to the documentation: RoBERTa has the same architecture as BERT, but uses a byte-level BPE as a tokenizer (same as GPT-2) and…
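For context, a minimal sketch contrasting the two schemes: RoBERTa checkpoints ship with byte-level BPE vocabularies, so a WordPiece tokenizer only makes sense for a model pretrained from scratch with a matching vocabulary. The corpus file below is hypothetical:

```python
import os
from transformers import RobertaTokenizer
from tokenizers import BertWordPieceTokenizer

# Default for RoBERTa: byte-level BPE, the same scheme GPT-2 uses.
bpe = RobertaTokenizer.from_pretrained("roberta-base")
print(bpe.tokenize("unbelievable"))  # byte-level BPE pieces

# Training a WordPiece vocabulary for a from-scratch model instead.
os.makedirs("wordpiece-vocab", exist_ok=True)
wp = BertWordPieceTokenizer(lowercase=True)
wp.train(files=["corpus.txt"], vocab_size=30_000)  # hypothetical corpus
wp.save_model("wordpiece-vocab")                   # writes vocab.txt
```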
1
vote
2 answers

Segmentation fault when importing sentence_transformers in Azure Machine Learning Service Nvidia Compute

I would like to use sentence_transformers in AML to run the XLM-RoBERTa model for sentence embeddings. I have a script in which I import sentence_transformers: from sentence_transformers import SentenceTransformer Once I run my AML pipeline, the run…
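A minimal sketch of the intended usage once the import succeeds (the segfault itself is typically an environment or binary-compatibility issue rather than a code bug); the checkpoint name is one public XLM-R sentence-embedding model and is an assumption here:

```python
from sentence_transformers import SentenceTransformer

# Hypothetical choice of XLM-R based sentence-embedding checkpoint.
model = SentenceTransformer("stsb-xlm-r-multilingual")
embeddings = model.encode(["A sample sentence.", "Another sentence."])
print(embeddings.shape)  # (2, embedding_dim)
```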
1
vote
0 answers

Load Roberta model with all weights

I load the Roberta model with TFRobertaModel.from_pretrained('roberta-base') and train it using Keras. I have other layers on top of Roberta and I need to initialize the bare Roberta with all parameters. I run my code on Colab, and since a few…
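A minimal sketch of loading the bare encoder with all pretrained weights and adding Keras layers on top; the sequence length and head sizes are illustrative only:

```python
import tensorflow as tf
from transformers import TFRobertaModel

# All pretrained encoder weights are restored by from_pretrained.
roberta = TFRobertaModel.from_pretrained("roberta-base")

input_ids = tf.keras.Input(shape=(128,), dtype=tf.int32, name="input_ids")
attention_mask = tf.keras.Input(shape=(128,), dtype=tf.int32, name="attention_mask")

# Take the <s>-position hidden state as a pooled representation.
hidden = roberta(input_ids, attention_mask=attention_mask)[0][:, 0, :]
output = tf.keras.layers.Dense(2, activation="softmax")(hidden)

model = tf.keras.Model(inputs=[input_ids, attention_mask], outputs=output)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```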
1
vote
1 answer

What makes BertGeneration and/or RobertaForCausalLM causal models? Where does the causal attention masking happen?

I am trying to use RobertaForCausalLM and/or BertGeneration for causal language modelling / next-word prediction / left-to-right prediction. I can't seem to figure out where the causal masking happens. I want to train with teacher forcing with the…
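A minimal sketch of the usual setup: config.is_decoder=True is what switches the attention mask to a causal (lower-triangular) one, so each position attends only to earlier positions, and passing labels gives the shifted next-token loss used for teacher forcing:

```python
from transformers import RobertaConfig, RobertaForCausalLM, RobertaTokenizer

config = RobertaConfig.from_pretrained("roberta-base")
config.is_decoder = True  # without this, the model attends bidirectionally

model = RobertaForCausalLM.from_pretrained("roberta-base", config=config)
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")

inputs = tokenizer("The capital of France is", return_tensors="pt")
# labels == input_ids yields the standard shifted next-token (teacher forcing) loss.
outputs = model(**inputs, labels=inputs["input_ids"])
print(outputs.loss)
```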
1
vote
0 answers

Colab: model = torch.hub.load('pytorch/fairseq', 'roberta.large') error

I tried the code below. import torch model = torch.hub.load('pytorch/fairseq', 'roberta.large') I got the error below. I have already searched the tutorial Colab for RoBERTa; however, it does not work…
Pooh • 71 • 1 • 5
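A minimal debugging sketch for this kind of hub failure: list the entry points the fairseq hub actually exposes before loading, and force a fresh download in case a partially cached checkpoint is the culprit (both are standard torch.hub options):

```python
import torch

# Confirm that 'roberta.large' is among the published entry points.
print(torch.hub.list('pytorch/fairseq'))

# force_reload=True bypasses a possibly stale local cache.
model = torch.hub.load('pytorch/fairseq', 'roberta.large', force_reload=True)
model.eval()
```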
1
vote
1 answer

How can I check the loss when training RoBERTa in huggingface/transformers?

I trained a RoBERTa model from scratch using transformers, but I can't check the training loss during training using https://colab.research.google.com/github/huggingface/blog/blob/master/notebooks/01_how_to_train.ipynb In the notebook, loss is…
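A minimal sketch, assuming the Trainer setup from that notebook; `model` and `dataset` stand in for the objects built in the notebook's earlier cells, and lowering logging_steps makes the loss appear during training rather than only at the end:

```python
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./roberta-from-scratch",
    num_train_epochs=1,
    per_device_train_batch_size=16,
    logging_steps=50,  # report the running loss every 50 optimizer steps
)

trainer = Trainer(
    model=model,            # the RobertaForMaskedLM built earlier in the notebook
    args=training_args,
    train_dataset=dataset,  # the tokenized dataset from the notebook
)
trainer.train()  # loss now shows up in the step-by-step log output
```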
0
votes
0 answers

Facing issues when loading from a checkpoint in PyTorch Lightning

I'm trying to use the methodology below to load my checkpoint; however, it throws IsADirectoryError when I pass the ckpt path as shown in the documentation. Here's my code: def main(): # init lightning classes …
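A minimal sketch of the usual cause: load_from_checkpoint expects the path to a .ckpt file, and passing the checkpoint directory instead raises IsADirectoryError. The class and paths here are hypothetical:

```python
import pytorch_lightning as pl

class LitModel(pl.LightningModule):
    ...  # the LightningModule defined in main()

# Wrong: a directory raises IsADirectoryError.
# model = LitModel.load_from_checkpoint("checkpoints/")

# Right: point at the concrete checkpoint file inside that directory.
model = LitModel.load_from_checkpoint("checkpoints/epoch=4-step=5000.ckpt")
```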
0
votes
0 answers

Masked language model initialized with RobertaForMaskedLM is missing the GELU activation layer

I am trying to train a masked language model from scratch. I use the code below to create the Roberta model architecture, but when I compare it with RobertaForMaskedLM, I find it does not have the GELU activation layer. Could someone help explain how to…
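A likely explanation, shown as a minimal sketch: in the transformers implementation the GELU is applied functionally inside RobertaLMHead's forward pass rather than registered as a submodule, so it does not appear when the module tree is printed:

```python
from transformers import RobertaConfig, RobertaForMaskedLM

config = RobertaConfig(vocab_size=30_000)  # illustrative config
model = RobertaForMaskedLM(config)

# Printing shows dense -> layer_norm -> decoder; the gelu between dense and
# layer_norm is a function call in forward(), not an nn.Module, so it is
# invisible here even though it is applied at runtime.
print(model.lm_head)
```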
0
votes
1 answer

Loading a local tokenizer

I'm trying to load a local tokenizer using: from transformers import RobertaTokenizerFast tokenizer = RobertaTokenizerFast.from_pretrained(r'file path\tokenizer') however, this gives me the following error: OSError: Can't load tokenizer for 'file…
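A minimal sketch of the round trip that makes the local load work: from_pretrained needs a directory containing the files that save_pretrained writes out (vocab.json, merges.txt, and the tokenizer config); the path below is hypothetical:

```python
from transformers import RobertaTokenizerFast

# Write the tokenizer files to a local directory first.
tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
tokenizer.save_pretrained("path/to/tokenizer")  # vocab.json, merges.txt, ...

# Reloading from that same directory now succeeds.
tokenizer = RobertaTokenizerFast.from_pretrained("path/to/tokenizer")
```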
0
votes
0 answers

SimpleTransformers: I am getting the same result using the RoBERTa and BERT models

I am new to simpletransformers and NLP in general. I am currently working on a project in which I need to collect a number of values based on a model that I train with the simpletransformers library. Currently, the models I am working with are…
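A minimal sketch of how the two architectures are selected in simpletransformers: identical results often mean the same (model_type, model_name) pair was passed twice. The training calls are left commented since the dataframe is not shown:

```python
from simpletransformers.classification import ClassificationModel

# The (model_type, model_name) pair picks the architecture and checkpoint.
bert_model = ClassificationModel("bert", "bert-base-uncased", use_cuda=False)
roberta_model = ClassificationModel("roberta", "roberta-base", use_cuda=False)

# Train and evaluate each separately on the same dataframe (hypothetical):
# bert_model.train_model(train_df)
# roberta_model.train_model(train_df)
```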
0
votes
1 answer

Getting an embedded output from huggingface transformers

To compare different paragraphs, I am trying to use a transformer model: I feed each paragraph into the model, and in the end I intend to compare the outputs and see which paragraphs are most similar. For this purpose, I am using…
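One common recipe, as a minimal sketch: mean-pool the last hidden state into a fixed-size vector per paragraph, then compare vectors by cosine similarity (the pooling choice is an assumption, not necessarily the asker's method):

```python
import torch
from transformers import RobertaTokenizer, RobertaModel

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaModel.from_pretrained("roberta-base")
model.eval()

def embed(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)            # mean pooling -> (768,)

a, b = embed("First paragraph."), embed("Second paragraph.")
print(torch.cosine_similarity(a, b, dim=0).item())  # 1.0 = most similar
```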