Questions tagged [language-model]

266 questions
1
vote
1 answer

Fine-tuning of BERT word embeddings

I would like to load a pre-trained BERT model and fine-tune it, particularly its word embeddings, using a custom dataset. The goal is to use the word embeddings of chosen words for further analysis. It is important to mention that…
Aviade
1
vote
0 answers

Obtaining the Probability of a Sentence using a Language Model

I have trained a language model using the following architecture, model = tf.keras.Sequential([ tf.keras.layers.Embedding(total_words, 300, weights=[embeddings_matrix], input_length=inputs.shape[1],…
Minura Punchihewa
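For a question like this one, the usual starting point is the chain rule: the probability of a sentence is the product of each token's conditional probability given the tokens before it, normally accumulated as a sum of logs. The sketch below is only an illustration under stated assumptions. It uses a toy add-one-smoothed bigram model instead of the Keras model from the excerpt, and the names `train_bigram` and `sentence_log_prob` are hypothetical. With a neural model, the same sum would be taken over the softmax probability assigned to each actual next token.

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count context unigrams and bigrams, with sentence boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens[:-1])             # every token that acts as a context
        bigrams.update(zip(tokens, tokens[1:]))  # (previous, next) pairs
    return unigrams, bigrams

def sentence_log_prob(sentence, unigrams, bigrams, vocab_size):
    """Chain rule: sum of log P(w_i | w_{i-1}), with add-one smoothing."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens, tokens[1:]):
        prob = (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
        total += math.log(prob)
    return total

# Toy data, purely illustrative.
corpus = ["the cat sat", "the dog sat", "the cat ran"]
unigrams, bigrams = train_bigram(corpus)
# Tokens that can be predicted: every word plus the end-of-sentence marker.
vocab = {w for s in corpus for w in s.split()} | {"</s>"}

log_p_seen = sentence_log_prob("the cat sat", unigrams, bigrams, len(vocab))
log_p_odd = sentence_log_prob("sat the cat", unigrams, bigrams, len(vocab))
print(math.exp(log_p_seen), math.exp(log_p_odd))
```

Working in log space avoids numerical underflow on long sentences; exponentiate only at the end if the raw probability is needed.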
1
vote
0 answers

RNN Language Model in PyTorch predicting the same three words repeatedly

I am attempting to create a word-level language model using an RNN in PyTorch. During training, the loss stays about the same across the whole training set, and when I try to sample a new sentence the same three words are predicted in the same…
1
vote
2 answers

Scripts missing for GPT-2 fine-tuning and inference in the Hugging Face GitHub repository?

I am following the documentation on the Hugging Face website, which says that to fine-tune GPT-2 I should use the script run_lm_finetuning.py, and for inference the script run_generation.py. However, both scripts don't…
1
vote
1 answer

How to use HuggingFace nlp library's GLUE for CoLA

I've been trying to use the HuggingFace nlp library's GLUE metric to check whether a given sentence is a grammatical English sentence. But I'm getting an error and am stuck, unable to proceed. What I've tried so far: reference and…
1
vote
1 answer

Fine tuning a pretrained language model with Simple Transformers

In his article 'Language Model Fine-Tuning For Pre-Trained Transformers' Thilina Rajapakse (https://medium.com/skilai/language-model-fine-tuning-for-pre-trained-transformers-b7262774a7ee) provides the following code snippet for fine-tuning a…
1
vote
1 answer

How does masked_lm_labels argument work in BertForMaskedLM?

from transformers import BertTokenizer, BertForMaskedLM import torch tokenizer = BertTokenizer.from_pretrained('bert-base-uncased') model = BertForMaskedLM.from_pretrained('bert-base-uncased') input_ids = torch.tensor(tokenizer.encode("Hello, my…
1
vote
1 answer

Define and use a new smoothing method in NLTK language models

I'm trying to provide and test a new smoothing method for language models. I'm using the NLTK tools and don't want to redefine everything from scratch. So is there any way to define and use my own smoothing method in NLTK models? Edit: I'm trying to do…
Behzad Shayegh
1
vote
0 answers

Trainable USE-lite-based classifier with SentencePiece input

I have heard that it is possible to use the pretrained Universal Sentence Encoder (USE) (neural language model) from TF-hub as part of a trainable model, e.g. a sentence classifier. Some versions of USE rely on SentencePiece sub-word tokenizer,…
tpacker
1
vote
1 answer

GPT-2 language model: multiplying decoder-transformer output with token embedding or another weight matrix

I was reading the code of the GPT-2 language model. The transformation of hidden states into the probability distribution over the vocabulary is done in the following line: lm_logits = self.lm_head(hidden_states) Here, self.lm_head =…
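One arrangement this question may be pointing at is weight tying, where the output projection reuses the token embedding matrix so that each logit is the dot product of the hidden state with one token's embedding row. That is an assumption here, since the excerpt is cut off before the definition of `self.lm_head`. The sketch below uses plain Python lists with made-up numbers, and `tied_logits` and `softmax` are hypothetical helpers, not library functions.

```python
import math

def tied_logits(hidden, embedding):
    """logits[v] = dot(hidden, embedding[v]): the embedding rows double as output weights."""
    return [sum(h * e for h, e in zip(hidden, row)) for row in embedding]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [x / total for x in exps]

# Toy embedding matrix: 3 vocabulary entries, hidden size 2 (illustrative values).
embedding = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]
hidden = [2.0, 3.0]  # final hidden state for one position

logits = tied_logits(hidden, embedding)
probs = softmax(logits)
print(logits, probs)
```

Under tying, the model has no separate output weight matrix to learn; an untied model would instead multiply the hidden state by an independent matrix of the same shape.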
1
vote
1 answer

Embedding index out of range while running the Hugging Face gpt2-xl model

I am trying to run the Hugging Face gpt2-xl model. I ran code from the quickstart page that loads the small gpt2 model and generates text with the following code: from transformers import GPT2LMHeadModel, GPT2Tokenizer import torch tokenizer =…
user3363813
1
vote
0 answers

Word2vec: what does it mean that the projection layer is shared?

My professor's slides compare the "Neural Net Language Model" (Bengio et al., 2003) with Google's word2vec (Mikolov et al., 2013). They say that, unlike Bengio's model, in word2vec "the projection layer is shared (not just the…
robertspierre
1
vote
0 answers

Next-word prediction RNN

This is my second post. I am really sorry if I sound awkward. I am new to machine learning. Reporting the question or giving a negative point will help me. Again, I am sorry for being unable to make my question clear. Now, coming to my question, I am working on…
1
vote
1 answer

How to deal with large vocab_size when training a Language Model in Keras?

I want to train a language model in Keras, following this tutorial: https://machinelearningmastery.com/develop-word-based-neural-language-models-python-keras/ My input is composed of: lines num: 4823744; maximum line: 20; Vocabulary Size: 790609; Total…
jonb
1
vote
1 answer

How does ULMFiT's language model work when applied to a text classification problem?

I have been playing around with ULMFiT a lot lately and still cannot wrap my head around how the language model’s ability to make sound predictions about the next word affects the classification of texts. I guess my real problem is that I do not…
BigHead