Questions tagged [language-model]

266 questions
2
votes
1 answer

Spacy Dutch noun_phrases returns empty list using nl_core_news_sm

I want to extract the noun_phrases of a Dutch text using spaCy's nl_core_news_sm model. It returns an empty list. On the other hand, the equivalent English model en_core_web_sm does provide the list of noun_chunks (noun_phrases). Is this normal…
JFerro
  • 3,203
  • 7
  • 35
  • 88
2
votes
0 answers

Why word-level language model should help in beam search decoding in ASR?

I was experimenting with beam search decoding of an acoustic model trained with CTC loss on an automatic speech recognition task. The version I was using was based on this paper. However, even though many sources describe integration of…
2
votes
1 answer

padding and attention mask does not work as intended in batch input in GPT language model

The following code is without batch: from transformers import GPT2LMHeadModel, GPT2Tokenizer import torch tokenizer = GPT2Tokenizer.from_pretrained("gpt2") model =…
user3363813
  • 567
  • 1
  • 5
  • 19
2
votes
2 answers

Size of the training data of GPT2-XL pre-trained model

In Hugging Face Transformers, it is possible to use the pre-trained GPT2-XL language model, but I can't find which dataset it was trained on. Is it the same trained model that OpenAI used for their paper (trained on a 40GB dataset called WebText)?
user3363813
  • 567
  • 1
  • 5
  • 19
2
votes
1 answer

Pre-training BERT/RoBERTa language model using domain text, how long it gonna take estimately? which is faster?

I want to pre-train BERT and RoBERTa with MLM on a domain corpus (sentiment-related text). Roughly how long will it take with 50k~100k words? Since RoBERTa is not trained on the next-sentence prediction objective, it has one training objective fewer than BERT and…
2
votes
2 answers

spaCy can't load model ONLY when calling "rasa train"

I'm training a Rasa model via the command line, but spaCy seems unable to load my language model pt_core_news_sm only when I train via the terminal. Everything is done inside my venv and executed as admin. I can load the model when calling spaCy…
Alisson Correa
  • 357
  • 2
  • 8
2
votes
1 answer

How to get words from output of XLNet using Transformers library

I am using Hugging Face's Transformers library to work with different NLP models. The following code does masking with XLNet. It outputs a tensor of numbers. How do I convert the output back to words? import torch from transformers import…
2
votes
1 answer

Ngrams from Tensorflow TextLineDataset

I have a text file containing one sentence per line. When I create a TextLineDataset and iterate over it with an iterator, it returns the file line by line. I want to iterate through my file two tokens at a time. Here's my current code: sentences =…
Valentin Macé
  • 1,150
  • 1
  • 10
  • 25
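As an illustration of what "two tokens at a time" could mean here, the following is a minimal plain-Python sketch that yields overlapping bigrams per line. It does not use TensorFlow or TextLineDataset; the function name `bigrams_per_line` is hypothetical, whitespace tokenization is an assumption, and the question may instead want non-overlapping pairs.

```python
def bigrams_per_line(lines):
    """Yield overlapping (token, next_token) pairs, one line at a time.

    Assumes whitespace tokenization; pairs never cross a line boundary.
    """
    for line in lines:
        tokens = line.split()
        # zip stops at the shorter sequence, so a one-token line yields nothing
        yield from zip(tokens, tokens[1:])


if __name__ == "__main__":
    sample = ["the cat sat", "hello world"]
    for pair in bigrams_per_line(sample):
        print(pair)
```

Because the function accepts any iterable of strings, an open file object can be passed directly, so the file is still read line by line.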
2
votes
0 answers

How to properly initialize the hidden state at first time step of a LSTM decoder in Keras

I am currently implementing the attr2seq model described in this paper by Dong et al. (2018) in Keras, and I got completely stuck at initializing the hidden vectors at the first time step of the LSTM decoder using the encoded attribute vectors $a$…
2
votes
1 answer

Calculate perplexity of word2vec model

I trained a Gensim W2V model on 500K sentences (around 60K words) and I want to calculate the perplexity. What is the best way to do so? For 60K words, how can I check what a proper amount of data would be? Thanks
oren_isp
  • 729
  • 1
  • 7
  • 22
2
votes
1 answer

get next word from bigram model on max probability

I want to generate sonnets using nltk with bigrams. I have generated the bigrams, computed the probability of each one, and stored them in a defaultdict like this: [('"Let', defaultdict(.. at0x1a17f98bf8>, {'the':…
shahid hamdam
  • 751
  • 1
  • 10
  • 24
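For readers unfamiliar with the setup, picking the next word by maximum bigram probability can be sketched as below. This is a minimal standard-library illustration, not the asker's code: the helper names `build_bigram_probs` and `most_likely_next` and the toy sentence are hypothetical, and the probabilities are plain relative frequencies with no smoothing.

```python
from collections import defaultdict


def build_bigram_probs(tokens):
    """Map each word to {next_word: P(next_word | word)} by relative frequency."""
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    probs = {}
    for prev, followers in counts.items():
        total = sum(followers.values())
        probs[prev] = {word: c / total for word, c in followers.items()}
    return probs


def most_likely_next(probs, word):
    """Return the follower with the highest probability, or None if unseen."""
    followers = probs.get(word)
    if not followers:
        return None
    return max(followers, key=followers.get)


if __name__ == "__main__":
    tokens = "let the bird sing and let the bird fly and let the wind blow".split()
    probs = build_bigram_probs(tokens)
    print(most_likely_next(probs, "the"))
```

Always taking the argmax makes generation deterministic and prone to loops; sampling from the follower distribution is a common alternative when variety matters.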
2
votes
1 answer

In the context of recurrent neural networks, what is the meaning of 'conditioned on something'?

In recurrent neural networks (RNNs), for example in the paper Sequence to Sequence Learning with Neural Networks, it says that the RNN language model is conditioned on the input sequence (line 7 of paragraph 3 in the Introduction). So, what is the…
2
votes
1 answer

do searching in a very big ARPA file in a very short time in java

I have an ARPA file which is almost 1 GB. I have to search it in less than 1 minute. I have searched a lot, but I have not found a suitable answer yet. I think I do not have to read the whole file. I just have to jump to a specific line…
sepanta
  • 27
  • 2
2
votes
1 answer

any way to combine 2 ngram language model into 1?

I have 2 n-gram language models (model_A and model_B). They are trained on different corpora, so their vocabularies differ. They are smoothed with backoff and stored in ARPA format, so I have 2 ARPA files, ARPA_A and ARPA_B. Now if I want to…
kakamilan
  • 21
  • 2
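One common way to combine two n-gram models, offered here only as a sketch and not as the answer given to this question, is linear interpolation. The mixture weight $\lambda$ is an assumption and would normally be tuned on held-out data; $h$ denotes the n-gram history.

```latex
P_{\text{mix}}(w \mid h) = \lambda \, P_A(w \mid h) + (1 - \lambda) \, P_B(w \mid h),
\qquad 0 \le \lambda \le 1
```

Since each model is a proper distribution over $w$, the mixture also sums to 1. How words that appear in only one model's vocabulary are handled is left open here and depends on each model's unknown-word treatment.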
2
votes
1 answer

How to predict word using trained CBOW

I have a question about CBOW prediction. Suppose my job is to use 3 surrounding words w(t-3), w(t-2), w(t-1) as input to predict one target word w(t). Once the model is trained, I want to predict a missing word after a sentence. Does this model…
Bratt Swan
  • 1,068
  • 3
  • 16
  • 28