Questions tagged [language-model]
266 questions
2
votes
1 answer
Spacy Dutch noun_phrases returns empty list using nl_core_news_sm
I want to extract the noun_phrases of a Dutch text using spaCy's nl_core_news_sm model.
It returns an empty list.
On the other hand, the equivalent English model en_core_web_sm does return the list of noun_chunks (noun_phrases).
Is this normal…

JFerro
- 3,203
- 7
- 35
- 88
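For the question above, a minimal sketch of how noun_chunks are normally read off a spaCy Doc, assuming both models are already downloaded; whether the Dutch pipeline yields any chunks depends on the model and spaCy version providing a dependency parse and a syntax iterator:

import spacy

# Assumes the models were installed beforehand, e.g.
#   python -m spacy download nl_core_news_sm
#   python -m spacy download en_core_web_sm
nl = spacy.load("nl_core_news_sm")
en = spacy.load("en_core_web_sm")

doc_nl = nl("De snelle bruine vos springt over de luie hond.")
doc_en = en("The quick brown fox jumps over the lazy dog.")

# noun_chunks is a generator built from the dependency parse plus a
# language-specific syntax iterator; without one it yields nothing or
# raises, depending on the spaCy version.
print([chunk.text for chunk in doc_nl.noun_chunks])
print([chunk.text for chunk in doc_en.noun_chunks])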
2
votes
0 answers
Why should a word-level language model help in beam search decoding for ASR?
I was experimenting with beam search decoding of an acoustic model trained with CTC loss on an automatic speech recognition task. The version I was using was based on this paper.
However, even though many sources describe integration of…

JAV
- 279
- 2
- 9
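For the question above, a rough sketch of the usual way an external word-level LM enters CTC beam search ("shallow fusion"): each hypothesis is rescored with the acoustic log-probability plus a weighted LM log-probability and a word-insertion bonus. Names and weights below are illustrative, not taken from the paper the asker refers to; lm is assumed to be any object exposing a log_prob(words) method.

def rescore(ctc_log_prob, lm, words, alpha=0.5, beta=1.5):
    # ctc_log_prob: log P(transcript | audio) from the CTC decoder
    # alpha: LM weight, beta: word-insertion bonus (both tuned on dev data)
    return ctc_log_prob + alpha * lm.log_prob(words) + beta * len(words)

Without the LM term the decoder tends to prefer acoustically plausible but ungrammatical word sequences; the word-level LM re-ranks the beam toward likely word sequences.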
2
votes
1 answer
Padding and attention mask do not work as intended for batch input in GPT language model
The following code works without batching:
from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model =…

user3363813
- 567
- 1
- 5
- 19
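For the batching question above, a minimal sketch of how padded GPT-2 batches are commonly built; GPT-2 ships without a pad token, so EOS is reused and padding is put on the left so the last real token of each prompt stays at the end of its row:

from transformers import GPT2LMHeadModel, GPT2Tokenizer
import torch

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# GPT-2 has no pad token by default; reuse EOS and pad on the left.
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"

batch = tokenizer(["Hello, my dog", "The weather today is"],
                  return_tensors="pt", padding=True)

with torch.no_grad():
    out = model(input_ids=batch["input_ids"],
                attention_mask=batch["attention_mask"])
print(out.logits.shape)  # (batch_size, sequence_length, vocab_size)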
2
votes
2 answers
Size of the training data of GPT2-XL pre-trained model
In Hugging Face Transformers it is possible to use the pre-trained GPT2-XL language model, but I can't find which dataset it was trained on. Is it the same model OpenAI used for their paper (trained on the 40 GB dataset called WebText)?

user3363813
- 567
- 1
- 5
- 19
2
votes
1 answer
Pre-training a BERT/RoBERTa language model on domain text: roughly how long will it take, and which is faster?
I want to pre-train BERT and RoBERTa MLMs on a domain corpus (sentiment-related text). Roughly how long will that take for 50k~100k words? Since RoBERTa is not trained on the next-sentence-prediction objective, it has one training objective fewer than BERT and…

Cass Zhao
- 43
- 4
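Training time for the question above depends mainly on hardware, corpus size, and sequence length, so no estimate is attempted here; below is only a rough sketch of a Transformers masked-LM setup, with file names and hyperparameters as placeholders (LineByLineTextDataset is an older helper that recent versions deprecate in favour of the datasets library):

from transformers import (RobertaForMaskedLM, RobertaTokenizerFast,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments, LineByLineTextDataset)

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base")

# "domain.txt" is a placeholder: one sentence of domain text per line.
dataset = LineByLineTextDataset(tokenizer=tokenizer,
                                file_path="domain.txt", block_size=128)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                           mlm=True, mlm_probability=0.15)

args = TrainingArguments(output_dir="mlm-out", num_train_epochs=3,
                         per_device_train_batch_size=16)
trainer = Trainer(model=model, args=args,
                  data_collator=collator, train_dataset=dataset)
trainer.train()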
2
votes
2 answers
spaCy can't load model ONLY when calling "rasa train"
I'm training a rasa model via the command line, but spaCy seems unable to load my language model pt_core_news_sm only when I try to train via the terminal.
Everything is done inside my venv and executed as admin;
I can load the model when calling spaCy…

Alisson Correa
- 357
- 2
- 8
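For the question above, a small diagnostic sketch that can be run from the same shell that launches rasa train, to check that the interpreter in use actually has the model installed (a common cause of a model loading in one context but not another):

import sys
import spacy

print(sys.executable)                            # which Python is running
print(spacy.util.is_package("pt_core_news_sm"))  # is the model installed here?
nlp = spacy.load("pt_core_news_sm")
print(nlp.pipe_names)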
2
votes
1 answer
How to get words from output of XLNet using Transformers library
I am using Hugging Face's Transformers library to work with different NLP models. The following code does masking with XLNet. It outputs a tensor of numbers. How do I convert the output back to words?
import torch
from transformers import…

Parth mehta
- 1,468
- 2
- 23
- 33
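For the question above, the tensor the model returns contains token ids (or logits over ids), and the tokenizer maps ids back to strings. A minimal sketch, with the masking setup simplified relative to the original question:

import torch
from transformers import XLNetTokenizer, XLNetLMHeadModel

tokenizer = XLNetTokenizer.from_pretrained("xlnet-base-cased")
model = XLNetLMHeadModel.from_pretrained("xlnet-base-cased")

inputs = tokenizer("The capital of France is <mask> .", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits        # shape: (1, seq_len, vocab_size)

# Take the highest-scoring id at each position, then map ids back to text.
predicted_ids = logits.argmax(dim=-1)[0]
print(tokenizer.convert_ids_to_tokens(predicted_ids.tolist()))
print(tokenizer.decode(predicted_ids.tolist()))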
2
votes
1 answer
Ngrams from Tensorflow TextLineDataset
I have a text file containing one sentence per line.
When I create a TextLineDataset and iterate over it with an iterator, it returns the file line by line.
I want to iterate through my file two tokens at a time; here's my current code:
sentences =…

Valentin Macé
- 1,150
- 1
- 10
- 25
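For the question above, a minimal sketch (TF 2.x assumed, "sentences.txt" is a placeholder) that splits each line into tokens and joins consecutive pairs into bigram strings:

import tensorflow as tf

dataset = tf.data.TextLineDataset("sentences.txt")

# Tokenize each line, then build bigrams ("two tokens at a time").
bigrams = dataset.map(
    lambda line: tf.strings.ngrams(tf.strings.split(line), ngram_width=2))

for b in bigrams.take(2):
    print(b.numpy())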
2
votes
0 answers
How to properly initialize the hidden state at the first time step of an LSTM decoder in Keras
I am currently implementing the attr2seq model described in this paper by Dong et al. (2018) in Keras, and I got completely stuck at initializing the hidden vectors at the first time step of the LSTM decoder using the encoded attribute vectors $a$…

user2566415
- 85
- 8
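For the question above, not the attr2seq paper's exact architecture, just the usual Keras pattern for seeding an LSTM's initial hidden and cell states from another vector; all dimensions are placeholders:

from tensorflow import keras
from tensorflow.keras import layers

attr_dim, hidden_dim, vocab_size, emb_dim = 64, 256, 10000, 128

# Map the encoded attribute vector a to the decoder's initial states.
attr_input = keras.Input(shape=(attr_dim,))
h0 = layers.Dense(hidden_dim, activation="tanh")(attr_input)
c0 = layers.Dense(hidden_dim, activation="tanh")(attr_input)

dec_input = keras.Input(shape=(None,))
emb = layers.Embedding(vocab_size, emb_dim)(dec_input)
dec_out = layers.LSTM(hidden_dim, return_sequences=True)(
    emb, initial_state=[h0, c0])
logits = layers.Dense(vocab_size)(dec_out)

model = keras.Model([attr_input, dec_input], logits)
model.summary()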
2
votes
1 answer
Calculate perplexity of word2vec model
I trained a Gensim W2V model on 500K sentences (around 60K words) and I want to calculate its perplexity.
What would be the best way to do so?
For 60K words, how can I check what a proper amount of data would be?
Thanks

oren_isp
- 729
- 1
- 7
- 22
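On the question above: perplexity is defined for models that assign a probability to held-out text, and word2vec itself is not a language model, so any figure depends on how those per-word probabilities are obtained. Given per-word natural-log probabilities, the definition is just:

import math

def perplexity(log_probs):
    # exp of the negative mean log-probability over the held-out words
    return math.exp(-sum(log_probs) / len(log_probs))

# Toy example with made-up probabilities:
print(perplexity([math.log(0.1), math.log(0.2), math.log(0.05)]))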
2
votes
1 answer
Get next word from bigram model on max probability
I want to generate sonnets using nltk with bigrams. I have generated the bigrams, computed the probability of each, and stored them in a defaultdict like this:
[('"Let', defaultdict(.. at0x1a17f98bf8>,
{'the':…

shahid hamdam
- 751
- 1
- 10
- 24
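For the question above, a toy sketch in the same shape as the asker's defaultdict (outer key: current word, inner dict: next word to probability), picking the continuation with the highest probability:

from collections import defaultdict

bigram_probs = defaultdict(dict)
bigram_probs["the"] = {"sun": 0.4, "moon": 0.35, "sea": 0.25}

def next_word(word, probs):
    # Return the most probable follower of `word`, or None if unseen.
    followers = probs.get(word)
    return max(followers, key=followers.get) if followers else None

print(next_word("the", bigram_probs))   # -> "sun"

Always taking the argmax makes generation deterministic and repetitive; sampling from the distribution instead gives more varied sonnet lines.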
2
votes
1 answer
In the context of recurrent neural networks, what is the meaning of 'conditioned on something'?
In recurrent neural networks (RNNs), for example in the paper Sequence to Sequence Learning with Neural Networks, it says that the RNN language model is conditioned on the input sequence (line 7 of paragraph 3 of the Introduction).
So, what is the…

maheshkumar
- 395
- 2
- 5
- 14
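On the question above: "conditioned on the input sequence" means the decoder's probability of each output word is a function of the encoded input as well as the previously generated words. In the Sequence to Sequence Learning with Neural Networks paper this is the factorisation

$p(y_1, \dots, y_{T'} \mid x_1, \dots, x_T) = \prod_{t=1}^{T'} p(y_t \mid v, y_1, \dots, y_{t-1})$,

where $v$ is the fixed-dimensional vector the encoder RNN produces from the input sequence $x_1, \dots, x_T$.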
2
votes
1 answer
Searching a very big ARPA file in a very short time in Java
I have an ARPA file which is almost 1 GB, and I have to search it in less than 1 minute. I have searched a lot but have not found a suitable answer yet. I think I should not have to read the whole file; I just have to jump to a specific line…

sepanta
- 27
- 2
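The question above asks about Java; for consistency with the rest of this page the sketch below is Python, but the idea carries over: make one pass to record the byte offset of every n-gram line, then seek straight to the line instead of re-reading the 1 GB file on each lookup. A sorted on-disk index would be kinder to memory; this is only the simplest version.

def build_offset_index(path):
    # One pass over the ARPA file: map each n-gram string to its byte offset.
    index = {}
    with open(path, "rb") as f:
        while True:
            offset = f.tell()
            raw = f.readline()
            if not raw:
                break
            line = raw.decode("utf-8", errors="replace").rstrip("\n")
            parts = line.split("\t")
            # ARPA data lines look like: log10prob <TAB> n-gram [<TAB> backoff]
            if len(parts) >= 2 and not line.startswith("\\"):
                index[parts[1]] = offset
    return index

def lookup(path, index, ngram):
    # Jump directly to the stored offset; no scanning.
    with open(path, "rb") as f:
        f.seek(index[ngram])
        return f.readline().decode("utf-8").rstrip("\n")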
2
votes
1 answer
Any way to combine 2 n-gram language models into 1?
I have 2 n-gram language models (model_A and model_B) now.
They are trained on different corpora, so their vocabularies are different.
They are smoothed with backoff and stored in ARPA format, so I have 2 ARPA files, ARPA_A and ARPA_B.
Now if I want to…

kakamilan
- 21
- 2
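For the question above, short of retraining on the merged corpora, the usual trick is to interpolate the two models' probabilities at query time with a weight tuned on held-out text (LM toolkits such as SRILM can also do this directly on ARPA files). A toy sketch over plain dicts standing in for the two backoff models:

def interpolate(p_a, p_b, lambda_a=0.5):
    # Linear interpolation over the union of the two vocabularies.
    vocab = set(p_a) | set(p_b)
    return {w: lambda_a * p_a.get(w, 0.0) + (1 - lambda_a) * p_b.get(w, 0.0)
            for w in vocab}

model_a = {"hello": 0.6, "world": 0.4}
model_b = {"hello": 0.2, "there": 0.8}
print(interpolate(model_a, model_b, lambda_a=0.7))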
2
votes
1 answer
How to predict word using trained CBOW
I have a question about CBOW prediction. Suppose my job is to use 3 surrounding words w(t-3), w(t-2), w(t-1) as input to predict one target word w(t). Once the model is trained, I want to predict a missing word after a sentence. Does this model…

Bratt Swan
- 1,068
- 3
- 16
- 28
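For the question above, gensim's Word2Vec exposes predict_output_word, which scores the output layer for a given bag of context words (it needs a model trained with negative sampling). A toy sketch, gensim 4.x assumed:

from gensim.models import Word2Vec

# Tiny repeated corpus just so the example runs; real training data is larger.
sentences = [["the", "quick", "brown", "fox", "jumps"],
             ["the", "lazy", "dog", "sleeps", "now"]] * 100
model = Word2Vec(sentences, vector_size=50, window=3, sg=0,   # sg=0 -> CBOW
                 negative=5, min_count=1, epochs=20)

# Rank candidate target words for the given context.
print(model.predict_output_word(["the", "quick", "brown"], topn=3))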