Questions tagged [language-model]
266 questions
2
votes
1 answer
How to normalize probabilities of words in varying length sentences?
Let's say we have an RNN model that outputs the probability of a word given context (or no context) trained on a corpus.
We can chain the probability of each word in a sequence to get the overall probability of the sentence itself. But, because we…
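A common length normalization is to average the per-token log-probabilities (the quantity perplexity is built on) rather than comparing raw products; a minimal sketch with made-up token probabilities:

```python
import math

def sentence_scores(token_probs):
    """Return the raw log-probability of a sentence and its length-normalized version."""
    log_prob = sum(math.log(p) for p in token_probs)
    return log_prob, log_prob / len(token_probs)

short = [0.2, 0.1]               # 2-token sentence
long_ = [0.2, 0.1, 0.3, 0.25]    # 4-token sentence

raw_s, norm_s = sentence_scores(short)
raw_l, norm_l = sentence_scores(long_)
# The raw product always penalizes longer sentences; the per-token
# average makes sentences of different lengths comparable.
```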

Sanjay Krishna
- 157
- 1
- 7
2
votes
1 answer
Tensorflow num_classes parameter of nce_loss()
My understanding of noise contrastive estimation is that we sample some vectors from our word embeddings (the negative sample), and then calculate the log-likelihood of each. Then we want to maximize the difference between the probability of the…
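The core of that objective (in its simpler negative-sampling form, not full NCE with the noise-distribution correction) can be sketched in plain Python with made-up scores: maximize log σ(score) for the true word and log σ(−score) for each noise word:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def neg_sampling_loss(true_score, noise_scores):
    """Binary-classification objective: push the true word's score up
    and the sampled noise words' scores down."""
    loss = -math.log(sigmoid(true_score))
    loss -= sum(math.log(sigmoid(-s)) for s in noise_scores)
    return loss

# A confident model (high true score, low noise scores) has low loss:
good = neg_sampling_loss(5.0, [-4.0, -3.0])
bad = neg_sampling_loss(-1.0, [2.0, 3.0])
```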

Aj Langley
- 127
- 9
2
votes
0 answers
Perplexity calculation for Language Model on 1 Billion Word Language Model Benchmark
Recently, I have been trying to implement an RNNLM based on this article.
There is an implementation with some LSTM factorization tricks, but it is similar to the original implementation by the author.
Preamble
1) The dataset is split into files and then…

fminkin
- 162
- 2
- 10
2
votes
1 answer
TensorFlow: loss jumps up after restoring RNN net
Environment info
Operating System: Windows 7 64-bit
Tensorflow installed from pre-built pip (no CUDA): 1.0.1
Python 3.5.2 64-bit
Problem
I have problems with restoring my net (an RNN character-based language model). Below is a simplified version with…

tmv
- 41
- 7
2
votes
0 answers
When loading a KenLM language model for scoring sentences, should the LM file size be less than the RAM size?
When loading a language model for scoring sentences, should the LM file ('bible.klm') be smaller than the available RAM?
import kenlm
model = kenlm.LanguageModel('bible.klm')
model.score('in the beginning was the word')

Arshiyan Alam
- 335
- 1
- 11
2
votes
1 answer
Reason for eval_config setting parameters to 1 in ptb_word_lm.py
While examining the setting for evaluation in Tensorflow's PTB language model, I am perplexed by this setting for the evaluation in eval_config:
eval_config = get_config()
eval_config.batch_size = 1
eval_config.num_steps = 1
in…

Sayan Ghosh
- 31
- 2
2
votes
1 answer
nltk.KneserNeyProbDist is giving 0.25 probability distribution for most of the trigrams
I am working on language modeling using nltk, with this essay as my corpus in the mypet.txt file. I am getting a 0.25 Kneser-Ney probability distribution for most of the trigrams. I don't know why. Is it right? Why is it doing so? This is my…
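For context, nltk's KneserNeyProbDist uses an absolute discount of 0.75 by default, so any trigram seen once in a context seen once gets a discounted maximum-likelihood term of (1 − 0.75)/1 = 0.25; a quick arithmetic check (my numbers, not the asker's corpus, and the backoff/continuation term is omitted):

```python
DISCOUNT = 0.75  # nltk's default absolute discount

def discounted_ml(trigram_count, bigram_context_count, discount=DISCOUNT):
    """Discounted maximum-likelihood term of Kneser-Ney smoothing
    (the continuation/backoff term is left out of this sketch)."""
    return max(trigram_count - discount, 0) / bigram_context_count

# A trigram seen once in a context seen once:
p = discounted_ml(1, 1)  # (1 - 0.75) / 1 = 0.25
```

This is why a small corpus full of one-off trigrams produces many identical 0.25 estimates.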

Jai Prak
- 2,855
- 4
- 29
- 37
2
votes
1 answer
Word prediction: neural net versus n-gram approach
For example, if I attempt to predict the next word in a sentence, I can use a bigram approach and compute the probability of a word occurring based on the previous word in the corpus.
If instead I use a neural net to predict the next word. The…
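The bigram side of that comparison is easy to sketch: maximum-likelihood estimates are just bigram counts divided by the count of the preceding word (toy corpus, my own example):

```python
from collections import Counter

corpus = "the cat sat on the mat the cat ran".split()

unigrams = Counter(corpus[:-1])          # counts of words appearing as a left context
bigrams = Counter(zip(corpus, corpus[1:]))

def p_next(word, prev):
    """P(word | prev) estimated from bigram counts."""
    return bigrams[(prev, word)] / unigrams[prev]

p_cat = p_next("cat", "the")  # "the" is followed by "cat" in 2 of its 3 occurrences
```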

blue-sky
- 51,962
- 152
- 427
- 752
2
votes
1 answer
MemoryError raised when fitting a sequence-to-sequence LSTM using Keras+Theano
I was trying to implement a sequence-to-sequence language model. During training, the model takes in a sequence of 50-d word vectors generated by GloVe, and outputs a 1-of-V vector (V is the size of the vocabulary) meaning the next word, which thus…
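A frequent cause of such a MemoryError is materializing dense one-hot next-word targets of shape (samples, V); back-of-the-envelope arithmetic (illustrative numbers, not the asker's) shows why sparse integer targets are preferable:

```python
def one_hot_bytes(n_samples, vocab_size, bytes_per_float=4):
    """Memory needed for dense one-hot target vectors (float32)."""
    return n_samples * vocab_size * bytes_per_float

def sparse_bytes(n_samples, bytes_per_int=4):
    """Memory needed when each target is a single integer index."""
    return n_samples * bytes_per_int

dense = one_hot_bytes(1_000_000, 50_000)   # 200 GB of targets
sparse = sparse_bytes(1_000_000)           # 4 MB of targets
```

With a vocabulary of that size, a sparse loss (e.g. Keras's sparse_categorical_crossentropy) avoids ever building the dense array.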

高剑飞
- 21
- 1
2
votes
1 answer
What is the softmax_w and softmax_b in this document?
I'm new to TensorFlow and need to train a language model, but I ran into some difficulties while reading the documentation, as shown below.
lstm = rnn_cell.BasicLSTMCell(lstm_size)
# Initial state of the LSTM memory.
state = tf.zeros([batch_size,…
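In that tutorial, softmax_w and softmax_b are the output-projection weights and bias that map the LSTM's hidden state to vocabulary logits; a NumPy sketch of the operation (the shapes here are my own illustration, not the tutorial's values):

```python
import numpy as np

rng = np.random.default_rng(0)
lstm_size, vocab_size = 4, 6

hidden = rng.normal(size=(1, lstm_size))            # LSTM output for one step
softmax_w = rng.normal(size=(lstm_size, vocab_size))
softmax_b = np.zeros(vocab_size)

logits = hidden @ softmax_w + softmax_b             # tf.matmul(output, softmax_w) + softmax_b
probs = np.exp(logits) / np.exp(logits).sum()       # softmax over the vocabulary
```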

Lerner Zhang
- 6,184
- 2
- 49
- 66
2
votes
0 answers
How do I change the Keras text generation example from being on character level to word level?
The above code is more or less what the Keras documentation gives us as a language model. The thing is that this language model predicts characters, not words. Strictly speaking, a language model is supposed to predict full words.
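The main change is the tokenization step: index whitespace-separated words instead of characters, then train over word-index sequences; a minimal sketch of the vocabulary-building part (variable names are mine, not the Keras example's):

```python
text = "the quick brown fox jumps over the lazy dog"

words = text.split()                      # word-level tokens instead of characters
vocab = sorted(set(words))
word_to_idx = {w: i for i, w in enumerate(vocab)}
idx_to_word = {i: w for w, i in word_to_idx.items()}

# Training sequences become lists of word indices:
encoded = [word_to_idx[w] for w in words]
```

The sliding-window batching and the final softmax layer stay the same shape-wise, except the output dimension becomes the word vocabulary size.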
My question is,…

Cedric Oeldorf
- 579
- 1
- 4
- 10
2
votes
1 answer
How to calculate perplexity for a language model trained using keras?
Using Python 2.7 Anaconda on Windows 10
I have trained a GRU neural network to build a language model using keras:
print('Build model...')
model = Sequential()
model.add(GRU(512, return_sequences=True, input_shape=(maxlen,…
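Perplexity is the exponential of the average per-token cross-entropy, which is what Keras's categorical cross-entropy loss already measures; a sketch with made-up predicted probabilities:

```python
import math

def perplexity(true_token_probs):
    """exp of the mean negative log-probability assigned to the true tokens."""
    nll = -sum(math.log(p) for p in true_token_probs) / len(true_token_probs)
    return math.exp(nll)

# Probabilities the model assigned to the actual next words:
ppl = perplexity([0.25, 0.25, 0.25, 0.25])  # uniform over 4 choices -> perplexity 4
```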

ishido
- 4,065
- 9
- 32
- 42
2
votes
1 answer
RNNLM using theano
I asked the same question on the theano user list but got no reply, so I'm wondering if anyone can help me here.
I am trying to re-implement the RNNLM of http://www.fit.vutbr.cz/research/groups/speech/publi/2010/mikolov_interspeech2010_IS100722.pdf based…

user200340
- 3,301
- 13
- 52
- 74
2
votes
1 answer
What is the next procedure after creating a CMUSphinx language model with my own dictionary?
I have created my own CMUSphinx language model for Arabic, for software that will listen to a user and apply commands, using my own dictionary that I made manually by hand. I converted the "arpa" language model type to the "dmp" language…

0x01Brain
- 798
- 2
- 12
- 28
2
votes
3 answers
Language Modelling toolkit
I would like to build a language model for a text corpus. Are there good out-of-the-box toolkits that would make this easier? The only toolkit I know of is the Statistical Language Modelling (SLM) Toolkit by CMU.

Dexter
- 11,311
- 11
- 45
- 61