Questions tagged [language-model]

266 questions

vote

2 answers

Get the probability distribution of next word given a sequence using TensorFlow's RNN (LSTM) language model?

I'm running TensorFlow's RNN (LSTM) language model example here. It runs and reports the perplexities perfectly. What I want though is three things: Given a sequence (e.g. w1 w5 w2000 w750) give me the probability distribution for the next word…

tensorflow lstm language-model

asked Aug 31 '16 at 06:47

Ash


vote

1 answer

How to learn two sequences simultaneously through LSTM in TensorFlow/TFLearn?

I am learning about LSTM-based seq2seq models in TensorFlow. I can train a model on the simple seq2seq examples provided. However, in cases where I have to learn two sequences at once from a given sequence (e.g., learning previous…

python tensorflow deep-learning lstm language-model

asked Aug 01 '16 at 14:47

user3480922

vote

1 answer

TensorFlow reset state during batch = sentence-level language model

What is the best way to build a recurrent language model (e.g. LSTM) that does not cross sentence boundaries? Or, more generally, if you present a batch to the model, each row containing multiple sentences, how can you reset the state after seeing…

tensorflow language-model

asked Jul 29 '16 at 18:55

niefpaarschoenen

vote

2 answers

Dynamic LSTM model in TensorFlow

I am looking to design an LSTM model in TensorFlow where the sentences have different lengths. I came across a tutorial on the PTB dataset (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/models/rnn/ptb/ptb_word_lm.py). How does…

tensorflow deep-learning lstm recurrent-neural-network language-model

asked Jul 18 '16 at 20:21

user3480922

vote

0 answers

What is a simple example of a TensorFlow file pipeline for a language model?

I am building an RNN language model in TensorFlow. My raw input consists of text files. I am able to tokenize them, so the data I am working with are sequences of integers that index into a vocabulary. Following the example in…

tensorflow language-model

asked Jul 15 '16 at 00:06

W.P. McNeill


vote

0 answers

Using language model tool without any installation

I know of several language model tools, such as IRSTLM, MITLM, and SRILM. All of them must be installed before they can create a language model. However, I need a language model tool that requires no installation and can be used…

speech-recognition cmusphinx sphinx4 language-model

asked Jun 28 '16 at 14:41

ziLk


vote

1 answer

Language model with SRILM

I'm trying to build a language model using SRILM. I have a list of phrases and I create the model using: ./ngram-count -text corpus.txt -order 3 -ukndiscount -interpolate -unk -lm corpus.lm After this, I tried a few examples to see the…

nlp n-gram language-model srilm

asked Mar 31 '16 at 16:17

Daniele

vote

1 answer

Wrong number of dimensions: expected 0, got 1 with shape (1,)

I am doing word-level language modelling with a vanilla RNN. I can train the model, but for some reason I cannot get any samples/predictions from it. Here is the relevant part of the code: train_set_x, train_set_y, voc =…

theano recurrent-neural-network language-model

asked Feb 19 '16 at 13:48

uyaseen


vote

2 answers

NLTK language model TypeError: ngrams() got an unexpected keyword argument 'pad_symbol'

I'm executing the following code: from nltk.corpus import brown from nltk.model import Ngram lm = NgramModel(2, brown.words(categories='news'), estimator=None) But I got an error: I really don't know why I do have this problem; is it a bug from…

python nlp nltk n-gram language-model

asked Jan 28 '16 at 01:37

Am1rr3zA


vote

1 answer

Correct parameters for wngram2idngram?

I am trying to generate the ARPA-format language model with the following commands: text2wngram < weather.txt | grep -v " <s>" > weather.wngram wngram2idngram -vocab weather.vocab < weather.wngram > weather.idngram idngram2lm -vocab_type 0…

sphinx4 pocketsphinx language-model

asked Oct 30 '15 at 08:58
g10dras


1
vote

1 answer

CMU Sphinx4 - Custom Language Model

I have a very specific requirement. I am working on an application that will allow users to speak their employee number, which has the format HN56C12345 (any sequence of alphanumeric characters), into the app. I have gone through the link:…

cmusphinx sphinx4 language-model

asked Oct 08 '15 at 21:44
Qedrix


1
vote

1 answer

Why is my Sphinx4 Recognition poor?

I am learning how to use Sphinx4 using the Maven plug-in for Eclipse. I took the transcribe demo found on GitHub and altered it to process a file of my own. The audio file is 16-bit, mono, 16 kHz. It is approximately 13 seconds long. I noticed that…

eclipse speech-recognition cmusphinx sphinx4 language-model

asked Jun 23 '15 at 18:03
tmsBoston


1
vote

1 answer

Is likelihood calculated over the whole training set or a single example?

Suppose I have a training set of (x, y) pairs, where x is the input example and y is the corresponding target, a value in 1 ... k (k is the number of classes). When calculating the likelihood of the training set, should it be calculated for…

machine-learning probability mle language-model

asked Jun 04 '15 at 09:30
Cheshie
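
For readers skimming this entry, a minimal sketch of the usual convention: assuming the examples are independent and identically distributed, the likelihood of the training set is the product of the per-example likelihoods, so the log-likelihood is the sum of the per-example log-probabilities. The probabilities, targets, and the function name log_likelihood below are hypothetical values chosen for illustration; they do not come from the question.

```python
import math

def log_likelihood(probs, targets):
    """Sum of log p(y_i | x_i) over all examples.

    probs[i][c - 1] is a model's probability of class c for example i.
    Classes are numbered 1..k, as in the question excerpt.
    """
    return sum(math.log(p[y - 1]) for p, y in zip(probs, targets))

# Hypothetical model outputs for three examples and k = 3 classes.
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.3, 0.4],
]
targets = [1, 2, 3]

# Log-likelihood of the whole training set.
total = log_likelihood(probs, targets)

# Log-likelihood of each example taken on its own.
per_example = [log_likelihood([p], [y]) for p, y in zip(probs, targets)]
```

Under the independence assumption, total equals the sum of per_example, which is why the two views in the question title describe the same quantity at different granularities.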


1
vote

1 answer

n-gram probability count in ARPA file

I have started working on a problem related to language modelling, but some of the calculations are not clear to me. For example, consider the following simple text: I am Sam Sam I am I do not like green eggs and ham I have used berkeleylm to create the n-gram…

nlp n-gram language-model

asked Mar 19 '15 at 17:56
Muhammad Asaduzzaman
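
As a point of comparison for the excerpt above, here is a rough sketch of unsmoothed maximum-likelihood bigram estimates on the same text. It assumes the text splits into three sentences ("I am Sam", "Sam I am", "I do not like green eggs and ham") and that each is padded with <s> and </s>; both are assumptions, since the excerpt runs the sentences together. Language model toolkits normally apply smoothing and ARPA files store log10 probabilities, so the values in a generated ARPA file will generally differ from these raw ratios.

```python
import math
from collections import Counter

# Assumed sentence segmentation of the example text.
sentences = [
    "I am Sam",
    "Sam I am",
    "I do not like green eggs and ham",
]

context_counts = Counter()
bigram_counts = Counter()
for sentence in sentences:
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    # Every token except the last one serves as a bigram context.
    context_counts.update(tokens[:-1])
    bigram_counts.update(zip(tokens, tokens[1:]))

def mle(prev, word):
    """Unsmoothed estimate: count(prev, word) / count(prev)."""
    return bigram_counts[(prev, word)] / context_counts[prev]

def log10_mle(prev, word):
    """Same estimate on the log10 scale that ARPA files use."""
    return math.log10(mle(prev, word))
```

For instance, mle("<s>", "I") divides the two sentences that start with "I" by the three sentences in total.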


1
vote

0 answers

KenLM perplexity weirdness

I have 96 files, each containing ~10K lines of English text (tokenized, downcased). If I loop through the files (essentially doing k-fold cross-validation with k=#files) and build an LM (using bin/lmplz) for 95 and run bin/query on the held-out file…

nlp language-model

asked Dec 18 '14 at 18:01
dbl
