Questions tagged [language-model]
266 questions
5
votes
3 answers
Is positional encoding necessary for transformers in language modeling?
I am developing a language model like https://pytorch.org/tutorials/beginner/transformer_tutorial.html.
It is not clear to me whether positional encoding is necessary here.
As far as I understand, it is necessary for the language translation task…

Andrey
- 5,932
- 3
- 17
- 35
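
A minimal sketch of the sinusoidal encoding used in that tutorial, assuming PyTorch. The underlying point is that self-attention is permutation-invariant, so without some positional signal the model cannot distinguish word order, in language modeling just as in translation:

import math
import torch

def positional_encoding(max_len, d_model):
    # sinusoidal encoding from "Attention Is All You Need"
    pe = torch.zeros(max_len, d_model)
    position = torch.arange(max_len, dtype=torch.float).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, d_model, 2).float()
                         * (-math.log(10000.0) / d_model))
    pe[:, 0::2] = torch.sin(position * div_term)   # even dims: sine
    pe[:, 1::2] = torch.cos(position * div_term)   # odd dims: cosine
    return pe

x = torch.randn(10, 512)                  # 10 token embeddings of width 512
x = x + positional_encoding(10, 512)      # added before the first encoder layer
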
5
votes
1 answer
Differences between en_vectors_web_lg and Glove vectors (spaCy)
https://spacy.io/models/en#en_vectors_web_lg
states that the model contains 1.1M keys, but
https://nlp.stanford.edu/projects/glove/
states that the GloVe vectors cover a 2.2M-word vocabulary.
May I know which vocabulary entries are missing?
Thank you very much.

hi bye
- 89
- 1
- 5
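
One way to verify the key count directly, assuming the en_vectors_web_lg package is installed (a sketch for inspection, not a full diff of the two vocabularies):

import spacy

nlp = spacy.load("en_vectors_web_lg")
vectors = nlp.vocab.vectors
print(vectors.n_keys, vectors.shape)  # number of keys vs. (rows, dims) of the table

Whether the missing million entries were pruned or merged is not something the counts alone can answer; diffing the two key sets would be needed.
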
5
votes
1 answer
Understanding Character Level Embedding in Keras LSTM
I am new to implementing language models with Keras RNNs. I have a dataset of standalone words (not drawn from a single paragraph) with the following statistics:
Total word samples: 1953
Total number of Distinct Characters: 33…

Parthosarathi Mukherjee
- 385
- 2
- 13
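
A minimal character-level setup consistent with the question's numbers (33 distinct characters; the maximum word length below is an assumption):

from tensorflow import keras

num_chars = 33 + 1   # 33 distinct characters, +1 reserved for the padding index 0
max_word_len = 12    # assumed: pad/truncate every word to this many characters

model = keras.Sequential([
    keras.layers.Embedding(num_chars, 16, mask_zero=True),  # char id -> 16-dim vector
    keras.layers.LSTM(32),                                   # reads the char sequence
    keras.layers.Dense(num_chars, activation="softmax"),     # next-character distribution
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
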
5
votes
4 answers
How to compute perplexity using KenLM?
Let's say we build a model on this:
$ wget https://gist.githubusercontent.com/alvations/1c1b388456dc3760ffb487ce950712ac/raw/86cdf7de279a2b9bceeb3adb481e42691d12fbba/something.txt
$ lmplz -o 5 < something.txt > something.arpa
From the perplexity…

alvas
- 115,346
- 109
- 446
- 738
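
For the follow-up computation, the kenlm Python bindings (assuming they are installed) expose both the log10 score and a per-word perplexity for a model built exactly like the one above:

import kenlm

model = kenlm.Model("something.arpa")
sentence = "language models are fun"
print(model.score(sentence))       # total log10 P(sentence), with <s> and </s> added
print(model.perplexity(sentence))  # 10 ** (-log10 P / (word count + 1 for </s>))
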
5
votes
0 answers
Predicting the probability of a sentence using TensorFlow
I am using this pre-trained TensorFlow model and trying to get the probability of a sentence. My primary task is to find, among several sentences, the one with the largest probability.
I am able to predict next words using this…

Riken Shah
- 3,022
- 5
- 29
- 56
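
Whatever the model, the usual recipe is the chain rule: log P(w1…wn) = Σ log P(wi | w<i). A sketch, where next_token_logprobs(prefix) is a hypothetical helper standing in for one forward pass of the pre-trained model:

def sentence_logprob(model, tokens):
    # chain rule: accumulate log P(w_i | w_<i) one position at a time
    total = 0.0
    for i, token in enumerate(tokens):
        dist = model.next_token_logprobs(tokens[:i])  # hypothetical helper
        total += dist[token]
    return total

# best = max(sentences, key=lambda s: sentence_logprob(model, s.split()))
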
5
votes
1 answer
Train TensorFlow language model with NCE or sampled softmax
I'm adapting the TensorFlow RNN tutorial to train a language model with an NCE loss or sampled softmax, but I still want to report perplexities. However, the perplexities I get are very weird: for NCE I get several million (terrible!) whereas for…

niefpaarschoenen
- 560
- 1
- 8
- 19
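
One common cause, sketched below with stand-in names (softmax_w, softmax_b, hidden are assumptions for the tutorial's variables): the sampled losses are only approximate training objectives, so perplexity has to be computed with the full softmax at evaluation time:

import tensorflow as tf

vocab_size, hidden_size, batch = 10000, 200, 32
softmax_w = tf.Variable(tf.random.normal([vocab_size, hidden_size]))
softmax_b = tf.Variable(tf.zeros([vocab_size]))
hidden = tf.random.normal([batch, hidden_size])   # stand-in for the RNN outputs
labels = tf.random.uniform([batch, 1], maxval=vocab_size, dtype=tf.int64)

# training: approximate loss over a sampled subset of the vocabulary
train_loss = tf.nn.sampled_softmax_loss(
    weights=softmax_w, biases=softmax_b, labels=labels,
    inputs=hidden, num_sampled=1024, num_classes=vocab_size)

# evaluation: full softmax, otherwise the reported perplexity is not comparable
logits = tf.matmul(hidden, softmax_w, transpose_b=True) + softmax_b
eval_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=tf.reshape(labels, [-1]), logits=logits)
perplexity = tf.exp(tf.reduce_mean(eval_loss))
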
5
votes
1 answer
How to tune a Machine Translation model with a huge language model?
Moses is software for building machine translation models, and KenLM is the de facto language model toolkit that Moses uses.
I have a text file with 16GB of text and I use it to build a language model as follows:
bin/lmplz -o 5 < text.txt > text.arpa
The…

alvas
- 115,346
- 109
- 446
- 738
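
For a 16 GB corpus the resulting ARPA file will be huge; the usual step before plugging it into Moses is binarizing it with KenLM's build_binary (bin/build_binary text.arpa text.binary), after which the model can be memory-mapped instead of loaded whole. A sketch of querying such a model from Python, assuming the kenlm bindings:

import kenlm

# text.binary produced beforehand by: bin/build_binary text.arpa text.binary
model = kenlm.Model("text.binary")
print(model.order)                        # n-gram order, 5 here
print(model.score("this is a sentence")) # log10 probability under the model
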
4
votes
1 answer
Difference between Instruction Tuning and Non-Instruction Tuning for Large Language Models
What is the difference between instruction tuning and normal fine-tuning for large language models?
Also, the instruction tuning I'm referring to isn't the in-context/prompt kind.
All the recent papers about fine-tuning seem to be about instruction…

Flo
- 51
- 1
- 4
4
votes
0 answers
Keras LSTM predicting the next item, taking whole sequences or a sliding window. Will a sliding window need a stateful LSTM?
I have a sequence prediction problem in which, given the last n items in a sequence, I need to predict the next item.
I have more than 2 million sequences, each with a different number of timesteps (sequence length); some are just 5 and some are…

A.B
- 20,110
- 3
- 37
- 71
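
A sketch of the sliding-window option in plain NumPy (names are illustrative). Note that a stateless LSTM suffices here, since each window already carries all the context the model is meant to see; stateful LSTMs are only needed when context must persist across batches:

import numpy as np

def sliding_windows(seq, n):
    # turn one sequence into (last-n-items, next-item) training pairs
    X, y = [], []
    for i in range(len(seq) - n):
        X.append(seq[i:i + n])
        y.append(seq[i + n])
    return np.array(X), np.array(y)

X, y = sliding_windows([3, 7, 1, 9, 4, 2], n=3)
# X = [[3 7 1], [7 1 9], [1 9 4]],  y = [9 4 2]
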
4
votes
2 answers
When using padding in sequence models, is Keras validation accuracy valid/reliable?
I have a group of non-zero sequences of different lengths, and I am using a Keras LSTM to model them. I use the Keras Tokenizer to tokenize (token indices start from 1). To give the sequences equal lengths, I use padding.
An example of…

Amir Jalilifard
- 2,027
- 5
- 26
- 38
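
Since the tokens start at 1, index 0 is free to act as the padding value, and masking keeps padded steps from influencing the recurrent states. A sketch with assumed sizes:

from tensorflow import keras

vocab_size, max_len = 1000, 20   # assumed sizes

model = keras.Sequential([
    # mask_zero=True tells downstream layers to skip timesteps equal to 0
    keras.layers.Embedding(vocab_size, 32, mask_zero=True),
    keras.layers.LSTM(64),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

For a many-to-one model like this, accuracy is computed once per sequence, so padding matters mainly through the LSTM states; for per-timestep outputs, the propagated mask is also what should keep padded positions out of the metric.
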
4
votes
0 answers
squad2.0 training error: THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=50 error=100 : no CUDA-capable device is detected
!python -m torch.distributed.launch --nproc_per_node=8 /root/examples/run_squad.py \
--model_type bert \
--model_name_or_path bert-large-uncased-whole-word-masking \
--do_train \
--do_eval \
--do_lower_case \
--train_file…

TIGUZI
- 231
- 1
- 3
- 12
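
Error 100 means the process sees no CUDA device at all, which is worth verifying before touching the training script; a quick check with PyTorch:

import torch

print(torch.cuda.is_available())   # False here reproduces error 100
print(torch.cuda.device_count())   # needs to be >= 8 for --nproc_per_node=8

Note that --nproc_per_node=8 asks torch.distributed.launch for eight local processes, one per GPU; on a single-GPU machine (e.g. a Colab runtime) it should be 1.
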
4
votes
1 answer
Difference between spaCy models sm, md, lg
I can see that in the English spaCy models the medium model performs better than the small one, and the large model outperforms the medium one - but only marginally. However, in the description of the models, it is written that they have all been…

Bram Vanroy
- 27,032
- 24
- 137
- 239
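
The static word-vector tables are one concrete, inspectable difference between the packages; a sketch, assuming all three models are downloaded:

import spacy

for name in ("en_core_web_sm", "en_core_web_md", "en_core_web_lg"):
    nlp = spacy.load(name)
    print(name, nlp.vocab.vectors.shape)  # (rows, dims); sm ships essentially no static vectors
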
4
votes
0 answers
Alternative to one-hot encoding for output to a model when vocabulary size is very large
I was following this blog post, in which the author shows how to build a simple language model in Keras.
After separating, we need to one-hot encode the output word. This means converting it from an integer to a vector…

humble
- 2,016
- 4
- 27
- 36
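
One standard way around the giant one-hot target matrix is sparse_categorical_crossentropy, which consumes the integer word ids directly; a sketch with an assumed vocabulary size:

from tensorflow import keras

vocab_size = 50000   # assumed large vocabulary

model = keras.Sequential([
    keras.layers.Embedding(vocab_size, 128),
    keras.layers.LSTM(256),
    keras.layers.Dense(vocab_size, activation="softmax"),
])
# integer targets go in as-is: no (n_samples, vocab_size) one-hot matrix is built
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
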
4
votes
1 answer
How to relate the language model score of a whole sentence to those of the sentence's constituents
I trained a KenLM language model on around 5,000 English sentences/paragraphs. I want to query this ARPA model with two or more segments and see whether they can be concatenated to form a longer, hopefully more "grammatical," sentence. Here is…

Wei JIANG
- 71
- 4
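
One rough heuristic with the kenlm Python bindings (the model path and segments below are illustrative): compare the score of the concatenation against the segments scored with the sentence boundary suppressed at the join:

import kenlm

model = kenlm.Model("model.arpa")   # hypothetical path to the trained ARPA model
a, b = "the weather is nice", "so we went for a walk"

joint = model.score(a + " " + b)                               # log10 P of the concatenation
parts = model.score(a, eos=False) + model.score(b, bos=False)  # no </s><s> at the join
print(joint - parts)   # larger values suggest the join reads more naturally
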
4
votes
1 answer
Extract word/sentence probabilities from lm_1b trained model
I have successfully downloaded the 1B word language model trained using a CNN-LSTM (https://github.com/tensorflow/models/tree/master/research/lm_1b), and I would like to be able to input sentences or partial sentences to get the probability of each…

Matt
- 53
- 4
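
Once the graph is run step by step, per-word probabilities are just lookups into each step's softmax row; a small NumPy sketch (that the rows come from fetching lm_1b's softmax output is an assumption about how the graph is driven):

import numpy as np

def token_logprobs(softmax_rows, token_ids):
    # softmax_rows: one [vocab_size] distribution per prefix;
    # token_ids: the id of the word that actually followed each prefix
    rows = np.asarray(softmax_rows)
    ids = np.asarray(token_ids)
    return np.log(rows[np.arange(len(ids)), ids])

# sentence log-probability is then token_logprobs(rows, ids).sum()
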