Questions tagged [perplexity]

Perplexity is a measurement of how well a probability distribution or probability model predicts a sample.

From Wikipedia

In information theory, perplexity is a measurement of how well a probability distribution or probability model predicts a sample. It may be used to compare probability models. A low perplexity indicates the probability distribution is good at predicting the sample.
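
Stated as a formula (a standard restatement of the definition above, for reference in the questions below): for a model q evaluated on a held-out sample of N tokens,

    \mathrm{PP}(w_1, \dots, w_N)
      = q(w_1, \dots, w_N)^{-1/N}
      = \exp\Big(-\frac{1}{N} \sum_{i=1}^{N} \ln q(w_i \mid w_1, \dots, w_{i-1})\Big)

i.e. the exponentiated average negative log-likelihood per token. A uniform distribution over k outcomes has perplexity exactly k.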

40 questions
1
vote
1 answer

Elbow/knee in a curve in R

I've got this data processing (a knee-detection sketch follows this entry):

    library(text2vec)
    ## Using perplexity for hold-out set
    t1 <- Sys.time()
    perplex <- c()
    for (i in 3:25){
      set.seed(17)
      lda_model2 <- LDA$new(n_topics = i)
      doc_topic_distr2 <- lda_model2$fit_transform(x = dtm, …
MelaniaCB
  • 427
  • 5
  • 16
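
The loop above is R, but the knee-finding step is language-agnostic. A minimal sketch of one common heuristic — pick the point farthest from the chord joining the curve's endpoints — in Python with NumPy; topics and perplexities are hypothetical arrays standing in for the question's loop output:

    import numpy as np

    def find_knee(x, y):
        """Return the x whose point lies farthest from the straight line
        joining the curve's endpoints -- a simple knee/elbow heuristic."""
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        # Unit vector along the chord from the first to the last point.
        chord = np.array([x[-1] - x[0], y[-1] - y[0]])
        chord = chord / np.linalg.norm(chord)
        # Perpendicular distance of every point from that chord.
        rel = np.stack([x - x[0], y - y[0]], axis=1)
        proj = rel @ chord
        dist = np.linalg.norm(rel - np.outer(proj, chord), axis=1)
        return x[int(np.argmax(dist))]

    # e.g. topics = list(range(3, 26)); perplexities from the loop above
    # best_k = find_knee(topics, perplexities)

This is the idea behind the Kneedle algorithm; packages such as kneed implement a more robust variant.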
1
vote
2 answers

Gensim Topic Modeling with Mallet Perplexity

I am topic modeling Harvard Library book titles and subjects. I use the Gensim Mallet wrapper to model with Mallet's LDA. When I try to get coherence and perplexity values to see how good the model is, the perplexity calculation fails with the below… (a conversion workaround is sketched after this entry)
Tolga
  • 116
  • 2
  • 12
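
A workaround often suggested for this — a sketch only, assuming gensim 3.x (where the wrapper lives), with mallet_model and corpus standing for the question's own objects — is to convert the Mallet model to a native gensim LdaModel, which does expose log_perplexity:

    from gensim.models.wrappers.ldamallet import malletmodel2ldamodel

    lda = malletmodel2ldamodel(mallet_model)  # mallet_model: the trained LdaMallet
    bound = lda.log_perplexity(corpus)        # per-word likelihood bound (negative)
    perplexity = 2 ** (-bound)                # convert the bound to a perplexity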
1
vote
2 answers

Python: handling large numbers

I need to compute perplexity, and I try to do it with the following (a log-space rewrite follows this entry):

    def get_perplexity(test_set, model):
        perplexity = 1
        n = 0
        for word in test_set:
            n += 1
            perplexity = perplexity * 1 / get_prob(model, word)
        perplexity =…
Petr Petrov
  • 4,090
  • 10
  • 31
  • 68
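
The large-number problem above disappears if the running product is kept in log space, since a sum of log-probabilities never overflows. A minimal rewrite, reusing the question's own (here hypothetical) get_prob helper:

    import math

    def get_perplexity(test_set, model):
        log_prob_sum = 0.0
        n = 0
        for word in test_set:
            n += 1
            # Accumulate log-probabilities instead of multiplying raw ones.
            log_prob_sum += math.log(get_prob(model, word))
        # exp(-average log-likelihood) == (prod 1/p)^(1/n), without overflow.
        return math.exp(-log_prob_sum / n)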
1
vote
1 answer

Check perplexity of a Language Model

I created a language model with a Keras LSTM, and now I want to assess whether it's good, so I want to calculate perplexity. What is the best way to calculate the perplexity of a model in Python? (A sketch follows this entry.)
Cranjis
  • 1,590
  • 8
  • 31
  • 64
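
For a model trained with cross-entropy loss, perplexity is just the exponential of the average per-token loss. A minimal sketch, assuming model, x_test and y_test are the question's Keras objects and the model was compiled with only a cross-entropy loss (so evaluate returns a single scalar, in nats):

    import numpy as np

    loss = model.evaluate(x_test, y_test, verbose=0)  # mean cross-entropy in nats
    perplexity = np.exp(loss)                         # lower is better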
1
vote
0 answers

Sk-learn LDA for topic extraction, perplexity and score

Hello all! As part of a project, I need to build a text classifier with the labeled data I have. A data point is composed of a single sentence and one of 3 categories for each sentence. I have extracted 5 topics from this database with LDA. What… (an sklearn sketch follows this entry)
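
scikit-learn's LatentDirichletAllocation exposes both quantities directly: perplexity(X), where lower is better, and score(X), an approximate log-likelihood where higher is better. A self-contained sketch with placeholder documents:

    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    docs = ["placeholder sentence one", "placeholder sentence two",
            "placeholder sentence three"]          # stand-ins for the real data
    X = CountVectorizer().fit_transform(docs)       # document-term counts

    lda = LatentDirichletAllocation(n_components=5, random_state=0).fit(X)
    print(lda.perplexity(X))  # lower is better
    print(lda.score(X))       # approximate log-likelihood, higher is better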
1
vote
1 answer

Perplexity rises between each significant drop

I am training a conversational agent using an LSTM and TensorFlow's translation model. I use batchwise training, which results in a significant drop in training-data perplexity after each epoch start. This drop can be explained by the way I read data…
simejo
  • 103
  • 1
  • 8
0
votes
1 answer

Challenges when calculating perplexity: using bidirectional models, and dealing with large text size and values, are my approaches reasonable?

Challenges when calculating perplexity: is my approach reasonable? I am trying to find a pre-trained language model that will work best for my text. The text is pretty specific in its language and content, but there's no test data available or budget…
Agnes
  • 19
  • 3
0
votes
0 answers

Laplace Smoothing - Greater perplexity of language model when increasing the N of the N-Gram Model

I'm training a language model using Python's NLTK library. To obtain a better result, I use the Laplace smoothing technique. But when I increase the N of the N-gram model, my perplexity increases too, and I was expecting that the perplexity would… (see the note and sketch after this entry)
Leticia
  • 1
  • 1
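
This is usually expected rather than a bug: with add-1 smoothing, higher-order contexts are sparser, so each observed count is diluted by the +1 mass spread over the whole vocabulary, and the smoothed conditional probabilities shrink as N grows. For reference, a minimal NLTK Laplace setup with placeholder sentences (the question's own data and order are unknown):

    from nltk.lm import Laplace
    from nltk.lm.preprocessing import padded_everygram_pipeline, pad_both_ends
    from nltk.util import ngrams

    n = 3
    train_sents = [["a", "b", "c"], ["a", "c", "b"]]   # placeholder tokens
    train, vocab = padded_everygram_pipeline(n, train_sents)

    lm = Laplace(n)                                    # add-1 smoothed model
    lm.fit(train, vocab)

    test = list(ngrams(pad_both_ends(["a", "b"], n=n), n))
    print(lm.perplexity(test))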
0
votes
0 answers

Negative Perplexity while using gensim LDA

I am using gensim's LDA and trying to see the perplexity for a certain number of topics (a conversion sketch follows this entry):

    Perplexity for 1 : -7.903370624873305
    Coherence Score for 1 : 0.8044880331838007
    Perplexity for 2 : -8.269065851934347
    Coherence Score for 2 : …
Stayne
  • 21
  • 5
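
The negative values are not perplexities: gensim's log_perplexity returns a per-word likelihood bound (a log quantity, hence negative), and gensim itself derives the perplexity estimate as 2 raised to the negated bound. A one-line conversion, assuming lda_model and corpus are the question's objects:

    import numpy as np

    bound = lda_model.log_perplexity(corpus)  # per-word bound, negative
    perplexity = np.exp2(-bound)              # e.g. -7.90 -> roughly 239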
0
votes
0 answers

Relation between perplexity and number of training samples

I'm trying to calculate the perplexity of some English-language texts using NLTK, to figure out how a simple n-gram model performs with fewer training samples. The thing I don't understand is why perplexity gets lower if I decrease…
0
votes
0 answers

How to choose the best LDA model when coherence and perplexity show opposed trends?

I have a corpus of around 1,500,000 documents of titles and abstracts from scientific research projects within STEM. I used Mallet (https://mimno.github.io/Mallet/transforms) to fit models from 10 to 790 topics in increments of 10 (I allow for…
fcbt
  • 1
0
votes
1 answer

What is the held-out probability in Mallet LDA? How can we calculate perplexity from the held-out probability?

I am new to Mallet. Now I would like to get the perplexity scores for 10-100 topics in my LDA model, so I ran the held-out probability; it gives me the value of -8926490.73103205 for topic=100, which seems a little off. Is that the perplexity… (a normalization sketch follows this entry)
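
Mallet's held-out evaluator reports a total log-likelihood over the whole held-out set, which is why the magnitude looks so large. Under the usual convention (natural log), perplexity comes from normalizing per token first; num_tokens below is a hypothetical held-out token count:

    import math

    heldout_ll = -8926490.73103205   # value reported in the question
    num_tokens = 1_000_000           # hypothetical held-out token count
    perplexity = math.exp(-heldout_ll / num_tokens)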
0
votes
0 answers

Comparing Perplexities of different N-gram Models

In my problem, I'm trying to compare the perplexity values of different N-gram models, say up to N=4. However, I'm confused by the differing results obtained using other methods. Here is my first implementation:

    import nltk …
0
votes
1 answer

How to calculate perplexity of BERTopic?

Is there a way to calculate the perplexity of BERTopic? I am unable to find any such thing in the BERTopic library or elsewhere.
Inaam Ilahi
  • 105
  • 2
  • 9
0
votes
2 answers

How to find perplexity of bigram if probability of given bigram is 0

Given the formula for the perplexity of a bigram model (and the add-1 smoothed probability), how does one proceed when the probability of some word in the sentence to predict is 0? (Both formulas are restated after this entry.)

    # just examples, don't mind the…
axelmukwena
  • 779
  • 7
  • 24
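
With add-1 smoothing the zero-probability case cannot occur, because every bigram count is inflated by one. The formulas the question refers to, restated (V is the vocabulary size, N the length of the test sentence):

    P(w_i \mid w_{i-1}) = \frac{C(w_{i-1} w_i) + 1}{C(w_{i-1}) + V}

    PP(W) = \Big( \prod_{i=1}^{N} \frac{1}{P(w_i \mid w_{i-1})} \Big)^{1/N}

Since the numerator is at least 1, every smoothed probability is strictly positive and the perplexity stays finite.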