Perplexity
Welcome to the first issue of One Minute NLP! The focus of this issue is perplexity, a metric commonly used to evaluate language models.
Perplexity
Perplexity (usually abbreviated PP or PPL) measures how uncertain (or “perplexed”) a model is about the predictions it makes. The lower the perplexity, the better the model predicts the test set. Lower perplexity usually correlates well with improvements on real-world tasks, but it is not a guarantee of better task performance. Also note that the perplexities of two models are only comparable if the models use the same vocabulary.
Given an example text sequence X: The quick fox jumps over the lazy dog, we can calculate PP using the probability (log-likelihood) of predicting the next word given the words that came before it:

$$\log p(x_i \mid x_1, \ldots, x_{i-1})$$

where $x_i$ is the i-th word in the sequence, $x_1, \ldots, x_{i-1}$ are the words that come before it, and N is the number of words in the sequence (N=8 for our test sequence).

Formally:

$$\mathrm{PP}(X) = \exp\left(-\frac{1}{N}\sum_{i=1}^{N}\log p(x_i \mid x_1, \ldots, x_{i-1})\right)$$
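As a minimal sketch of this calculation in code (the model name "gpt2" is just an illustrative choice; see the Hugging Face docs linked below for a more careful treatment of fixed-length models), here is one way to compute the perplexity of our test sequence with the Transformers library:

```python
# Minimal sketch: perplexity of a single sequence under a causal language model.
# Assumes `torch` and `transformers` are installed; "gpt2" is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

text = "The quick fox jumps over the lazy dog"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels makes the model return the average negative
    # log-likelihood (cross-entropy) per predicted token.
    outputs = model(**inputs, labels=inputs["input_ids"])

# Perplexity is the exponential of the average negative log-likelihood.
perplexity = torch.exp(outputs.loss)
print(f"Perplexity: {perplexity.item():.2f}")
```

Note that the tokenizer splits the text into subword tokens, so N here is the number of tokens rather than words, and, as mentioned above, the resulting perplexity is only comparable across models that share the same tokenizer and vocabulary.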
Further Reading
Speech and Language Processing by Jurafsky and Martin (free to read online) — Section 3.3 (Evaluating Language Models: Perplexity) provides a great explanation of the metric.
Lessons from the Trenches on Reproducible Evaluation of Language Models by Biderman et al. — this paper includes some practical considerations for using perplexity to compare language modeling performance of different models (Appendix A.3).
Perplexity of fixed-length models (Hugging Face docs) — how to calculate perplexity with the Hugging Face Transformers library.
Do you want to learn more NLP concepts?
Each week I pick one core NLP concept and create a one-slide, one-minute explanation of it. To receive new posts in your inbox every week, subscribe here:
Reach out to me:
Connect with me on LinkedIn
Read my technical blog on Medium
Or send me a message by responding to this post
Is there a concept you would like me to cover in a future issue? Let me know!