Example with numbers
This example with numbers is meant as an intuitive way to understand how Bayes' formula works:
Let's assume we have 10,000 typical reviews, of which 2% (200) are fake and 98% (9800) are real. We calculate what we would expect to happen with these 10,000 reviews:
To predict how many reviews are classified as fake:
- Of the 9800 real ones, 10% are classified as fake →
9800 * 0.10 = 980
- Of the 200 fake ones, 95% are classified as fake →
200 * 0.95 = 190
980 + 190 = 1,170
are classified as fake.
Now we have all the pieces we need to calculate the probability that a review is fake, given that it is classified as such:
- All reviews that are classified as fake →
1,170
- Of those, are actually fake →
190
190 / 1170 ≈ 0.1624
or about 16.24%
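The counting argument above can be reproduced in a few lines of Python. This is only a sketch: the variable names are made up for illustration, and the rates (2% fake, 10% of real reviews flagged, 95% of fake reviews flagged) are the ones used in this example.

```python
# Assumed inputs, taken from the numbers used in the example above.
total_reviews = 10_000
fake_rate = 0.02             # share of all reviews that are fake
false_positive_rate = 0.10   # real reviews that get classified as fake
true_positive_rate = 0.95    # fake reviews that get classified as fake

# Split the reviews into fake and real ones.
fake_reviews = total_reviews * fake_rate
real_reviews = total_reviews - fake_reviews

# Expected number of reviews classified as fake, from each group.
real_flagged = real_reviews * false_positive_rate
fake_flagged = fake_reviews * true_positive_rate
all_flagged = real_flagged + fake_flagged

# Share of the flagged reviews that are actually fake.
share_fake_given_flagged = fake_flagged / all_flagged

print(round(real_flagged), round(fake_flagged), round(all_flagged))
print(f"{share_fake_given_flagged:.4f}")
```

Changing `fake_rate` shows how strongly the result depends on how rare fake reviews are to begin with.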
Using general Bayes' theorem
Let's set up the events. Note that my version of event B
is slightly different from yours.
P(A)
: Real review
P(A')
: Fake review
P(B)
: Predicted real
P(B')
: Predicted fake
P(A'|B')
: Probability that a review is actually fake, given that it is predicted to be fake
Now that we have our events defined, we can go ahead with Bayes:
P(A'|B') = P(A' and B') / P(B') # Bayes' formula
= P(A' and B') / (P(A and B') + P(A' and B')) # Law of total probability
We also know the following, by the multiplication rule (the definition of conditional probability, rearranged):
P(A and B') = P(A) * P(B'|A )
= 0.98 * 0.10
= 0.098
P(A' and B') = P(A') * P(B'|A')
= 0.02 * 0.95
= 0.019
Putting the pieces together yields:
P(A'|B') = 0.019 / (0.098 + 0.019) ≈ 0.1624
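If it helps, the same calculation can be wrapped in a small Python function. This is a minimal sketch, not a library API: the function name and parameter names are hypothetical, and the inputs are the probabilities defined above, P(A'), P(B'|A') and P(B'|A).

```python
def posterior_fake_given_flagged(p_fake, p_flag_given_fake, p_flag_given_real):
    """Return P(A'|B'): probability a review is fake, given it is predicted fake."""
    p_real = 1.0 - p_fake                      # P(A)
    joint_real = p_real * p_flag_given_real    # P(A and B')
    joint_fake = p_fake * p_flag_given_fake    # P(A' and B')
    # Law of total probability in the denominator: P(B') = P(A and B') + P(A' and B')
    return joint_fake / (joint_real + joint_fake)


# Values from this example: P(A') = 0.02, P(B'|A') = 0.95, P(B'|A) = 0.10
result = posterior_fake_given_flagged(0.02, 0.95, 0.10)
print(f"{result:.4f}")
```

The function gives the same answer as the counting example, since multiplying every term by 10,000 cancels out in the ratio.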