Bayes Theorem

In many applications, it can be difficult to directly calculate a conditional probability. The problem is that it is not always easy to determine a complete stochastic truth table. For example, suppose that you receive an email with the phrase "free money" in the subject, and you are wondering whether the email is spam. Let $H$ mean "the email is spam" and $E$ mean "the phrase 'free money' is in the subject of the email". You want to determine whether the evidence $E$ evidentially supports the hypothesis $H$. To do this, you need to determine $Pr(H\mid E)$. It is not obvious what probabilities you should assign to each row of a stochastic truth table for the atomic propositions $H$ and $E$. There is an indirect way to determine this conditional probability.

Since $Pr(H\mid E)=\frac{Pr(H\wedge E)}{Pr(E)}$, we have that

$$Pr(H\mid E)Pr(E) = Pr(H\wedge E).$$

Since $Pr(E\mid H)=\frac{Pr(E\wedge H)}{Pr(H)}$, we have that

$$Pr(E\mid H)Pr(H) = Pr(E\wedge H).$$

Now, since $H\wedge E$ and $E\wedge H$ are tautologically equivalent, we have that

$$Pr(H\mid E)Pr(E) = Pr(H\wedge E) = Pr(E\wedge H) = Pr(E\mid H)Pr(H)$$

Dividing both sides by $Pr(E)$, we have that:

$$Pr(H\mid E) = Pr(E\mid H)\frac{Pr(H)}{Pr(E)}$$

So, we can determine $Pr(H\mid E)$ if we know three other probabilities: (1) $Pr(E\mid H)$, (2) $Pr(H)$, and (3) $Pr(E)$. These are all probabilities that we can estimate with some investigation.

  1. $Pr(E\mid H)$: This is the probability that the phrase "free money" occurs in the subject line of a spam email. That is, $Pr(E\mid H)$ is the probability that the subject line contains the phrase "free money" assuming that the email is spam. The phrase "free money" is in the list of spam trigger words, but there are other phrases that might occur in spam email. A good estimate for this conditional probability is $Pr(E\mid H)=0.1$.
  2. $Pr(H)$: This is the prior probability of receiving spam email. Roughly 55% of email received each day is classified as spam. That is, $Pr(H)=0.55$.
  3. $Pr(E)$: This is the prior probability of receiving an email with the phrase "free money" in the subject line. While it is not obvious how to estimate this probability directly, we can use the law of total probability: $Pr(E)=Pr(H)Pr(E\mid H) + Pr(\neg H)Pr(E\mid \neg H)$. We have already determined that $Pr(E\mid H)=0.1$ and $Pr(H)=0.55$. Thus, using the complement law, we know that $Pr(\neg H)=0.45$. The only thing that remains is to estimate $Pr(E\mid \neg H)$. That is, assuming an email is not spam, what is the probability that the phrase "free money" occurs in the subject line? It is very unlikely that I would receive a non-spam email with the phrase "free money" in the subject line (your estimate of this conditional probability may be different). My estimate of this conditional probability is $Pr(E\mid \neg H)=0.001$. Then, using the law of total probability, we have
    $$Pr(E)=Pr(H)Pr(E\mid H) + Pr(\neg H)Pr(E\mid \neg H) = 0.55 * 0.1 + 0.45 * 0.001 = 0.05545$$

Putting everything together, we have that:

$$Pr(H\mid E)=Pr(E\mid H)\frac{Pr(H)}{Pr(E)}=0.1\cdot\frac{0.55}{0.05545}\approx 0.992$$
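The arithmetic in the spam example can be checked with a few lines of Python. This is only a sketch: the variable names are invented for illustration, and the three input numbers are the estimates used in the text, not measured values.

```python
# Estimates taken from the spam example in the text.
p_h = 0.55               # Pr(H): prior probability that an email is spam
p_e_given_h = 0.1        # Pr(E | H): "free money" in the subject, given spam
p_e_given_not_h = 0.001  # Pr(E | not H): "free money" in the subject, given not spam

# Complement law: Pr(not H) = 1 - Pr(H).
p_not_h = 1 - p_h

# Law of total probability: Pr(E) = Pr(H)Pr(E | H) + Pr(not H)Pr(E | not H).
p_e = p_h * p_e_given_h + p_not_h * p_e_given_not_h

# Bayes Theorem: Pr(H | E) = Pr(E | H) * Pr(H) / Pr(E).
p_h_given_e = p_e_given_h * p_h / p_e

print(f"Pr(E)   = {p_e:.5f}")
print(f"Pr(H|E) = {p_h_given_e:.3f}")
```

Changing `p_e_given_not_h` to your own estimate shows how sensitive the final answer is to that one number.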

The above equation is an instance of Bayes Theorem:

Bayes Theorem

For all formulas $X$ and $Y$,

$$Pr(X\mid Y) = Pr(Y\mid X)\frac{Pr(X)}{Pr(Y)}$$

As noted above, we often use the law of total probability when applying Bayes Theorem:

Bayes Theorem, version 2

For all formulas $X$ and $Y$,

$$Pr(X\mid Y) = Pr(Y\mid X)\frac{Pr(X)}{Pr(X)Pr(Y\mid X) + Pr(\neg X)Pr(Y\mid \neg X)}$$
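Version 2 of the theorem needs only three inputs, so it is easy to wrap in a small helper. The sketch below is one possible way to do this; the function name `bayes_posterior` and its parameter names are made up for illustration. It uses Python's `fractions.Fraction` so that the result is exact rather than rounded.

```python
from fractions import Fraction


def bayes_posterior(prior_x, y_given_x, y_given_not_x):
    """Return Pr(X | Y) from Pr(X), Pr(Y | X) and Pr(Y | not X)."""
    for p in (prior_x, y_given_x, y_given_not_x):
        if not 0 <= p <= 1:
            raise ValueError("probabilities must lie between 0 and 1")
    numerator = y_given_x * prior_x
    # Law of total probability gives Pr(Y), the denominator in version 2.
    pr_y = numerator + (1 - prior_x) * y_given_not_x
    if pr_y == 0:
        raise ValueError("Pr(Y) is 0, so Pr(X | Y) is undefined")
    return numerator / pr_y


# The spam example: Pr(H) = 0.55, Pr(E | H) = 0.1, Pr(E | not H) = 0.001.
posterior = bayes_posterior(Fraction(55, 100), Fraction(1, 10), Fraction(1, 1000))
print(posterior, float(posterior))
```

With the spam estimates the exact posterior is 1100/1109, which agrees with the rounded value 0.992. The function also accepts ordinary floats if exact fractions are not needed.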

Applying Bayes Theorem can be tricky. Use Bayes Theorem to solve the following puzzles:

Three Prisoners Problem: Three prisoners $A$, $B$ and $C$ have been tried for murder and their verdicts will be told to them tomorrow morning. They know only that one of them will be declared guilty and will be executed while the others will be set free. The identity of the condemned prisoner is revealed to the very reliable prison guard, but not to the prisoners themselves.

Prisoner $A$ asks the guard, "Please give this letter to one of my friends, the one who is to be released. We both know that at least one of them will be released."

An hour later, $A$ asks the guard, "Can you tell me which of my friends you gave the letter to? It should give me no clue regarding my own status because, regardless of my fate, each of my friends had an equal chance of receiving my letter."

The guard told him that $B$ received his letter.

Prisoner $A$ then concluded that the probability that he will be released is 1/2 (since the only ones without a verdict are $A$ and $C$).

But $A$ thinks to himself: "Before I talked to the guard, my chance of being executed was 1 in 3. Now that he has told me $B$ will be released, only $C$ and I remain, so my chances of being executed have gone from 33.33% to 50%. What happened? I made certain not to ask for any information relevant to my own fate..." Explain what is wrong with $A$'s reasoning.

Monty Hall Dilemma: Suppose you are on a game show, and you are given the choice of three doors. Behind one door is a car; behind the others, goats. You pick a door, say number 1, and the host, who knows what's behind the doors, opens another door, say number 3, which has a goat. He says to you, "Do you want to pick door number 2?" Is it to your advantage to switch your choice of doors?

Try answering the above questions before watching the following video.
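Once you have worked out your own answer to the Monty Hall Dilemma, one way to check it is by simulation. The sketch below is not part of the original puzzle statement: the function names are invented, and it assumes that the host always opens a goat door other than your pick, choosing at random when two such doors are available.

```python
import random


def monty_hall_trial(switch, rng):
    """Play one game; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)   # door hiding the car
    pick = rng.choice(doors)  # contestant's first choice
    # Assumption: the host opens a door that is neither the pick nor the car,
    # choosing at random if there are two such doors.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Switch to the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def estimate_win_rate(switch, trials=100_000, seed=0):
    """Estimate the probability of winning the car over many simulated games."""
    rng = random.Random(seed)
    wins = sum(monty_hall_trial(switch, rng) for _ in range(trials))
    return wins / trials


print("stay:  ", estimate_win_rate(switch=False))
print("switch:", estimate_win_rate(switch=True))
```

Compare the two estimated win rates with the probabilities you obtained from Bayes Theorem. If they disagree, check which conditional probabilities you assigned to the host's choice of door.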


Practice Questions

  1. Suppose that $Pr(B)=0.25$, $Pr(A\mid B)=0.75$, $Pr(A\mid\neg B)=0.3$. Find $Pr(B\mid A)$. Explain how you arrived at your answer.
  2. Suppose that $Pr(P)=0.85$, $Pr(Q\mid P)=0.25$, $Pr(Q\mid\neg P)=0.5$. Find $Pr(P\mid Q)$. Explain how you arrived at your answer.
  3. Suppose that $Pr(E)=0.85$, $Pr(E\mid H)=0.8$, $Pr(Q\mid\neg P)=0.5$. Find $Pr(P\mid Q)$. Explain how you arrived at your answer.
  4. Suppose that you know that it rains on 10% of days, that it is cloudy in the morning on 20% of days, and that, when it rains in the afternoon, there were clouds in the morning 50% of the time. Suppose that you see clouds in the morning. What is the probability that it will rain in the afternoon?
  5. Suppose we have the following information about a gene defect: 1% of people have a certain genetic defect; 90% of tests for the gene detect the defect (true positives); and 9.6% of the tests are false positives.
    If a person gets a positive test result, what are the odds they actually have the genetic defect?