For distributions like the Normal, it is easy to estimate the probability that a r.v. falls in the "tail" of the distribution via the 68-95-99.7 rule. However, how do we get an upper bound on this probability for an arbitrary distribution?

There are two important inequalities that provide upper bounds on tail probabilities.

Markov’s inequality

Let $X$ be a nonnegative r.v. with finite expectation. For any constant $a > 0$, we have

$$P(X \ge a) \le \frac{E[X]}{a}.$$
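As a quick sanity check of Markov's inequality, the sketch below simulates a nonnegative r.v. (an Exponential(1), chosen purely for illustration, with $E[X] = 1$) and compares the empirical tail probability against the bound $E[X]/a$:

```python
import random

# Empirically check Markov's inequality P(X >= a) <= E[X]/a
# for a nonnegative r.v.; X ~ Exponential(1) is an illustrative choice.
random.seed(0)
n = 100_000
samples = [random.expovariate(1.0) for _ in range(n)]
mean = sum(samples) / n  # should be close to the true mean, 1.0

for a in (1.0, 2.0, 4.0):
    tail = sum(x >= a for x in samples) / n   # empirical P(X >= a)
    bound = mean / a                          # Markov's upper bound
    print(f"a={a}: P(X >= a) ~ {tail:.4f} <= E[X]/a ~ {bound:.4f}")
```

Note that Markov's bound is often loose: for the exponential the true tail $e^{-a}$ decays much faster than $1/a$.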

Chebyshev’s inequality

Suppose $X$ is a r.v. with $E[X] = \mu$ and $\mathrm{Var}(X) = \sigma^2$. For any constant $a > 0$, we have

$$P(|X - \mu| \ge a) \le \frac{\sigma^2}{a^2}.$$

To prove this, we first square both sides of $|X - \mu| \ge a$, giving us $(X - \mu)^2 \ge a^2$. Since both sides are nonnegative, these two events are equivalent, so their probabilities are equal. Applying Markov's inequality to the nonnegative r.v. $(X - \mu)^2$, we get

$$P(|X - \mu| \ge a) = P\big((X - \mu)^2 \ge a^2\big) \le \frac{E[(X - \mu)^2]}{a^2} = \frac{\sigma^2}{a^2}.$$
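The bound can also be checked exactly on a small discrete distribution. The sketch below uses a fair six-sided die (an illustrative choice) and computes both sides of Chebyshev's inequality directly:

```python
import statistics

# Check Chebyshev's inequality P(|X - mu| >= a) <= sigma^2 / a^2
# exactly for a fair six-sided die (illustrative choice).
outcomes = [1, 2, 3, 4, 5, 6]
mu = statistics.fmean(outcomes)       # 3.5
var = statistics.pvariance(outcomes)  # 35/12, about 2.9167

for a in (1.0, 2.0, 2.5):
    # Each outcome is equally likely, so the tail probability is a count / 6.
    tail = sum(abs(x - mu) >= a for x in outcomes) / len(outcomes)
    bound = var / a**2
    print(f"a={a}: P(|X - mu| >= a) = {tail:.4f} <= {bound:.4f}")
```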

Alternate form

Assume that $\sigma > 0$. Then for any constant $c > 0$, we have

$$P(|X - \mu| \ge c\sigma) \le \frac{1}{c^2}.$$

This is useful because it sets up inequalities in the same spirit as the 68-95-99.7 rule. For example, we have

$$P(|X - \mu| \ge 2\sigma) \le \frac{1}{4}, \qquad P(|X - \mu| \ge 3\sigma) \le \frac{1}{9}.$$

If $X \sim \mathcal{N}(\mu, \sigma^2)$, then we can see that both of the above hold: being within 2 standard deviations of the mean covers about 95% of the probability, so we fall outside with probability about 0.05, which is indeed less than $1/4$. Similarly, the probability of being more than 3 standard deviations away is about 0.003, which is less than $1/9$.
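This comparison can be made precise with the standard normal CDF. The sketch below computes the exact two-sided tail $P(|X - \mu| \ge c\sigma) = 2(1 - \Phi(c))$ using `math.erf` and compares it with Chebyshev's bound $1/c^2$:

```python
import math

def normal_two_sided_tail(c):
    """Exact P(|X - mu| >= c*sigma) for a Normal r.v., via the standard
    normal CDF Phi(c) = (1 + erf(c / sqrt(2))) / 2."""
    phi = 0.5 * (1 + math.erf(c / math.sqrt(2)))
    return 2 * (1 - phi)

for c in (2, 3):
    exact = normal_two_sided_tail(c)
    bound = 1 / c**2
    print(f"c={c}: exact tail ~ {exact:.4f} <= Chebyshev bound {bound:.4f}")
```

For the Normal, Chebyshev's bound is quite loose (0.25 vs. about 0.046 at $c = 2$), which is the price of a bound that holds for every distribution with finite variance.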