For distributions like the Normal, it is easy to estimate the probability that a r.v. falls in the “tail end” of the distribution via $\sigma^2$ and the 68-95-99.7 rule. However, how do we get an estimate of this probability for *any arbitrary* distribution?

There are two important inequalities that provide us *upper bounds* on the tail probabilities.

## Markov’s inequality

Let $X$ be a r.v. with a finite expectation. For any constant $a>0$, we have

$$P(|X| \ge a) \le \frac{E(|X|)}{a}$$

## Chebyshev’s inequality
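Before using Markov’s inequality in the proof below, we can sanity-check it numerically. A minimal sketch (assumptions mine: $X \sim \text{Exponential}(1)$, for which $E(|X|) = 1$ and the tail probability $P(X \ge a) = e^{-a}$ is known exactly):

```python
import math

# Markov check for X ~ Exponential(rate=1):
# X is nonnegative, E(|X|) = 1, and P(X >= a) = exp(-a) exactly.
def markov_bound(expected_abs, a):
    """Upper bound on P(|X| >= a) given by Markov's inequality."""
    return expected_abs / a

for a in [1, 2, 5]:
    exact_tail = math.exp(-a)     # true tail probability for Exponential(1)
    bound = markov_bound(1.0, a)  # E(|X|) / a
    assert exact_tail <= bound
    print(f"a={a}: P(X >= a) = {exact_tail:.4f} <= bound = {bound:.4f}")
```

The bound is loose (e.g. $e^{-5} \approx 0.007$ against a bound of $0.2$), but it never fails, and it requires knowing nothing about the distribution beyond its expectation.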

Suppose $X$ is a r.v. with $E(X)=\mu$ and $Var(X)=\sigma^2$. For any constant $a>0$, we have

$$P(|X - E(X)| \ge a) \le \frac{Var(X)}{a^2} \iff P(|X - \mu| \ge a) \le \frac{\sigma^2}{a^2}$$

To prove this, square both sides of the inequality inside the probability: $|X - \mu| \ge a$ holds exactly when $(X - \mu)^2 \ge a^2$, so the two events are the same and their probabilities are equal. Applying Markov’s inequality to the nonnegative r.v. $(X - \mu)^2$, we get

$$P(|X - \mu| \ge a) = P\big((X - \mu)^2 \ge a^2\big) \le \frac{E\big[(X - \mu)^2\big]}{a^2} = \frac{\sigma^2}{a^2}$$

### Alternate form

Assume that $\sigma^2 = Var(X) > 0$. Then for any constant $c>0$, we have

$$P(|X - \mu| \ge c\sigma) \le \frac{1}{c^2}$$

This follows from Chebyshev’s inequality by setting $a = c\sigma$. It is useful because it sets up an inequality similar to the 68-95-99.7 rule. For example, we have

- $P(|X - \mu| \ge 2\sigma) \le \frac{1}{4} = 0.25$
- $P(|X - \mu| \ge 3\sigma) \le \frac{1}{9} \approx 0.11$
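To see that these bounds really hold for an arbitrary distribution, here is a small Monte Carlo sketch (assumptions mine: I use $X \sim \text{Exponential}(1)$, a decidedly non-normal distribution with $\mu = 1$ and $\sigma = 1$):

```python
import random

# Monte Carlo check of Chebyshev's inequality for X ~ Exponential(1),
# which has mu = 1 and sigma = 1.
random.seed(0)
mu, sigma, n = 1.0, 1.0, 200_000
samples = [random.expovariate(1.0) for _ in range(n)]

for c in [2, 3]:
    # Empirical estimate of P(|X - mu| >= c * sigma)
    tail = sum(abs(x - mu) >= c * sigma for x in samples) / n
    bound = 1 / c**2
    assert tail <= bound
    print(f"c={c}: empirical tail {tail:.4f} <= bound {bound:.4f}")
```

For this distribution the true tails ($e^{-3} \approx 0.050$ and $e^{-4} \approx 0.018$) sit well under the Chebyshev bounds of $0.25$ and $0.11$, illustrating that the bound is valid, if often loose.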

If $X \sim \mathcal{N}(\mu, \sigma^2)$, then we can see that both of the above hold: $X$ falls within 2 standard deviations of its mean with probability about 0.95, so it is outside $2\sigma$ with probability about 0.05, which is indeed less than $\frac{1}{4}$. Similarly, $X$ is outside $3\sigma$ with probability about 0.003, which is less than $\frac{1}{9}$.
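The normal-case comparison can be checked exactly rather than by eyeballing the 68-95-99.7 rule: for a normal r.v., $P(|X - \mu| \ge c\sigma) = \operatorname{erfc}(c/\sqrt{2})$. A sketch using only the standard library:

```python
import math

def normal_two_sided_tail(c):
    """Exact P(|X - mu| >= c*sigma) for a normal r.v.: erfc(c / sqrt(2))."""
    return math.erfc(c / math.sqrt(2))

for c in [2, 3]:
    exact = normal_two_sided_tail(c)
    chebyshev = 1 / c**2
    assert exact <= chebyshev
    print(f"c={c}: exact normal tail {exact:.4f} <= Chebyshev bound {chebyshev:.4f}")
```

The exact tails come out to about 0.0455 and 0.0027, comfortably below the Chebyshev bounds of 0.25 and 0.11 — the price Chebyshev pays for working on *any* distribution with finite variance.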