#### Probability distribution

Every random variable has a distribution. Distributions can take many forms, and each field of practice tends to use the one that serves it best. To build intuition, we can picture a distribution as a tree diagram of a random variable with a slightly changed shape.

The **probability distribution** of a discrete random variable is a list of the probabilities associated with each of its possible values.

**Example.** Let's toss two fair coins once. Let $X$ be the number of tails.

Let $\Omega$ be the set of all possible outcomes, $\Omega=\{HH, HT, TH, TT\}$. The probability of each outcome $\omega \in \Omega$ is $P(\omega)=\frac{1}{4}$. For $X$ we have 3 possibilities:

1. we got 0 tails

2. we got 1 tail

3. we got 2 tails

Let's calculate the probability of each one:

- The only outcome where we got no tails is $\{HH\}$ so let’s calculate its probability $$P(X=0)=P(\{HH\})=\frac{1}{4}$$
- Now for 1 tail, we have two outcomes $\{HT, TH\}$ $$P(X=1)=P(\{HT, TH\})=\frac{1}{2}$$
- For 2 tails, there is only one outcome $\{TT\}$ $$P(X=2)=P(\{TT\})=\frac{1}{4}$$

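Consequently, the distribution table looks like this:

$$\begin{array}{c|ccc} x & 0 & 1 & 2 \\ \hline P(X=x) & \frac{1}{4} & \frac{1}{2} & \frac{1}{4} \end{array}$$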

In the top row we put the values of $X$ we were looking for. In the bottom row we put the probability of each value.
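The calculation above can also be checked by enumerating the sample space. The following is a minimal sketch, assuming fair coins; the function name `tails_distribution` is hypothetical and chosen only for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def tails_distribution(n_coins=2):
    """Return {number of tails: probability} for n_coins fair coins."""
    # Sample space: every sequence of H/T of length n_coins, all equally likely.
    outcomes = list(product("HT", repeat=n_coins))
    # Count how many outcomes give each value of X (the number of tails).
    counts = Counter(outcome.count("T") for outcome in outcomes)
    return {k: Fraction(n, len(outcomes)) for k, n in sorted(counts.items())}


for value, prob in tails_distribution().items():
    print(value, prob)
```

Exact fractions are used instead of floats so that the results can be compared directly with the hand calculation.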

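Generally, for $X(\Omega)=\{a_{1}, a_{2}, \dots, a_{n}, \dots \}$ and $P(X=a_{j})=p_{j}$, for $j=1, 2, 3, \dots$, the probability distribution looks like this:

$$\begin{array}{c|ccccc} x & a_{1} & a_{2} & \dots & a_{n} & \dots \\ \hline P(X=x) & p_{1} & p_{2} & \dots & p_{n} & \dots \end{array}$$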

Another way of representing a probability distribution is with a graph. The most popular option is the **probability histogram**, but any suitable chart can be used. The histogram of the previous example has three bars, at $x=0$, $x=1$ and $x=2$, with heights $\frac{1}{4}$, $\frac{1}{2}$ and $\frac{1}{4}$.

The $x$-axis shows the values of the random variable $X$, while the $y$-axis shows the probability of each value.
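As a rough illustration of a probability histogram, the sketch below draws the two-coin distribution as text, one bar per value. The helper `text_histogram` and its `width` parameter are assumptions made for this example, not part of any library.

```python
from fractions import Fraction

# Distribution of X (number of tails when two fair coins are tossed).
distribution = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}


def text_histogram(dist, width=20):
    """Return one line per value; bar length is proportional to its probability."""
    lines = []
    for value, prob in sorted(dist.items()):
        bar = "#" * round(prob * width)  # a probability of 1 would fill `width` characters
        lines.append(f"{value} | {bar} {prob}")
    return lines


print("\n".join(text_histogram(distribution)))
```

Here the bars run horizontally, so the roles of the two axes are swapped compared with the usual picture.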

Other important characteristics of a random variable are its mean, which is a measure of central location, and its variance and standard deviation, which are measures of spread.

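**Example.** A random variable $X$ has the distribution

$$\begin{array}{c|ccccccc} x & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ \hline P(X=x) & c & 2c & 2c & 3c & c^{2} & 2c^{2} & 7c^{2}+c \end{array}$$

for some $c\in \mathbb{R}$.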

**(a)** Find $c$.

**(b)** Calculate the probability that $X$ takes a value between 2 and 5 (both included).

**(c)** Find the smallest $k\in \mathbb{N}$ such that $P(X \leq k) \geq \frac{2}{5}$.

**(a)** Since $\sum_{i \in \mathbb{N}} p_{i}=1$, we have:

$c+2c+2c+3c+c^{2}+2c^{2}+(7c^{2}+c)=1$

$10c^{2}+9c-1=0$

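When we solve this quadratic equation, we get $c_{1}=-1$ and $c_{2}=\frac{1}{10}$. However, since a probability can't be negative, the only valid solution is $c=\frac{1}{10}$. As a result, the distribution of $X$ looks like this:

$$\begin{array}{c|ccccccc} x & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ \hline P(X=x) & \frac{1}{10} & \frac{2}{10} & \frac{2}{10} & \frac{3}{10} & \frac{1}{100} & \frac{2}{100} & \frac{17}{100} \end{array}$$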
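The choice of root can be double-checked numerically. This is a small sketch using exact fractions; the variable names are illustrative, and the seven probabilities are listed in the order in which they appear in the sum above.

```python
from fractions import Fraction
from math import isqrt

# Coefficients of 10c^2 + 9c - 1 = 0.
a, b, const = 10, 9, -1
disc = b * b - 4 * a * const      # discriminant
root = isqrt(disc)                # integer square root
assert root * root == disc        # the discriminant is a perfect square here

candidates = [Fraction(-b - root, 2 * a), Fraction(-b + root, 2 * a)]

# Keep only the root for which every probability is non-negative.
valid = []
for c in candidates:
    probabilities = [c, 2 * c, 2 * c, 3 * c, c**2, 2 * c**2, 7 * c**2 + c]
    if all(p >= 0 for p in probabilities):
        valid.append((c, probabilities))

c, probabilities = valid[0]
print(c, sum(probabilities))
```

Both roots make the probabilities sum to 1, so the non-negativity check is what actually rules out $c=-1$.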

**(b)** $P(2 \leq X \leq 5)=P(X \in \{2, 3, 4, 5\})$. Since the events $X=2$, $X=3$, $X=4$ and $X=5$ are mutually exclusive, this probability equals

$$P(X=2)+P(X=3)+P(X=4)+P(X=5)=\frac{2}{10} + \frac{2}{10} + \frac{3}{10} + \frac{1}{100} = \frac{71}{100}.$$
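The same sum can be written as a short function. This is only a sketch: `interval_probability` is a hypothetical helper, and the dictionary holds just the four probabilities used in part (b).

```python
from fractions import Fraction

# Probabilities of the values 2, 3, 4 and 5, with c = 1/10.
pmf = {
    2: Fraction(2, 10),
    3: Fraction(2, 10),
    4: Fraction(3, 10),
    5: Fraction(1, 100),
}


def interval_probability(pmf, low, high):
    """P(low <= X <= high): add the probabilities of all values in the interval."""
    return sum(p for x, p in pmf.items() if low <= x <= high)


print(interval_probability(pmf, 2, 5))  # 71/100
```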

**(c)** First, let's try $k=1$.

$\displaystyle{P(X \leq 1)=P(X=1)=\frac{1}{10}}$

However, that is not greater than or equal to $\displaystyle{\frac{2}{5}}$.

Next, for $k=2$:

$\displaystyle{P(X \leq 2)=P(X=1)+P(X=2)=\frac{1}{10} + \frac{2}{10}=\frac{3}{10}}.$

Once again, the result isn't greater than or equal to $\displaystyle{\frac{2}{5}}$.

For $k=3$ we have:

$\displaystyle{P(X \leq 3)=P(X=1)+P(X=2)+P(X=3)=\frac{1}{10} + \frac{2}{10} + \frac{2}{10}= \frac{5}{10}=\frac{1}{2} > \frac{2}{5}}$

As a result, $k=3$.
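The trial-and-error search in part (c) amounts to accumulating probabilities until the threshold is reached. The sketch below assumes that $X$ takes the values 1 through 7; the values 6 and 7 attached to the last two probabilities are an assumption, and `smallest_k` is a hypothetical name.

```python
from fractions import Fraction

c = Fraction(1, 10)
# Values 1..5 follow the worked solution; 6 and 7 are assumed labels
# for the last two probabilities.
pmf = {1: c, 2: 2 * c, 3: 2 * c, 4: 3 * c, 5: c**2, 6: 2 * c**2, 7: 7 * c**2 + c}


def smallest_k(pmf, threshold):
    """Return the smallest k with P(X <= k) >= threshold, or None if there is none."""
    cumulative = Fraction(0)
    for k in sorted(pmf):
        cumulative += pmf[k]          # cumulative now equals P(X <= k)
        if cumulative >= threshold:
            return k
    return None


print(smallest_k(pmf, Fraction(2, 5)))  # 3
```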

#### Probability function

Another way of representing a probability distribution is with a function that gives the probability that a discrete random variable is exactly equal to some value. That function is called the **probability function** or **probability mass function**, abbreviated **PMF**.

For a discrete random variable $X$ it is defined as $f_{X}(x)=P(X=x)$.

**Example.** Suppose $\Omega$ is the sample space of all outcomes of a single toss of a fair coin, and $X$ is the random variable defined on $\Omega$ that assigns 0 to the category “tails” and 1 to the category “heads”. Since the coin is fair, the probability mass function is $$f_{X}(x)=\begin{cases} \frac{1}{2}, & x \in \{0,1\},\\ 0, & \text{otherwise}.\end{cases}$$

Similarly, for rolling a fair die: $$f_{X}(x)=\begin{cases} \frac{1}{6}, & x \in \{1,2,3,4,5,6\},\\ 0, & \text{otherwise}.\end{cases}$$

Note that $\sum_{x\in X(\Omega)} f_{X}(x)=1$.
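A PMF translates directly into a function in code. The sketch below builds the two PMFs from the examples above and checks that each sums to 1 over its set of possible values; `make_uniform_pmf` is a hypothetical helper that covers only the equally likely case.

```python
from fractions import Fraction


def make_uniform_pmf(support):
    """Return f(x) = 1/n for x in the support (n equally likely values), else 0."""
    support = frozenset(support)
    p = Fraction(1, len(support))

    def pmf(x):
        return p if x in support else Fraction(0)

    return pmf


coin_pmf = make_uniform_pmf({0, 1})        # 0 = tails, 1 = heads
die_pmf = make_uniform_pmf(range(1, 7))    # faces 1..6

print(coin_pmf(1), die_pmf(4), die_pmf(7))
print(sum(die_pmf(x) for x in range(1, 7)))
```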

However, describing the distribution of a continuous random variable is a little more complicated, and that will be explained in another lesson.