Quantiles are a very useful tool in statistics. We use them to summarize a group of numbers: instead of working with a long list of values, we focus on a few numbers that describe it.
Intuitively speaking, ‘quantile’ means that the sample is divided into equal-sized parts.
Definition
Let $y_{1}, \cdots, y_{N}$ be an ordered statistical sequence, i.e. $y_{1} \leq \cdots \leq y_{N}$. Let $Int(x)$ denote the integer part of $x$ and, for $j = 1, \cdots, n-1$, let us denote
$$r = Int \left(j\frac{N}{n} + 1\right).$$
Quantiles of order $n$ are the values $K_{1}, \cdots, K_{n-1}$, which we calculate using the following formula:
$$K_{j} = \begin{cases} y_{r} & \text{if $j\frac{N}{n}\notin \mathbf{N}$} \\ \frac{y_{r-1} + y_{r}}{2} & \text{if $j \frac{N}{n} \in \mathbf{N}$} \end{cases}, j = 1, \cdots, n-1.$$
Quantiles of order $n$ determine $n$ intervals: $\left[y_{1}, K_{1}\right>, \left<K_{1}, K_{2}\right>, \cdots, \left<K_{n-1}, y_{N}\right]$.
Furthermore, each of these intervals contains at most $\frac{100}{n}\%$ of the values of the sequence.
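The definition above can be sketched in Python as follows. This is only a minimal illustration: the function name `quantiles` and the sample list are assumptions made for the example, and $Int$ is taken to be the integer part, computed here with integer division.

```python
def quantiles(data, n):
    """Return [K_1, ..., K_{n-1}] following the definition above."""
    y = sorted(data)  # ordered sequence y_1 <= ... <= y_N
    N = len(y)
    result = []
    for j in range(1, n):
        # q = Int(j*N/n); rem == 0 exactly when j*N/n is a natural number
        q, rem = divmod(j * N, n)
        if rem == 0:
            # r = q + 1 and K_j = (y_{r-1} + y_r) / 2 (0-based indices q-1 and q)
            result.append((y[q - 1] + y[q]) / 2)
        else:
            # r = q + 1 and K_j = y_r (0-based index q)
            result.append(y[q])
    return result


# Illustrative data (not from the text): quartiles of 1, ..., 8
print(quantiles([1, 2, 3, 4, 5, 6, 7, 8], 4))  # [2.5, 4.5, 6.5]
```

Using `divmod` on integers avoids floating-point rounding when deciding whether $j\frac{N}{n}$ is a natural number.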
Special types of quantiles
The quantile of order $2$ is the median, the quantiles $Q_{1}, Q_{2}, Q_{3}$ of order $4$ are called quartiles, quantiles of order $10$ are called deciles and quantiles of order $100$ are called percentiles.
In other words, quartiles divide the distribution into $4$ equal parts, deciles into $10$ equal parts and percentiles into $100$ equal parts.
Notice that the second quartile always corresponds to the median of the given set.
$Q_{1}$ is called the lower quartile and $Q_{3}$ the upper quartile.
The lower quartile is the middle value of the lower half.
The upper quartile is the middle value of the upper half.
The deciles are $9$ values which split the data set into $10$ equal-sized parts.
Quartiles are special cases of percentiles. The $25$th percentile is also called the first quartile. The $50$th percentile is also called the median. The $75$th percentile is also called the third quartile.
The percentiles of a distribution are $99$ values which split the data set into $100$ equal-sized parts. A percentile tells us which value is higher than a given percentage of the rest of the data set. For instance, the "$60$th percentile" is the value that is higher than $60 \%$ of the other given numbers.
Percentiles are often used to report test scores. For example, if you are at the $70$th percentile, it means that your score was better than that of $70 \%$ of test takers.
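To make the test-score interpretation concrete, here is a small Python sketch that computes the percentage of scores lying strictly below a given score. The function name `percent_below` and the list of scores are made up for illustration only.

```python
def percent_below(scores, value):
    """Percentage of the scores that are strictly lower than `value`."""
    below = sum(1 for s in scores if s < value)
    return 100 * below / len(scores)


# Hypothetical test scores (not from the text)
scores = [55, 60, 62, 70, 71, 75, 80, 84, 90, 95]
print(percent_below(scores, 84))  # 70.0
```

In this made-up list, a score of $84$ is better than $70 \%$ of the scores, so it sits at the $70$th percentile in the sense described above.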
In addition, here is the list of some other specific quantiles:
Terciles – quantiles of order $3$
Quintiles – quantiles of order $5$
Sextiles – quantiles of order $6$
Septiles – quantiles of order $7$
Octiles – quantiles of order $8$
Duodeciles – quantiles of order $12$
Vigintiles – quantiles of order $20$
Permilles – quantiles of order $1000$
Examples
Example 1: Find the quartiles for the following data: $-1, -3, 0, -1, -1, 5, 0, -3, 1, 2, 3, 3$.
Solution:
First, we need to put the list of given numbers in order: $-3, -3, -1, -1, -1, 0, 0, 1, 2, 3, 3, 5$.
Furthermore, $N = 12, n = 4$.
From $\frac{N}{4} = 3, 2\frac{N}{4} = 6, 3\frac{N}{4} = 9$ we get
$$Q_{1} = \frac{y_{3}+y_{4}}{2} = -1, Q_{2} = M_{e} = \frac{y_{6} + y_{7}}{2} = 0, Q_{3} = \frac{y_{9} + y_{10}}{2} = 2.5.$$
Example 2: Find the deciles $D_{1}, D_{3}$ and $D_{8}$ for the following data: $22, 20, 24, 30, 32, 28, 35$.
Solution:
First, we need to put the list of given numbers in order: $20, 22, 24, 28, 30, 32, 35$.
Furthermore, $N = 7, n = 10$.
From $\frac{N}{10} = \frac{7}{10} = 0.7 \notin \mathbf{N}$ we get $r = Int(0.7) + 1 = 0 + 1 = 1$ and
$$D_{1} = y_{1} = 20.$$
From $3\frac{N}{10} = 3\frac{7}{10} = 2.1 \notin \mathbf{N}$ we get $r = Int(2.1) + 1 = 2 + 1 = 3$ and
$$D_{3} = y_{3} = 24.$$
From $8\frac{N}{10} = 8\frac{7}{10} = 5.6 \notin \mathbf{N}$ we get $r = Int(5.6) + 1 = 5 + 1 = 6$ and
$$D_{8} = y_{6} = 32.$$
Quantiles for grouped data
If the distribution of a numeric variable is grouped into classes, then the $j$-th quantile class of order $n$ is defined as the first class $[L_{1}, L_{2}]$ whose cumulative frequency is greater than or equal to $j \frac{N}{n}$.
If $f_{kvant}$ is the frequency of the $j$-th quantile class, $l$ its size, and $F(L_{1})$ the cumulative frequency (the sum of all frequencies) before the $j$-th quantile class, then the $j$-th quantile is estimated by the value
$$K_{j} = L_{1} + \frac{j\frac{N}{n} - F(L_{1})}{f_{kvant}}l. \ (*) $$
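Formula (*) can be sketched in Python as shown below. The function name `grouped_quantile`, the assumption that all classes have the same size `width`, and the class data in the usage lines are hypothetical and serve only as an illustration.

```python
def grouped_quantile(lower_bounds, width, freqs, j, n):
    """Estimate the j-th quantile of order n from grouped data using (*).

    lower_bounds: lower class limits L_1, in increasing order
    width:        class size l (assumed equal for all classes)
    freqs:        class frequencies
    """
    N = sum(freqs)
    target = j * N / n
    cumulative = 0  # F(L_1): sum of the frequencies before the current class
    for L1, f in zip(lower_bounds, freqs):
        # The quantile class is the first one whose cumulative frequency
        # reaches j*N/n
        if cumulative + f >= target:
            return L1 + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("j must satisfy 1 <= j <= n - 1")


# Hypothetical classes [0, 10], [10, 20], [20, 30], [30, 40]
bounds = [0, 10, 20, 30]
freqs = [5, 15, 20, 10]
print(grouped_quantile(bounds, 10, freqs, 1, 4))  # 15.0
print(grouped_quantile(bounds, 10, freqs, 2, 4))  # 22.5
```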
Example 3: The salaries of the employees of a certain company are grouped into classes and shown in the table below. Find the first, second and third quartile classes and calculate the quartiles. Interpret the results.
Solution:
Notice that instead of (cumulative) frequencies $f_{i}$ we can observe (cumulative) percentages $p_{i} \cdot 100 = \frac{f_{i}}{N} \cdot 100$. Therefore, multiplying the numerator and denominator of the fraction in formula (*) by $\frac{100}{N}$, we get
$$K_{j} = L_{1} + \frac{j \frac{N}{n}\frac{100}{N} - \frac{F(L_{1})}{N}100}{\frac{f_{kvant}}{N}100}l = L_{1} + \frac{j \frac{100}{n} - \frac{F(L_{1})}{N}100}{p_{kvant}100}l,$$
where $p_{kvant}$ is the proportion of the $j$-th quantile class of order $n$, i.e. the proportion of the first class whose cumulative percentage is greater than or equal to $j \frac{100}{n}$.
From $\frac{100}{4} = 25, 2 \cdot \frac{100}{4} = 50, 3 \cdot \frac{100}{4} = 75$ we see that the first quartile class is $[1500.5, 1700.5]$, the median class is $[1700.5, 1900.5]$ and the third quartile class is $[1900.5, 2100.5]$. Each class has size $l = 200$. Furthermore,
$$Q_{1} = 1500.5 + \frac{25 - 21.7}{16.5} \cdot 200 = 1540.5$$
$$M_{e} = 1700.5 + \frac{50 - 38.2}{23.8} \cdot 200 \approx 1799.7$$
$$Q_{3} = 1900.5 + \frac{75 - 62}{14.9} \cdot 200 \approx 2075.$$
In conclusion, up to a quarter of the employees have a salary of less than $1540.5$ €, up to half of the employees have a salary of less than $1799.7$ €, while up to a quarter of the employees have a salary higher than $2075$ €.
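The arithmetic of Example 3 can be checked with a few lines of Python. The sketch below uses only the numbers that appear in the solution (lower class limits, cumulative percentages before each class, class percentages and class size $200$); the function name `quantile_from_percentages` is an assumption made for this illustration.

```python
def quantile_from_percentages(L1, target_pct, cum_pct_before, class_pct, width):
    """Percentage form of formula (*).

    L1:             lower limit of the quantile class
    target_pct:     j * 100 / n
    cum_pct_before: cumulative percentage before the quantile class
    class_pct:      percentage of values in the quantile class
    width:          class size l
    """
    return L1 + (target_pct - cum_pct_before) / class_pct * width


q1 = quantile_from_percentages(1500.5, 25, 21.7, 16.5, 200)
me = quantile_from_percentages(1700.5, 50, 38.2, 23.8, 200)
q3 = quantile_from_percentages(1900.5, 75, 62.0, 14.9, 200)

# Rounded to one decimal place, as in the solution above
print(round(q1, 1), round(me, 1), round(q3, 1))  # 1540.5 1799.7 2075.0
```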