Statistical dispersion, or spread, is the degree to which the values of a distribution lie far from, or close to, a measure of central position, usually the arithmetic mean.
For this reason, dispersion measures normally accompany that mean or average.
They report the variability of the data around it. The higher their values, as we will see below, the greater the statistical dispersion.
Importance of statistical dispersion
When we want to carry out a descriptive analysis, we first calculate the summary measures of position. The most common are the mean, median and mode, along with quantiles such as quartiles, quintiles, deciles and percentiles. We also need to know the statistical dispersion.
Dispersion measures provide very relevant information. If the dispersion is very high, the mean is no longer representative of the group as a summary measure. For that reason, the two are normally reported together.
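To make this point concrete, here is a minimal sketch using two samples invented purely for illustration. Both have the same mean, yet the mean describes one group far better than the other. The names `tight` and `wide` are hypothetical, and the population standard deviation from Python's standard library is assumed as the dispersion measure.

```python
from statistics import mean, pstdev

# Hypothetical samples, invented for illustration; both average 50.
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

for name, data in (("tight", tight), ("wide", wide)):
    # pstdev is the population standard deviation (divides by n).
    print(f"{name}: mean={mean(data)}, std={pstdev(data):.2f}")
```

The means coincide, but the standard deviation of `wide` is many times larger, so its mean says much less about any individual value.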
Statistical dispersion measures
There are several measures of dispersion. Let's see a summary of the most relevant ones. We have analyzed them in more detail here.
- Range: It is simply the difference between the largest and the smallest value of the distribution.
- Average deviation: It is the average of the absolute deviations of each value from the mean. The absolute value is needed because the signed deviations always sum to zero.
- Variance and standard deviation: They are the best-known dispersion measures. The standard deviation, which is the square root of the variance, is the one usually used because it is easier to interpret: it is expressed in the same units as the data, whereas the variance is in squared units. Both are absolute measures.
- Coefficient of variation: It is calculated as the standard deviation divided by the mean and is used for comparisons, since it is expressed in relative terms (%).
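The measures listed above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function name `dispersion_summary` and the sample values are hypothetical, and the population formulas (dividing by n rather than n - 1) are assumed, since the text does not specify which variant it uses.

```python
from math import sqrt

def dispersion_summary(values):
    """Return the main dispersion measures of a list of numbers.

    Assumes population formulas (divide by n) and a non-zero mean.
    """
    n = len(values)
    mean = sum(values) / n
    # Range: largest value minus smallest value.
    value_range = max(values) - min(values)
    # Average deviation: mean of the absolute deviations from the mean.
    avg_deviation = sum(abs(x - mean) for x in values) / n
    # Variance: mean of the squared deviations from the mean.
    variance = sum((x - mean) ** 2 for x in values) / n
    # Standard deviation: square root of the variance.
    std_dev = sqrt(variance)
    # Coefficient of variation: standard deviation relative to the mean, in %.
    coef_variation = 100 * std_dev / mean
    return {
        "mean": mean,
        "range": value_range,
        "average_deviation": avg_deviation,
        "variance": variance,
        "standard_deviation": std_dev,
        "coefficient_of_variation": coef_variation,
    }

# Hypothetical sample, invented for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
summary = dispersion_summary(sample)
for measure, value in summary.items():
    print(f"{measure}: {value}")
```

For this sample the mean is 5, the variance 4 and the standard deviation 2, which gives a coefficient of variation of 40%. Swapping `n` for `n - 1` in the variance would give the sample version instead.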
Statistical dispersion example
Finally, let's look at an example of ten fictitious countries and their GDP.
We can see that they differ widely in their GDP, from the largest, with 7,000 million units, to the smallest, with 2,500 million.
We see that the mean is almost 4,500 million, but the dispersion measures are very high. The average deviation is almost 1,500 million units. The variance is hard to interpret on its own, but it allows us to calculate the standard deviation, which is also almost 1,500 million units. Finally, the coefficient of variation is almost 33%.
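As a quick consistency check, the range and the coefficient of variation can be recomputed from the rounded figures quoted in this example. This sketch uses only those approximate values ("almost 4,500" and "almost 1,500"), not the underlying country data, so the result is itself approximate. The variable names are hypothetical.

```python
# Rounded figures taken from the example, in millions of units.
largest_gdp = 7000
smallest_gdp = 2500
mean_gdp = 4500   # "almost 4,500"
std_gdp = 1500    # "almost 1,500"

# Range: largest value minus smallest value.
gdp_range = largest_gdp - smallest_gdp

# Coefficient of variation: standard deviation over the mean, in %.
cv = 100 * std_gdp / mean_gdp

print(f"range: {gdp_range} million")
print(f"coefficient of variation: {cv:.1f}%")
```

The rounded inputs give a coefficient of variation of about 33%, in line with the value reported in the example, and a range as large as the mean itself.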
We can say that the statistical dispersion is very high and the mean is not representative. Here this can be verified by eye, because there are few data points and we can see countries with a high GDP and others with a low one. But imagine the 194 countries recognized by the UN: there these measures are quite useful, right?