- We will start with the definition of measures of dispersion in statistics and why they are important.
- Then we will cover the standard measures of dispersion, the range and the standard deviation, followed by a step-by-step example of how to calculate the standard deviation.
- Finally, we will explore how the measure of dispersion is calculated for ordinal data and why this differs from interval and ratio data.
Definition of Measures of Dispersion in Statistics
A measure of dispersion describes the spread of scores in a data set: the extent to which the values vary around the central or average value. Let's take a look at an example.
Imagine that you are a first-year university student, and a friend asks you about the ages of people in your psychology course. You might say: 'Well, most people are 18, a few are in their 20s, and two or three are over 40.'
This answer describes the dispersion of ages in the course, i.e. how much they vary from the typical age of 18 (a few in their 20s, two or three over 40).
The values in a low dispersion data set do not vary much, e.g., 20, 22, 23, 24, 25, 27, 28. In a high dispersion data set, there is a lot of variation, e.g., 9, 10, 14, 26, 35, 37, 39. Researchers often prefer low variation, because scores that cluster closely around the average make that average a more representative summary of the data.
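To make the contrast concrete, the two example data sets can be compared numerically. The following is a minimal sketch using Python's standard `statistics` module; the helper name `describe` is chosen here for illustration only.

```python
import statistics

# The two example data sets from the text.
low = [20, 22, 23, 24, 25, 27, 28]
high = [9, 10, 14, 26, 35, 37, 39]

def describe(values):
    """Return (range, sample standard deviation) for a list of numbers."""
    value_range = max(values) - min(values)
    sd = statistics.stdev(values)  # sample SD, divides by n - 1
    return value_range, sd

low_range, low_sd = describe(low)
high_range, high_sd = describe(high)

print(f"low dispersion:  range={low_range}, SD={low_sd:.2f}")
print(f"high dispersion: range={high_range}, SD={high_sd:.2f}")
```

Both the range and the standard deviation come out larger for the second data set, which is what "high dispersion" means in practice.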
The Importance of Measures of Dispersion
Measures of dispersion are necessary because if we don't know the dispersion, a mean value can be misleading.
Suppose there are two companies, and analysts compared their employees' wages. The average wage might be the same, but the variation or dispersion of the wages might be very different.
In Company A, all workers earn similar wages. However, Company B has a large gap between the lowest-paid and the highest-paid employees.
Additionally, measures of dispersion make it easier to see whether a data set contains many outliers. If many values in a data set differ greatly from the average, this can sometimes be a problem. For instance, in research testing the effectiveness of an intervention, a lot of variation in participants' results suggests the intervention may not be consistently effective.
These examples highlight the practical value of measures of dispersion and how they can help researchers understand more about their findings.
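The two-company comparison can be sketched in a few lines of Python. The wage figures below are hypothetical, invented purely for illustration; they are chosen so that both companies share the same mean.

```python
import statistics

# Hypothetical monthly wages (illustrative values, not real data).
company_a = [2900, 3000, 3000, 3100, 3000]
company_b = [1500, 1800, 2000, 2700, 7000]

for name, wages in [("Company A", company_a), ("Company B", company_b)]:
    mean = statistics.mean(wages)
    wage_range = max(wages) - min(wages)
    sd = statistics.stdev(wages)  # sample SD, divides by n - 1
    print(f"{name}: mean={mean}, range={wage_range}, SD={sd:.1f}")
```

The means are identical, so the mean alone would hide the difference; only the range and the SD show how unevenly wages are spread in Company B.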
Measures of Dispersion Examples: Range
The range is the easiest way to calculate dispersion. The range is calculated by subtracting the lowest number from the highest number in a data set.
If the highest value in a data set is 50 and the lowest value is 12, the range is 50 - 12 = 38.
The advantage of the range is that it is extremely easy to calculate and takes the most extreme values in the data set into account.
However, it has disadvantages. Because it depends only on the two most extreme scores, a single outlier can give a distorted picture of dispersion. Additionally, the range tells us little about how the values between the highest and lowest scores are spread.
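As a short illustration of both the calculation and the sensitivity to extreme scores, here is a minimal Python sketch. The list of scores is hypothetical; only its lowest value (12) and highest value (50) come from the example above.

```python
def value_range(values):
    """Range = highest value minus lowest value."""
    return max(values) - min(values)

# Hypothetical scores whose lowest value is 12 and highest is 50.
scores = [12, 25, 31, 38, 44, 50]
print(value_range(scores))  # 38

# One extreme score changes the range a lot,
# even though every other score is unchanged.
scores_with_outlier = scores + [120]
print(value_range(scores_with_outlier))  # 108
```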
Measures of Dispersion Examples: Standard Deviation
The standard deviation (SD) is normally used when the mean is the measure of central tendency. The SD measures how far, on average, the individual scores lie from the mean of the data set.
Statistics programs can normally calculate the SD, but it is good to see the maths and understand how the SD is calculated. This is the formula for calculating the SD:
SD = √( ∑(X - x̅)² / (n - 1) )
SD = standard deviation
∑ = sum of
X = each value in the data set
x̅ = the mean
n = number of values in the sample
Measures of Dispersion Psychology: Calculating Standard deviation
Let's break the calculation of the standard deviation down into steps.
Find the mean of the data set (x̅).
Subtract the mean from each value in the data set; this is the deviation from the mean (X - x̅).
Square each deviation.
Find the sum of the squared deviations (∑).
Divide this number by n-1 (the total number of values in the data set minus 1).
Find the square root of this number.
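The six steps above can be sketched as a small Python function. The function name `sample_sd` and the example scores are assumptions made for illustration; the logic follows the steps as listed, dividing by n-1.

```python
import math

def sample_sd(values):
    """Sample standard deviation, following the steps listed above."""
    n = len(values)
    mean = sum(values) / n                    # find the mean
    deviations = [x - mean for x in values]   # subtract the mean from each value
    squared = [d ** 2 for d in deviations]    # square each deviation
    total = sum(squared)                      # sum the squared deviations
    variance = total / (n - 1)                # divide by n - 1
    return math.sqrt(variance)                # take the square root

# Hypothetical scores, chosen only to exercise the function.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(sample_sd(scores), 2))  # 2.14
```

Writing each step on its own line keeps the code aligned with the manual procedure, which makes it easier to check a hand calculation against it.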
Let us try this with a data set. Suppose we have a data set of 48, 71, 34, 62, 54, and 43.
Find the mean: x̅ = (48 + 71 + 34 + 62 + 54 + 43) ÷ 6 = 52
Subtract the mean from each value in the data set:
48 - 52 = -4
71 - 52 = 19
34 - 52 = -18
62 - 52 = 10
54 - 52 = 2
43 - 52 = -9
Square each deviation: (-4)² = 16, 19² = 361, (-18)² = 324, 10² = 100, 2² = 4, (-9)² = 81
Find the sum of the squared deviations: 16 + 361 + 324 + 100 + 4 + 81 = 886
Divide this number by n-1: 886 / (6 - 1) = 886 / 5 = 177.2
Find the square root of this number: √177.2 ≈ 13.31
Thus the SD is approximately 13.31.
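A hand calculation like this is easy to get slightly wrong, so it can be worth cross-checking it with software. The sketch below uses Python's standard `statistics` module on the data set 48, 71, 34, 62, 54, 43; `statistics.stdev` divides by n-1, matching the steps above, while `statistics.pstdev` divides by n and is shown only for contrast.

```python
import statistics

data = [48, 71, 34, 62, 54, 43]

mean = statistics.mean(data)
deviations = [x - mean for x in data]
sum_of_squares = sum(d ** 2 for d in deviations)

print("mean:", mean)
print("deviations:", deviations)
print("sum of squared deviations:", sum_of_squares)

# Sample SD (divides by n - 1), as in the worked example.
sample_sd = statistics.stdev(data)
# Population SD (divides by n), shown for comparison only.
population_sd = statistics.pstdev(data)

print(f"sample SD: {sample_sd:.2f}")
print(f"population SD: {population_sd:.2f}")
```

The sample SD is always a little larger than the population SD for the same data, because it divides by the smaller number n-1.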
For A-Level psychology, you won't be asked to calculate the SD. However, you might be asked to interpret and explain the SD for a data set.
The advantages of the standard deviation are that it can be used to make estimates about the population, and that it is the most sensitive measure of dispersion because all values in the data set are considered. The researcher therefore gets a more accurate representation of the data set's dispersion than the range provides.
However, the SD can be easily distorted by extreme outliers, and calculating it by hand is laborious, especially for a large data set.
The Measure of Dispersion for Ordinal Data
We have mentioned the mean, but what happens when we can't calculate the mean? Research that collects ordinal data usually uses the median as the data set's centre point/average.
First, let's recap on what ordinal data is.
Ordinal data is categorical data with a meaningful order, but we don't know the exact distance/differences between the categories.
Let's consider socioeconomic status as an example. A questionnaire asking whether respondents are from the working, middle or upper class collects ordinal data. We know that a person from the upper class has higher status and more money than a person from the working class, but we can't tell by how much.
The mean can only be established for interval and ratio data, as we can identify the numerical differences between responses. Therefore, either the range or the standard deviation can be calculated for these data.
However, the mean cannot be established from ordinal data. Therefore, the range is usually used as the measure of dispersion for ordinal data.
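As a rough illustration of summarising ordinal data without a mean, the sketch below finds the median category and the lowest and highest categories observed. The responses are hypothetical, and the numeric ranks are only positions in the order (working < middle < upper), not measured distances, which is an assumption of this sketch.

```python
import statistics

# Ordered categories from lowest to highest; ranks are positions only.
CLASSES = ["working", "middle", "upper"]
RANK = {name: position for position, name in enumerate(CLASSES)}

# Hypothetical questionnaire responses (illustrative, not real data).
responses = ["middle", "working", "middle", "upper",
             "working", "middle", "middle"]

ranks = sorted(RANK[r] for r in responses)

# median_low always returns a value that occurs in the data,
# so the result maps back to a real category.
median_class = CLASSES[statistics.median_low(ranks)]
lowest_class = CLASSES[ranks[0]]
highest_class = CLASSES[ranks[-1]]

print("median category:", median_class)
print("range of categories:", lowest_class, "to", highest_class)
```

Reporting the range as "working to upper" rather than as a number avoids implying that the gaps between categories are equal.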
Measures of Dispersion - Key takeaways
- The measure of dispersion is the measure of the spread of scores in a data set. It is the extent to which the values vary around the central or average value.
- In a low dispersion data set, the values do not vary much. In a high dispersion data set, there is a lot of variation. Researchers often prefer low variation, because the average is then a more representative summary of the data.
- Common measures of dispersion are the range and the standard deviation.
- For ordinal data, the range is usually used as the measure of dispersion; for interval or ratio data, either the standard deviation or the range can be used.
Content Creation Process:
Lily Hulatt is a Digital Content Specialist with over three years of experience in content strategy and curriculum design. She gained her PhD in English Literature from Durham University in 2022, taught in Durham University’s English Studies Department, and has contributed to a number of publications. Lily specialises in English Literature, English Language, History, and Philosophy.
Content Quality Monitored by:
Gabriel Freitas is an AI Engineer with a solid experience in software development, machine learning algorithms, and generative AI, including large language models’ (LLMs) applications. Graduated in Electrical Engineering at the University of São Paulo, he is currently pursuing an MSc in Computer Engineering at the University of Campinas, specializing in machine learning topics. Gabriel has a strong background in software engineering and has worked on projects involving computer vision, embedded AI, and LLM applications.