- We will start by learning the definition of descriptive statistics.
- Moving on from this, we will look at descriptive statistics in psychology, and we will cover some descriptive statistics examples.
- Then, we will look at some descriptive statistical analyses used in psychology.
- Last, you will learn the difference between descriptive and inferential statistics.
Descriptive Statistics Definition
Descriptive statistics allow researchers to create a preliminary summary of the raw data. For this reason, they can also be referred to as summary statistics. Descriptive statistics are the first step in data analysis and provide valuable information for choosing the correct statistical test.
Descriptive statistics is a form of statistical analysis utilised to summarise a dataset.
As you can probably deduce from its name, descriptive statistics describe the main aspects of the data. They are beneficial as they provide researchers with information about potential relationships between variables and information regarding which statistical tests would be appropriate for testing the proposed hypothesis.
Researchers, however, do not draw conclusions from descriptive statistics alone. To go beyond description and make inferences, researchers use inferential statistics.
Descriptive statistics describe the data in a sample, whereas inferential statistics use that sample to draw conclusions about a larger population.
Descriptive Statistics: Psychology
Descriptive statistics are usually presented in tables or graphically, for example as frequency distributions, histograms, or bar charts.
Generally, descriptive statistics are used in psychology research to summarise datasets.
However, descriptive statistics cannot be used to make inferences or generalisations about broader populations.
Descriptive Statistical Analysis: Measures of Frequency Distribution
The frequency distribution describes the number of observations for each possible value of a variable. This information is often displayed in frequency tables.
Imagine a study looking into the relationship between two variables: hair colour and nationality. A frequency table would look like this:
Hair Colour | Frequency | | Nationality | Frequency |
Black | 7 | | Ireland | 5 |
Brown | 6 | | England | 15 |
Blonde | 14 | | Wales | 5 |
Ginger | 3 | | Scotland | 5 |
Total Sample | 30 | | Total Sample | 30 |
From such a table, researchers can state that 14 individuals within the sample were blonde and that 5 were Irish.
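The tallying behind a frequency table can be sketched in a few lines of Python. This is only an illustration: the list `hair_colours` is rebuilt from the counts in the table above (one entry per participant), and the variable names are assumptions made for this example.

```python
from collections import Counter

# Hair colour observations rebuilt from the frequency table above
# (one list entry per participant; the ordering is arbitrary).
hair_colours = ["Black"] * 7 + ["Brown"] * 6 + ["Blonde"] * 14 + ["Ginger"] * 3

# Counter tallies how many times each value occurs.
frequencies = Counter(hair_colours)

# Print each category with its frequency and its share of the sample.
for colour, count in frequencies.most_common():
    share = 100 * count / len(hair_colours)
    print(f"{colour:<7} {count:>2}  ({share:.1f}%)")

print("Total sample:", sum(frequencies.values()))
```

Adding the percentage column is a design choice rather than a requirement; relative frequencies make samples of different sizes easier to compare.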
Descriptive Statistical Analysis: Measures of Central Tendency
There are several measures of central tendency. Each gives a single value that represents the centre of the entire dataset, which is particularly useful for large datasets. The three most commonly used are the mean, median and mode.
- Mean: adding all the values together and dividing by the total number of values
- Median: placing the dataset values in numerical order and identifying the middle number (or, with an even number of values, the mean of the two middle numbers)
- Mode: most common value in the dataset
Let's consider an example to understand central tendency. Imagine a study looking into the relationship between exam performance and revision time.
The raw data of 10 participants may look like this:
Participant Number | Time Revised (in hours) |
1 | 6 |
2 | 3 |
3 | 7 |
4 | 5 |
5 | 6 |
6 | 9 |
7 | 5 |
8 | 6 |
9 | 6 |
10 | 4 |
The mean (M) is the number one gets by adding all values together and dividing the sum by the number of values. In this example, the mean number of hours the sample revised is:
(6 + 3 + 7 + 5 + 6 + 9 + 5 + 6 + 6 + 4) / 10 = 57 / 10 = 5.7 hours.
When more than one mean is reported, they are written as follows: The average revision time among medical students was higher (M = 8.7) than among philosophy students (M = 5.6).
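To find the median, the numbers need to be placed in numerical order; the median is the middle number. In this case, the ordered values are:
3, 4, 5, 5, 6, 6, 6, 6, 7, 9. Because there is an even number of values, the median is the mean of the two middle values (the fifth and sixth), both of which are 6. In this example, the median is therefore 6.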
The mode refers to the most frequent score in the data, which in this example is 6 because four participants reported it.
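The three calculations above can be checked with Python's standard `statistics` module. This is a minimal sketch: the list `revision_hours` simply copies the ten values from the table, and the variable names are assumptions made for this example.

```python
import statistics

# Revision times (in hours) for participants 1 to 10, copied from the table above.
revision_hours = [6, 3, 7, 5, 6, 9, 5, 6, 6, 4]

# Mean: sum of all values divided by the number of values.
mean_hours = statistics.mean(revision_hours)

# Median: middle value of the ordered data; with an even number of
# values, statistics.median averages the two middle values.
median_hours = statistics.median(revision_hours)

# Mode: the most frequently occurring value.
mode_hours = statistics.mode(revision_hours)

print(f"M = {mean_hours:.1f}, median = {median_hours}, mode = {mode_hours}")
```

The results should match the worked example: a mean of 5.7 hours, and a median and mode of 6 hours.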
Descriptive Statistics Examples: Measures of Variability or Dispersion
Measures of variability describe how much the values within a dataset differ from one another. Whereas central tendency summarises the typical value, variability summarises the spread around it.
There are four types of variability measures:
- Range: the highest value minus the smallest value
- Interquartile range: the difference between the median of the upper half and the median of the lower half of an ordered dataset
- Standard deviation (SD): roughly, the typical distance of a data point from the mean; it is the square root of the variance
- Variance: the average squared distance of the data points from the mean
Let's return to the revision-time example above.
Participant Number (N) | Time Revised (in hours) |
1 | 6 |
2 | 3 |
3 | 7 |
4 | 5 |
5 | 6 |
6 | 9 |
7 | 5 |
8 | 6 |
9 | 6 |
10 | 4 |
The range would be the highest score, 9, minus the lowest score, 3. Therefore, the range in this example is 9 - 3 = 6.
The interquartile range is the difference between the medians of the second half and the first half of the ordered dataset. The first half of the ordered dataset would be 3, 4, 5, 5, and 6, while the second would be 6, 6, 6, 7, and 9. The median of the first half is 5, and the median of the second half is 6. Therefore the interquartile range is 6 - 5 = 1.
The standard deviation and the variance are slightly more complex to calculate; they summarise how far, on average, the data points lie from the mean.
A small variance or standard deviation suggests that the scores do not vary much from the mean. In contrast, a high variance or standard deviation indicates that the scores are widely spread around the mean.
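The four measures of variability can be sketched for the same ten revision times as follows. Two assumptions are worth flagging: the interquartile range uses the median-of-halves approach described above (other quartile conventions can give slightly different values), and the variance and standard deviation use the sample formulas, which divide by n - 1 rather than n. The variable names are illustrative only.

```python
import statistics

# Revision times (in hours) for participants 1 to 10, copied from the table above.
revision_hours = [6, 3, 7, 5, 6, 9, 5, 6, 6, 4]
ordered = sorted(revision_hours)

# Range: highest value minus lowest value.
value_range = max(ordered) - min(ordered)

# Interquartile range via the median-of-halves approach:
# split the ordered data into a lower and an upper half,
# then subtract the lower median from the upper median.
half = len(ordered) // 2
lower_half = ordered[:half]
upper_half = ordered[-half:]
iqr = statistics.median(upper_half) - statistics.median(lower_half)

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
sample_variance = statistics.variance(revision_hours)

# Sample standard deviation: square root of the sample variance.
sample_sd = statistics.stdev(revision_hours)

print(f"Range = {value_range}, IQR = {iqr}")
print(f"Variance = {sample_variance:.2f}, SD = {sample_sd:.2f}")
```

For this dataset the range is 6 and the interquartile range is 1, in line with the worked example. If the ten values were treated as a whole population rather than a sample, `statistics.pvariance` and `statistics.pstdev` would be the corresponding functions.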
When writing psychology reports, the mean and standard deviation are the most commonly reported descriptive statistics.
Descriptive Statistics Examples: Measures of Position
A measure of position identifies the position of a given value relative to the other values in the dataset. Quartiles and percentiles are used to measure position.
Quartiles, for example, divide the ordered data into four equal parts using three cut points: the 25th, the 50th and the 75th percentile. More generally, percentiles divide the data into 100 equal parts. When calculating either, values need to be put in ascending order. In this way, researchers can establish which scores are associated with the different percentiles.
Quartiles and percentiles are both types of quantile. Quantiles are found by placing values in ascending order and then separating the population or sample into intervals of equal size, so that the rank of specific data points can be identified.
These measures provide information about the distribution of the data, which is crucial for later statistical analyses. If the data are skewed, non-parametric tests may be more appropriate for statistical analysis.
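As a rough illustration of measures of position, the sketch below applies Python's `statistics.quantiles` to the revision-time data from the earlier example. The helper `percentile_rank` is hypothetical, written for this example only, and uses one simple definition (the percentage of scores strictly below a given score); other definitions exist. Note also that `statistics.quantiles` interpolates between values by default, so its quartiles may differ slightly from those obtained with the median-of-halves approach.

```python
import statistics

# Revision times (in hours), copied from the earlier example.
revision_hours = [6, 3, 7, 5, 6, 9, 5, 6, 6, 4]

# n=4 returns the three cut points that split the data into quartiles:
# the 25th, 50th and 75th percentiles.
quartiles = statistics.quantiles(revision_hours, n=4)
q1, q2, q3 = quartiles
print(f"Q1 = {q1}, Q2 (median) = {q2}, Q3 = {q3}")


def percentile_rank(data, score):
    """Hypothetical helper: percentage of values strictly below `score`."""
    below = sum(1 for value in data if value < score)
    return 100 * below / len(data)


# A participant who revised for 7 hours revised longer than 8 of the 10 participants.
print(f"Percentile rank of 7 hours: {percentile_rank(revision_hours, 7):.0f}")
```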
Descriptive and Inferential Statistics
As you learned, descriptive statistics offer information about a specific dataset. And while these are helpful, psychologists also need other statistical tests to draw conclusions. For this, psychologists use inferential statistics. These are based on probabilities and let researchers test hypotheses and draw conclusions about populations.
Let's consider a study of ice cream consumption across the year. Descriptive statistics may show that, in the sample, more ice cream was consumed in July than in January. Although it may be tempting to conclude that ice cream consumption in general is lower in January than in July, descriptive statistics alone cannot justify this conclusion.
In order to make such a statement, one would need to test whether there is a significant difference between the mean ice cream consumption in the two months. This can only be achieved through inferential statistics.
Descriptive Statistics - Key takeaways
- Descriptive statistics is a form of statistical analysis utilised to summarise a dataset.
- There are four main types of descriptive statistics: measures of frequency, central tendency, variability or dispersion, and measures of position.
- The most commonly reported descriptive statistics are the mean and standard deviation.
- Descriptive and inferential statistics have different uses: the former summarise data, and the latter are used to make inferences about populations.