- We will start by looking at what is meant by psychology statistical tests.
- Then, we will look at the importance of statistical tests.
- After, we will delve into parametric statistical tests and the types of parametric tests.
- Following this, we will explore non-parametric tests.
Psychology Statistical Test
Some examples of statistical tests include the chi-square test, Pearson’s correlation and the sign test.
Statistical tests in psychology analyse data from experiments, allowing researchers to identify whether the observed scores differ significantly (not due to chance) from the hypothesised results.
There are parametric tests (for when data is normally distributed) and non-parametric tests (for when data is not normally distributed).
Statistical Test Importance
Hypothesis testing is when statistical tests are used in experimental research to identify whether the alternative or null hypothesis should be accepted.
If the findings are significant, the alternative hypothesis should be accepted, and the null hypothesis should be rejected.
Statistical tests allow researchers to identify whether the results are due to chance or are a result of the variables studied, and enable the researcher to compare their results to previous findings.
Parametric Statistical Tests
Parametric tests are a type of statistical test used to test hypotheses. Several criteria for the data need to be met to use parametric tests. The criteria are:
Data must be normally distributed.
Homogeneity of variance – the amount of ‘noise’ (potential experimental errors) should be similar in each variable and between groups.
There should be no extreme outliers.
Independence – the data from each participant in each variable should not be correlated. Measurements from a participant should not be influenced by or associated with those of other participants.
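As a rough illustration of the homogeneity of variance criterion, the sketch below compares the spread of two sets of scores. It reuses the anxiety scores from the sign test example later in this article. The function name `variance_ratio` is our own, and the article gives no cut-off for how similar the variances must be, so the code only reports the ratio and leaves the judgement to the researcher.

```python
from statistics import variance

# Anxiety scores from this article's sign test example (before and after CBT).
before = [25, 36, 20, 40, 17, 20, 26, 27, 25, 28]
after = [22, 21, 24, 30, 19, 20, 23, 34, 25, 28]


def variance_ratio(*groups):
    """Ratio of the largest to the smallest sample variance.

    A ratio close to 1 means the groups have a similar spread.
    (Hypothetical helper; assumes no group has a variance of 0.)
    """
    variances = [variance(group) for group in groups]
    return max(variances) / min(variances)


print(round(variance_ratio(before, after), 2))
```

A formal check would normally use a dedicated test (such as Levene's test) in a statistics package; this sketch only shows the idea of comparing variances.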
Types of Statistical Tests
Statistical tests are either parametric or non-parametric.
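Some examples of parametric tests are as follows:
Pearson’s correlation, the paired and unpaired t-tests and ANOVA.
And some examples of non-parametric tests are as follows:
Friedman’s test, Spearman’s rank correlation, the Wilcoxon signed-rank test and the Mann-Whitney U test.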
Psychology Statistical Tests: Types of Parametric Tests
There are several types of parametric tests, and the one that is used depends on what the researcher is trying to investigate:
Parametric test | What it measures | Example research scenario |
Correlation | The relationship (strength and direction) between two variables | The relationship between fitness test scores and the number of hours spent exercising |
Paired t-test | Compares the mean value of two variables obtained from the same participants | The difference in depression scores in a group of patients before and after treatment |
Unpaired t-test | Compares the mean value of a variable measured from two independent (different) groups | The difference between depression symptom severity in a placebo and drug therapy group |
One-way Analysis of Variance (ANOVA) | Compares the mean of three or more independent groups (uses a between-subject design, and the independent variable needs to have three or more levels) | The difference in average fitness test scores of individuals who exercise frequently, moderately, or not at all |
One-way Repeated Measures ANOVA | Compares the mean of three or more conditions when the participants are the same in each condition (uses a within-subject design, and the independent variable needs to have three or more levels) | The difference in average fitness test scores during the morning, afternoon and evening |
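To make the paired t-test row more concrete, here is a minimal sketch of how its test statistic could be calculated. It assumes the standard formula (the mean of the differences divided by their standard error) and, purely for illustration, reuses the before and after anxiety scores from the sign test example later in this article. The function name `paired_t` is our own.

```python
from math import sqrt
from statistics import mean, stdev

# Anxiety scores from this article's sign test example (same participants twice).
before = [25, 36, 20, 40, 17, 20, 26, 27, 25, 28]
after = [22, 21, 24, 30, 19, 20, 23, 34, 25, 28]


def paired_t(first, second):
    """t statistic for a paired t-test (hypothetical helper).

    t = mean of the differences / standard error of the differences.
    """
    diffs = [b - a for a, b in zip(first, second)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error


t = paired_t(before, after)
degrees_of_freedom = len(before) - 1
print(f"t({degrees_of_freedom}) = {t:.2f}")
```

The t value would then be compared with a critical value for the given degrees of freedom, in the same way the S-value is compared with a critical value in the sign test.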
Non-Parametric Tests
Non-parametric tests can be used when data is not normally distributed. There are several non-parametric tests. One we will be looking at here is the sign test.
Statistical Tests: Sign Test
The sign test is used for within-group studies (only one group of participants). However, two groups of participants could be used under a ‘matched-pairs’ design. The sign test assesses the difference between two conditions and is used on categorical data.
Statistical Tests: Sign Test Calculation
Let us look at how to calculate a sign test step-by-step with an example.
Here are the study results, showing each participant’s anxiety score before and after cognitive behavioural therapy (CBT):
Participant | Anxiety score before CBT | Anxiety score after CBT |
1 | 25 | 22 |
2 | 36 | 21 |
3 | 20 | 24 |
4 | 40 | 30 |
5 | 17 | 19 |
6 | 20 | 20 |
7 | 26 | 23 |
8 | 27 | 34 |
9 | 25 | 25 |
10 | 28 | 28 |
1. Work out the difference between the two sets of data. Here, the score before CBT is subtracted from the score after CBT. It doesn’t matter which column is subtracted from which, as long as it is done the same way for every participant; the S-value will be the same.
Participant | Anxiety score before CBT | Anxiety score after CBT | Difference |
1 | 25 | 22 | -3 |
2 | 36 | 21 | -15 |
3 | 20 | 24 | +4 |
4 | 40 | 30 | -10 |
5 | 17 | 19 | +2 |
6 | 20 | 20 | 0 |
7 | 26 | 23 | -3 |
8 | 27 | 34 | +7 |
9 | 25 | 25 | 0 |
10 | 28 | 28 | 0 |
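The difference column can be reproduced with a few lines of code. This is only a sketch of step 1; the variable and function names are our own.

```python
# Anxiety scores from the table above, in participant order (1 to 10).
before = [25, 36, 20, 40, 17, 20, 26, 27, 25, 28]
after = [22, 21, 24, 30, 19, 20, 23, 34, 25, 28]

# Difference = score after CBT minus score before CBT.
differences = [a - b for b, a in zip(before, after)]


def sign(difference):
    """Return '+', '-' or '0' depending on the direction of the difference."""
    if difference > 0:
        return "+"
    if difference < 0:
        return "-"
    return "0"


signs = [sign(d) for d in differences]
print(differences)
print(signs)
```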
2. Count the total number of ‘+’ and ‘-’ signs. Ignore the data where there is no difference (i.e., a difference of 0).
For our data, we have the following:
+ = 3
- = 4
3. The less frequent sign is the ‘S-value’.
Here the S-value = 3 (the + was the less frequent sign, and the + had a total of 3)
Then, find the N value (the number of participants, not including those with a difference of 0).
Here the N value is 10 - 3 = 7 (we had 10 participants minus the 3 that had a difference of 0)
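Steps 2 and 3 can be sketched as a short function. The name `sign_test_statistic` is our own; the differences are the ones from the table in step 1.

```python
# Differences from step 1 (score after CBT minus score before CBT).
differences = [-3, -15, 4, -10, 2, 0, -3, 7, 0, 0]


def sign_test_statistic(differences):
    """Return (S, N) for a sign test.

    S is the count of the less frequent sign.
    N is the number of participants whose difference is not 0.
    """
    plus = sum(1 for d in differences if d > 0)
    minus = sum(1 for d in differences if d < 0)
    return min(plus, minus), plus + minus


s_value, n_value = sign_test_statistic(differences)
print(s_value, n_value)

# Subtracting the columns the other way round flips every sign,
# but the less frequent sign still has the same count.
reversed_result = sign_test_statistic([-d for d in differences])
```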
4. Compare the calculated S-value to the critical value to determine if it is significant. A critical values table will always be given to you in an exam.
Level of significance for a one-tailed test | .05 | .025 | .01 | .005 |
Level of significance for a two-tailed test | .10 | .05 | .02 | .01 |
N | | | | |
5 | 0 | | | |
6 | 0 | 0 | | |
7 | 0 | 0 | 0 | |
8 | 1 | 0 | 0 | 0 |
9 | 1 | 1 | 0 | 0 |
10 | 1 | 1 | 0 | 0 |
11 | 2 | 1 | 1 | 0 |
12 | 2 | 2 | 1 | 1 |
13 | 3 | 2 | 1 | 1 |
14 | 3 | 2 | 2 | 1 |
15 | 3 | 3 | 2 | 2 |
16 | 4 | 3 | 2 | 2 |
17 | 4 | 4 | 3 | 2 |
18 | 5 | 4 | 3 | 3 |
19 | 5 | 4 | 4 | 3 |
20 | 5 | 5 | 4 | 3 |
25 | 7 | 7 | 6 | 5 |
30 | 10 | 9 | 8 | 7 |
35 | 12 | 11 | 10 | 9 |
To find the critical value in the table, three things are needed:
- Is the test one-tailed or two-tailed? In our example, the test is two-tailed, as we wanted to see if there was a difference either way (positive or negative).
- What is the significance level? Unless otherwise specified, the significance level is .05.
- How many participants are there? This is the N value. For our example, it is 7.
From the table, our critical value is 0.
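The article does not say how the critical values table was produced. The sketch below assumes the usual reasoning behind the sign test: if the null hypothesis is true, each non-zero difference is equally likely to be ‘+’ or ‘-’, so the count of the less frequent sign follows a binomial distribution. Under that assumption, the critical value is the largest S whose one-tailed probability does not exceed the significance level. The function name is our own.

```python
from math import comb


def sign_test_critical_value(n, one_tailed_alpha):
    """Largest S with P(X <= S) <= one_tailed_alpha, where X ~ Binomial(n, 0.5).

    Returns None when even S = 0 is too likely to be significant,
    which corresponds to a blank cell in the table.
    """
    cumulative = 0
    critical = None
    for s in range(n + 1):
        cumulative += comb(n, s)  # number of sign patterns with s of the rarer sign
        if cumulative / 2 ** n > one_tailed_alpha:
            break
        critical = s
    return critical


# A two-tailed test at .05 uses the same column as a one-tailed test at .025.
print(sign_test_critical_value(7, 0.025))
```

For N = 7 this gives the same critical value as the table, and the same holds for the other rows we spot-checked (for example N = 8, 13, 20 and 25).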
The calculated value (S-value) must be equal to or less than the critical value to be significant.
Our results are not significant because the calculated value (3) is greater than the critical value (0).
You could write your answer up as:
The calculated value (S=3) is greater than the critical value of 0. Therefore, the difference in anxiety scores before and after cognitive behavioural therapy is not significant.
S = 3 > 0 (critical value)
What makes a test statistically significant?
Significance tests provide researchers with a probability value (p-value) that indicates how likely it is that the results of the research are due to chance. If the p-value is lower than the significance level (usually .05), the results are statistically significant.
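For the sign test, this probability can be worked out directly instead of using a critical values table. The sketch below assumes the standard binomial model, in which each non-zero difference is equally likely to be ‘+’ or ‘-’ if the null hypothesis is true. The function name `sign_test_p_value` is our own.

```python
from math import comb


def sign_test_p_value(s, n, two_tailed=True):
    """Probability of an S-value this small or smaller arising by chance.

    s: count of the less frequent sign; n: participants with a non-zero difference.
    """
    one_tailed = sum(comb(n, k) for k in range(s + 1)) / 2 ** n
    if two_tailed:
        # Double the one-tailed probability, capped at 1.
        return min(1.0, 2 * one_tailed)
    return one_tailed


# The CBT example from this article: S = 3, N = 7, two-tailed.
p = sign_test_p_value(3, 7)
print(p, "significant" if p < 0.05 else "not significant")
```

For the CBT example, the p-value is far above .05, which agrees with the conclusion reached from the critical values table.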
Errors in Hypothesis Testing
Researchers can sometimes make a Type 1 or Type 2 error when conducting research. When either error occurs, the conclusions of the research lack validity.
Type 1 error: rejecting the null hypothesis when it is true (false positive), which happens when the researcher concludes that their data is significant when it is not.
Type 2 error: accepting the null hypothesis when it is false (false negative), which happens when the researcher concludes that their data is not significant when it is.
Statistical Tests - Key takeaways
- Statistical tests are tests that are used to analyse data from experiments.
- There are two types of tests: parametric and non-parametric tests. Parametric tests are used on normally distributed data, and non-parametric tests are used on data that is not normally distributed.
- The sign test is non-parametric.
- The sign test is used for within-group studies (only one group of participants). However, two groups of participants could be used under a ‘matched-pairs’ design. The sign test assesses the difference between two conditions and is used on categorical data.