Paired t-test assumptions
It is important to know when you need a paired test rather than a more standard test. If you are checking a person both before and after a treatment, or you are using one twin as the control and the other as the test subject, then you would use a paired \(t\)-test.
In a paired experiment you are interested in the difference between results, rather than the results themselves.
Suppose your school gives you a pre-test, then teaches you the information, and then gives you the actual exam. The school is trying to see if the teaching is actually effective. In other words, the students are the test subjects, the treatment is the teaching, and the school is interested in the difference between the pre-test and actual exam results.
If there is no difference between the pre-test and actual exam results, the school will know they need to change how they are teaching the information.
The main assumption for using a paired \(t\)-test, other than having paired data, is that the differences between the paired measurements are normally distributed.
Definition of paired t-test
A paired \(t\)-test, also known as a paired sample \(t\)-test, is used to test whether the mean difference between pairs of measurements is zero.
Matched subjects, also called paired samples or matched pairs, are two measurements that are not independent of each other.
In the example above, the school would look at the pre-test score for a particular student and compare it to that student's actual exam score. Those two scores are not independent because it is the same student taking both the pre-test and the actual exam. The two scores are matched pairs.
If you had independent samples, then you would use a different hypothesis test. See the article Hypothesis Test for Two Normal Distributions in the case of independent samples.
Even though the matched pairs are not independent, the differences in the measurements must be independent. What does this mean?
In the example about the exams, you would need to assume that students are not cheating off of each other. If student A were cheating off of the exam papers from student B, then the differences in the pre-test score and the exam score for students A and B would not be independent. In that case, you could not use a paired \(t\)-test.
Since one of the assumptions for using a paired \(t\)-test is that the differences are normally distributed, you can treat the differences as if they were a random sample from a \(\text{N}(\mu_D,\sigma^2 )\) distribution, and then do the hypothesis test as if you had a single sample. For more information on doing this kind of hypothesis test, see the article Hypothesis Test for the Difference Between Two Means.
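A minimal Python sketch of this idea, using only the standard library: the pairs are reduced to differences, and the usual one-sample \(t\) statistic (mean difference minus the hypothesised mean, divided by the standard error) is computed from them. The function name `paired_t_statistic`, its argument names, and the scores are all made up for illustration.

```python
import math
import statistics


def paired_t_statistic(before, after, mu_d=0.0):
    """Return (t, degrees_of_freedom) for paired measurements.

    The pairs are reduced to differences (after - before), which are
    then treated as a single sample. mu_d is the hypothesised mean
    difference under the null hypothesis.
    """
    if len(before) != len(after):
        raise ValueError("paired data need the same number of measurements")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    d_bar = statistics.mean(diffs)
    s = statistics.stdev(diffs)  # sample standard deviation (n - 1 divisor)
    t = (d_bar - mu_d) / (s / math.sqrt(n))
    return t, n - 1


# Made-up pre-test and exam scores for five students (illustration only).
pre_test = [50, 60, 55, 70, 65]
exam = [52, 64, 56, 73, 70]

t_stat, dof = paired_t_statistic(pre_test, exam)
print(f"t = {t_stat:.3f}, degrees of freedom = {dof}")
# prints: t = 4.243, degrees of freedom = 4
```

The resulting statistic would then be compared with a critical value of the \(t\)-distribution with the returned degrees of freedom, exactly as in a single-sample test.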
In general, when you do a paired \(t\)-test you will not know the population variance, and the number of matched pairs will be relatively small.
Paired vs. unpaired t-tests
It is very important to understand when you use a standard \(t\)-test versus a paired \(t\)-test. Recall that an unpaired t-test is used to compare the averages of two independent samples to determine if there is a significant difference between the two.
The key difference between paired and unpaired \(t\)-tests is that a paired \(t\)-test tests the mean of the differences within matched pairs, while an unpaired \(t\)-test tests the difference between the means of two independent samples.
Say you wish to know whether changing the layout of a clothing store means that more people are likely to buy from that store. You wish to compare the sales before and after changing the layout. The two sets of data are not independent (you are matching before and after sales), so a paired \(t\)-test would be used.
On the other hand, if you want to see if two different stores that have similar layouts have a similar number of people shopping in them, you would use an unpaired \(t\)-test because the samples are independent.
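To see why the choice matters, here is a small Python sketch that computes both statistics on the same made-up numbers. The data, the function names `paired_t` and `unpaired_t`, and the use of Welch's form of the unpaired statistic (which does not assume equal variances) are assumptions made for illustration; they are not taken from the examples above.

```python
import math
import statistics


def paired_t(x, y):
    """t statistic for the mean of the within-pair differences y - x."""
    diffs = [b - a for a, b in zip(x, y)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


def unpaired_t(x, y):
    """Welch's t statistic for the difference between two sample means."""
    se = math.sqrt(statistics.variance(x) / len(x)
                   + statistics.variance(y) / len(y))
    return (statistics.mean(y) - statistics.mean(x)) / se


# Made-up sales figures for five matched measurements (illustration only).
before = [120, 150, 90, 200, 170]
after = [126, 155, 97, 204, 178]

print(f"paired t   = {paired_t(before, after):.3f}")    # paired t   = 8.485
print(f"unpaired t = {unpaired_t(before, after):.3f}")  # unpaired t = 0.224
```

In this illustration every pair increases by a similar small amount, so the differences vary little and the paired statistic is large. Treating the same numbers as two independent samples buries that consistent increase under the large spread between measurements, which is why using the wrong test can hide a real effect.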
What about the degrees of freedom for the test?
Paired t-tests: degrees of freedom
A paired \(t\)-test has the same degrees of freedom as a one-sample \(t\)-test. The degrees of freedom are equal to the sample size minus \(1\): \(\upsilon =n-1\).
So what is \(n\)? In a paired \(t\)-test, the two samples taken share the same sample size, so \(n\) is just the number of matched pairs.
Paired t-test formula
Of course, it helps to have a more formal definition of the formula for a paired \(t\)-test.
In a paired experiment where \(n\) is small and \(\sigma ^2\) is unknown, if the differences between the paired measurements, \(D\), are distributed as \(\text{N}(\mu _D, \sigma ^2)\), then
\[t=\dfrac{\bar{D}-\mu _D}{\dfrac{S}{\sqrt{n}}}\sim t_{n-1}\]
where \(\bar{D}\) is the mean of the differences between the paired measurements, \(S\) is the sample standard deviation of those differences, and \(n\) is the number of pairs.
The key thing here is that you will need to take the average of the differences rather than the average of the actual samples.
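As a worked illustration with made-up numbers (not taken from any real study), suppose six matched pairs give the differences \(1, 3, 2, 4, 0, 2\) and the null hypothesis is \(\mu_D = 0\). Then \(\bar{D} = 2\), and the sample variance of the differences is
\[S^2=\dfrac{(-1)^2+1^2+0^2+2^2+(-2)^2+0^2}{6-1}=\dfrac{10}{5}=2,\]
so
\[t=\dfrac{2-0}{\dfrac{\sqrt{2}}{\sqrt{6}}}=2\sqrt{3}\approx 3.46\]
with \(n-1=5\) degrees of freedom. The two-tailed \(5\%\) critical value of \(t_5\) is about \(2.571\), so in this illustration you would reject the null hypothesis that the mean difference is zero.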
Paired sample t-test examples
Let's look at a couple of examples.
Suppose you are trying to see if a medicated skin lotion works better than a non-medicated one. So you collect a group of \(20\) people with dry skin on their feet. For one week they rub medicated skin lotion on their left foot, and non-medicated skin lotion on their right foot. At the end of the week, you check the dryness level of each foot. Is this a situation in which you would use a paired \(t\)-test?
Solution
Notice that the sample size is relatively small, and you do not know the variance of the populations. So a \(t\)-test is indicated. The question is whether you would use a paired \(t\)-test or not.
You are checking the dryness level of the left and right foot on the same person, and looking at the difference. Since both measurements come from the same person, this is matched pair data. The data you collect from one person is independent of the data you collect from a different person, so the differences are independent. Therefore you can use a paired \(t\)-test as long as you assume the differences in the data are normally distributed.
What if the situation is changed a bit?
Suppose you are trying to see if a medicated skin lotion works better than a non-medicated one. So you collect a group of \(20\) people with dry skin on their feet. For one week, half of them rub medicated skin lotion on their feet, and the other half of the group rub non-medicated skin lotion on their feet. At the end of the week, you check the dryness level of people's feet. Is this a situation in which you would use a paired \(t\)-test?
Solution
Notice that the main difference between this and the previous example is that there is no pairing going on! You really have two separate groups of subjects getting different treatments, and there is no way to pair the data in a meaningful way. So while the small sample size would indicate a \(t\)-test would be used, it would not be a paired \(t\)-test.
Paired T-Test - Key takeaways
- To do a paired \(t\)-test, you need matched pair data, the differences between the measurements must be independent, and the differences must be approximately normally distributed.
- The degrees of freedom for a paired \(t\)-test are \(\upsilon =n-1\).
- In a paired experiment where \(n\) is small and \(\sigma ^2\) is unknown, if the differences between the paired measurements, \(D\), are distributed as \(\text{N}(\mu _D, \sigma ^2)\), then \[t=\dfrac{\bar{D}-\mu _D}{\dfrac{S}{\sqrt{n}}} \sim t_{n-1}\] where \(\bar{D}\) is the mean of the differences, \(S\) is the sample standard deviation of the differences, and \(n\) is the number of pairs.