Fig. 1 - There is quite a bit of variation in the size of dogs!
Definition of the t-distribution
You might be familiar with the normal distribution as a bell-shaped curve, but it is not the only bell-shaped distribution out there!
There are many others that share this shape, one of which is the \(t\)-distribution. While these two distributions are very similar, they are used in different situations.
You would use a normal distribution if you were making a confidence interval or hypothesis test where:
the populations are normally distributed and have equal variance;
the population variance is known; or
the sample size is large.
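On the other hand, you would use a \(t\)-distribution if you were making a confidence interval or hypothesis test where:
the population is normally distributed;
the population variance is unknown; and
the sample size is small.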
Remember that if you know the population variance, or have a sufficiently large sample, for a normally distributed random variable, \(X\), where
\[\bar{X} \sim \text{N}\left(\mu, \dfrac{\sigma ^2}{n}\right)\]
you can construct a confidence interval or a hypothesis test.
In reality, you are not likely to know the actual population variance just as you don't generally know the population mean, which is often what you are testing for.
When the sample size \(n\) is large enough, you can use the sample variance \(S^2\) in place of the population variance \(\sigma^2\). In this instance, the Central Limit Theorem gives you that
\[\dfrac{\bar{X}-\mu}{\dfrac{S}{\sqrt{n}}}\]
is approximately normal, and
\[\frac{\bar{X}-\mu}{\dfrac{S}{\sqrt{n}}} \approx \text{N}(0,1^2).\]
When \(n\) is small and the population is normally distributed, you use the \(t\)-distribution rather than the normal distribution. The value of \(t\) is given by
\[t=\frac{\bar{X}-\mu}{\dfrac{S}{\sqrt{n}}}.\]
Below you can see the graph of the standard normal distribution as compared to the \(t\)-distribution for various values of \(n\).
Fig. 2 - Standard normal distribution as compared to the \(t\)-distribution for various values of \(n\).
As you can see in the graph above, as \(n\) increases the \(t\)-distribution gets closer to the standard normal distribution. This is one of the reasons statisticians will say that a sample size of \(20\) is often sufficiently large to switch from using a \(t\)-distribution to a normal distribution.
Since the sample size has an important part to play in \(t\)-distributions, it is given a special name, as you will see in the next section.
Degrees of freedom in the t-distribution
Just like with the chi-squared distribution and \(F\)-distribution, the sample size \(n\) determines the number of degrees of freedom. The sample size tells you two things about the degrees of freedom of the \(t\)-distribution:
The number of degrees of freedom, \(\upsilon\), is determined by the sample size minus \(1\): \(\upsilon = n-1\).
As \(\upsilon \to \infty\), the \(t\)-distribution approaches \(\text{N}(0,1^2)\).
Indeed, the normal and \(t\)-distributions are pretty similar. Both are symmetrical and exhibit a bell-curve shape, and they have the same end behaviour.
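To see the convergence numerically, here is a minimal sketch that compares the two density curves. It assumes the standard density formula for the \(t\)-distribution, which is not quoted in this article, and the function names `t_pdf` and `normal_pdf` are made up for the illustration.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of the t-distribution with df degrees of freedom (standard formula)."""
    log_coeff = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_coeff - (df + 1) / 2 * math.log1p(x * x / df))

def normal_pdf(x: float) -> float:
    """Density of the standard normal distribution N(0, 1)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Largest vertical gap between the two curves on a grid over [-4, 4].
grid = [i / 10 for i in range(-40, 41)]
max_gaps = {}
for df in (1, 5, 30, 1000):
    max_gaps[df] = max(abs(t_pdf(x, df) - normal_pdf(x)) for x in grid)
    print(f"df = {df:>4}: largest gap = {max_gaps[df]:.5f}")
```

The printed gap should shrink as the number of degrees of freedom grows, which is the behaviour described above.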
To indicate that a \(t\)-distribution has a specific number of degrees of freedom, \(\upsilon\), you can write \(t_\upsilon\)-distribution.
The t-distribution formula
The following is the formula you'll need for the \(t\)-distribution.
If a random sample \(X_1,X_2,X_3, \dots,X_n\) is selected from a normal distribution with an unknown variance \(\sigma ^2\), then
\[t=\dfrac{\bar{X}-\mu}{\dfrac{S}{\sqrt{n}}}\]
where \(t\) has a \(t_{n-1}\)-distribution and \(S^2\) is an unbiased estimator of \(\sigma^2\).
For a reminder of what it means to be unbiased, see the article Estimator Bias.
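As a rough illustration of the formula, the sketch below computes \(t\) for a small sample. The data values and the hypothesised mean `mu_0` are invented for the example and do not come from this article.

```python
import math
import statistics

# Hypothetical sample and hypothesised population mean (illustrative only).
sample = [4.8, 5.1, 5.4, 4.9, 5.3]
mu_0 = 5.0

n = len(sample)
x_bar = statistics.mean(sample)

# statistics.stdev divides by n - 1, so its square is the unbiased estimate S^2.
s = statistics.stdev(sample)

standard_error = s / math.sqrt(n)
t_stat = (x_bar - mu_0) / standard_error
degrees_of_freedom = n - 1

print(f"x_bar = {x_bar:.3f}, s = {s:.4f}, t = {t_stat:.4f}, df = {degrees_of_freedom}")
```

With five observations, the resulting value would be compared against a \(t_4\)-distribution.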
Just like with the standard normal distribution, there are tables of values you can use with the \(t\)-distribution.
Tables for the t-distribution
The table below is a section of a \(t\)-distribution probability table.
Table 1. \(t\)-distribution probability table
\(\upsilon\) | \(0.100\) | \(0.050\) | \(0.025\) |
\(1\) | \(3.0777\) | \(6.3138\) | \(12.7062\) |
\(2\) | \(1.8856\) | \(2.9200\) | \(4.3027\) |
\(3\) | \(1.6377\) | \(2.3534\) | \(3.1824\) |
Each entry in the table is the value that is exceeded with the probability shown at the top of its column, for the number of degrees of freedom shown at the start of its row.
For example, suppose that \(X\) has a \(t\)-distribution with \(3\) degrees of freedom. The number \(3.1824\) in the lower right corner of the table above means that:
\[P(X>3.1824)=0.025.\]
Since the \(t\)-distribution is symmetric for any number of degrees of freedom, you also know that
\[P(X<-3.1824)=0.025.\]
The area \(P(X>3.1824)=0.025\) for a \(t\)-distribution curve with \(3\) degrees of freedom is shaded in the graph below. Remember that when \(\upsilon = 3\) the sample size is \(n=4\).
Fig. 3 - \(t_3\)-distribution with the shaded area equaling \(0.025\).
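If you want to check a table entry without a statistics package, one option is to integrate the density numerically. The sketch below does this with Simpson's rule. The density formula is the standard one rather than something quoted in this article, and the helper names `t_pdf` and `upper_tail` are hypothetical.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of the t-distribution with df degrees of freedom (standard formula)."""
    log_coeff = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_coeff - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail(x: float, df: int, steps: int = 2000) -> float:
    """Approximate P(X > x) for x >= 0 using Simpson's rule on [0, x].

    By symmetry P(X > 0) = 0.5, so the tail is 0.5 minus the area from 0 to x.
    `steps` must be even for Simpson's rule.
    """
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# (degrees of freedom, table entry, column heading) taken from Table 1.
for df, value, column in [(1, 12.7062, 0.025), (2, 2.9200, 0.050), (3, 3.1824, 0.025)]:
    print(f"df = {df}: P(X > {value}) = {upper_tail(value, df):.4f} (table column {column})")
```

Each computed tail area should agree with its column heading to about four decimal places.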
Let's take a look at an example.
Suppose \(X\) is a random variable that has a \(t\)-distribution with \(\upsilon\) degrees of freedom. Find the value of \(s\) such that \(P(|X|<s)=0.80\) when \(\upsilon = 3\).
Solution
Notice that \(P(|X|<s)=0.80\) is the same as \(P(|X|>s)=0.20\), because the two events are complements. Since the \(t\)-distribution is symmetric, this probability is split equally between the two tails, so \(P(X<-s)=0.1\) and \(P(X>s)=0.1\). It can often help to draw a picture of what you are looking for.
Fig. 4 - The total shaded area is \(0.2\).
You can use the \(t\)-distribution table or a calculator to find that the value of \(s\) that gives you \(P(X>s)=0.1\) is \(s=1.6377 \).
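The same value can be recovered numerically. The sketch below uses bisection together with a closed-form cumulative distribution function that holds only for \(3\) degrees of freedom. That closed form is a standard result and is not taken from this article, and the function names are made up for the illustration.

```python
import math

def t3_cdf(x: float) -> float:
    """P(X <= x) for a t-distribution with exactly 3 degrees of freedom."""
    u = x / math.sqrt(3)
    return 0.5 + (math.atan(u) + u / (1 + u * u)) / math.pi

def solve_two_sided(central_prob: float, lo: float = 0.0, hi: float = 100.0,
                    tol: float = 1e-10) -> float:
    """Find s with P(|X| < s) = central_prob by bisection on the CDF."""
    # Each tail holds (1 - central_prob) / 2, so P(X < s) = 0.5 + central_prob / 2.
    target = 0.5 + central_prob / 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if t3_cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

s = solve_two_sided(0.80)
print(f"s = {s:.4f}")
```

Bisection works here because the cumulative distribution function is increasing, so there is exactly one value of \(s\) that leaves \(0.1\) in each tail.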
Critical values for the t-distribution
Critical values are used when constructing confidence intervals. Confidence intervals depend on the confidence level you are using. Remember that the confidence limits for a \(100(1-\alpha)\%\) confidence interval always have the form
sample statistic \(\pm\) (\(t\)-critical value)(standard error).
In the case of the \(t\)-distributions, the standard error is given by
\[ \text{standard error} = \frac{s}{\sqrt{n}},\]
and the \(t\)-critical value is
\[ \text{critical value} =t^*= t_{n-1}\left(\frac{\alpha}{2}\right) .\]
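To make the pieces concrete, here is a small sketch that builds a \(95\%\) confidence interval for a mean. The three data values are invented for the example, and the critical value \(4.3027\) is the Table 1 entry for \(2\) degrees of freedom in the \(0.025\) column.

```python
import math
import statistics

# Hypothetical sample with n = 3, so there are n - 1 = 2 degrees of freedom.
sample = [9.6, 10.4, 10.0]

# t_2(0.025) read from Table 1, i.e. alpha = 0.05 for a 95% confidence level.
t_star = 4.3027

n = len(sample)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)          # sample standard deviation (n - 1 divisor)
standard_error = s / math.sqrt(n)

margin = t_star * standard_error
lower, upper = x_bar - margin, x_bar + margin

print(f"mean = {x_bar:.3f}, standard error = {standard_error:.4f}")
print(f"95% confidence interval: ({lower:.4f}, {upper:.4f})")
```

A different confidence level only changes `t_star`; the sample mean and the standard error stay the same.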
Suppose you have a \(t_2\)-distribution. Find the critical values for the \(90\%\), \(95\%\), and \(99\%\) confidence levels.
Solution:
For the \(90\%\) confidence level, the first goal is to find \(\alpha\). Here
\[ 90\% = 100\%(1-\alpha) \]
so
\[ 0.9 = 1 - \alpha\]
and
\[ \alpha = 0.1.\]
Then for the \(t\)-critical value,
\[\begin{align} t^*& = t_{n-1}\left(\frac{\alpha}{2}\right) \\ & = t_2\left(\frac{0.10}{2}\right) \\ &= t_2(0.05) \\&= 2.92 . \end{align}\]
Similarly, for the \(95\%\) confidence level the \(t\)-critical value is
\[\begin{align} t^*& = t_{n-1}\left(\frac{\alpha}{2}\right) \\ & = t_2\left(\frac{0.05}{2}\right) \\ &= t_2(0.025) \\&= 4.3027, \end{align} \]
and for the \(99\%\) confidence level the \(t\)-critical value is
\[\begin{align} t^*& = t_{n-1}\left(\frac{\alpha}{2}\right) \\ & = t_2\left(\frac{0.01}{2}\right) \\ &= t_2(0.005) \\&= 9.925 . \end{align}\]
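These three critical values can be double-checked with a short sketch. It relies on a closed-form quantile that holds only for \(2\) degrees of freedom. That formula is a standard result and is not given in this article, and the function name `t2_upper_critical` is hypothetical.

```python
import math

def t2_upper_critical(tail_prob: float) -> float:
    """Value exceeded with probability tail_prob by a t-variable with 2 degrees of freedom.

    Uses the closed-form quantile for 2 degrees of freedom:
    q(p) = (2p - 1) / sqrt(2 p (1 - p)), where p = 1 - tail_prob.
    """
    p = 1 - tail_prob
    return (2 * p - 1) / math.sqrt(2 * p * (1 - p))

critical = {}
for level in (0.90, 0.95, 0.99):
    alpha = 1 - level
    critical[level] = t2_upper_critical(alpha / 2)
    print(f"{level:.0%} confidence level: alpha = {alpha:.2f}, t* = {critical[level]:.4f}")
```

For other numbers of degrees of freedom there is generally no such simple formula, which is why tables or a calculator are used.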
Notice that as the confidence level increases the \(t\)-critical value does as well, meaning that your confidence interval gets larger. That makes sense for two main reasons:
the more confident you want to be that the confidence interval has captured the population parameter, the wider the interval has to be; and
the \(t\)-critical value is related to the area under the \(t\)-distribution curve.
For example, at the \(80\%\) confidence level you are actually asking for \(80\%\) of the area under the curve to be captured in the shaded area. The higher your confidence level, the larger the shaded area!
Fig. 5 - \(t\)-distribution showing how confidence level relates to the area under the curve.
This is one of the reasons it can be helpful to draw a picture of what you are trying to find before you reach for a calculator or \(t\)-distribution table!
T-Distribution - Key takeaways
- If the random sample \(X_1,X_2,X_3, \dots,X_n\) is selected from a normal distribution with an unknown variance, \(\sigma ^2\), then you have \[t=\dfrac{\bar{X}-\mu}{\dfrac{S}{\sqrt{n}}}\] where \(t\) has a \(t_{n-1}\)-distribution and \(S^2\) is an unbiased estimator for \(\sigma ^2\).
- The number of degrees of freedom is determined by the sample size minus \(1\): \(\upsilon = n-1\).
- As \(\upsilon \to \infty\), the \(t\) distribution approaches \(\text{N}(0,1^2)\).
- The critical value, \(t^*\), for the \(100(1-\alpha)\%\) confidence level can be found with the formula \[ t^*= t_{n-1}\left(\frac{\alpha}{2}\right). \]