Confidence Interval for Slope of Regression Line


With what confidence would you say that the hours of sleep you get at night and your success in school are related, and that this relationship is linear?

StudySmarter Editorial Team

  • Last Updated: 06.01.2023
  • Content creation process designed by Lily Hulatt
  • Content cross-checked by Gabriel Freitas
  • Content quality checked by Gabriel Freitas

    In this article, you will learn about a confidence interval for the slope of a regression model: its meaning, the conditions needed to construct one, the formula, and how to calculate it. For information on drawing conclusions about a population from the confidence interval, see the article Justifying Claims Based on the Confidence Interval for the Slope of a Regression Model.

    Meaning of Confidence Interval for Slope of Regression Line

    By now you know that when there is a linear relationship between a variable \(x\) and a variable \(y\) – the linear correlation coefficient \(r\) is non-zero – you can model it with a linear regression. This regression consists of:

    \[\hat{y}=\beta_0+\beta_1x\]

    where:

    • \(\beta_0\) is the y-intercept;

    • \(\beta_1\) is the slope of the regression;

    • \(x\) is the independent variable; and

    • \(\hat{y}\) is the predicted value of the dependent variable.

    For a better reminder of this topic, see our article Least-Squares Regression. Remember that the correlation coefficient \(r\) tells you how much of a correlation there is between the two variables. If \(r\) is close to zero, then there is little to no correlation between the variables, while \(r\) values close to \(-1\) or \(1\) indicate that there is a strong correlation between the two variables.

    On the other hand, the slope \(\beta_1\) represents how much \(\hat{y}\) changes in response to changes in the \(x\)-values; that is, for each unit increase in \(x\), \(\hat{y}\) changes by \(\beta_1\) units.

    Suppose you suspect that an increase in book price means that fewer books will be sold. You collect data, and find the line of best fit to be:

    \[\hat{y}=3500-10x\]

    where \(x\) is the price of the book and \(\hat{y}\) is the predicted number of books sold. What does a \(\$1\) increase in \(x\) mean for the number of books you predict will sell?

    Solution:

    From the equation given you can see that \(\beta_0 = 3500\) and \(\beta_1 = -10\). Notice that the slope of the regression model is negative. That means an increase of \(\$1\) in the book price corresponds to a predicted increase of \(-10\) books sold, or in other words you can predict that 10 fewer books will be sold for every dollar increase in book price.
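
The same reading of the slope can be checked numerically. The snippet below is a minimal sketch, assuming the fitted line \(\hat{y}=3500-10x\) from the example; the function name and the prices \(\$20\) and \(\$21\) are illustrative choices, not part of the original problem.

```python
def predicted_books_sold(price, intercept=3500.0, slope=-10.0):
    """Predicted number of books sold at a given price, using y-hat = 3500 - 10x."""
    return intercept + slope * price

# A $1 price increase changes the prediction by exactly the slope.
change = predicted_books_sold(21) - predicted_books_sold(20)
print(change)  # -10.0
```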

    By calculating a confidence interval with a high confidence level, say \(c\%\), for the slope \(\beta_1\), you get two values that define the limits of a range of values in which you can find the slope. You can say with \(c\%\) confidence that the value of the slope will be between those two values.

    Furthermore, you can say that the method used to construct the interval is successful in capturing the actual slope of the linear regression model about \(c\%\) of the time.
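
A small simulation can make the phrase "about \(c\%\) of the time" more concrete. The sketch below is only an illustration under stated assumptions: it invents a true line \(y = 1 + 2x\) with normally distributed noise, uses ten \(x\)-values (so \(df = 8\)), takes the table value \(t = 2.306\) for \(95\%\) confidence, and applies the interval formula introduced later in this article. All function names are hypothetical.

```python
import math
import random

def slope_interval(xs, ys, t_crit):
    """Return (lower, upper) for the slope using b1 +/- t * SE."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx                       # estimated slope
    b0 = y_bar - b1 * x_bar              # estimated intercept
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))         # standard deviation of the residuals
    margin = t_crit * s / math.sqrt(sxx)
    return b1 - margin, b1 + margin

def coverage(true_slope=2.0, true_intercept=1.0, sigma=1.0, trials=2000, seed=0):
    """Fraction of simulated samples whose 95% interval contains the true slope."""
    rng = random.Random(seed)
    xs = list(range(1, 11))              # n = 10, so df = 8
    t_crit = 2.306                       # t table value for 95% confidence, df = 8
    hits = 0
    for _ in range(trials):
        ys = [true_intercept + true_slope * x + rng.gauss(0, sigma) for x in xs]
        lower, upper = slope_interval(xs, ys, t_crit)
        hits += lower <= true_slope <= upper
    return hits / trials

print(coverage())
```

Under these assumptions the printed fraction should land close to \(0.95\): each individual interval either contains the true slope or it does not, but the method succeeds in roughly \(95\%\) of repeated samples.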

    Conditions for Confidence Interval for the Slope of a Regression Line

    The conditions for constructing a confidence interval for the slope of a linear regression are the same as for constructing a linear regression. These conditions are:

    1. Quantitative variable condition: Correlation only applies if both variables are quantitative.

    2. Straight enough condition: Look at the scatter plot and make sure your data has an approximately linear relationship. Correlation only measures the strength in a linear association. This can also be done by looking at the correlation coefficient of the data.

    3. Independence: The observations should be independent of one another. Data should be collected randomly, and if sampling is done without replacement, the sample size should be less than or equal to \(10\%\) of the total population.

    4. Normal: For any fixed value of \(x\), the values of \(y\) (equivalently, the residuals) are approximately normally distributed.

    Formula of Confidence Interval for Slope of Regression Line

    Like any confidence interval you have studied so far, a confidence interval for the slope \(\beta_1\) of the least squares regression line has the following structure:

    sample statistic – margin of error \(\le \beta_1\le\) sample statistic + margin of error,

    where margin of error = critical value \(\times\) standard error.

    Now, you just have to understand what each of those three elements is for the slope \(\beta_1\):

    • The sample statistic will be \(\hat{\beta}_1\), the point estimator of the slope \(\beta_1\);

    • For the margin of error:

      • this time the critical value will be of a \(t\)-distribution with \(n-2\) degrees of freedom, i.e., \(t\) with \(df=n-2\);

      • the standard error for the slope, written \(SE_{\beta_1}\), will be:\[SE_{\beta_1}=\frac{s}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}}\]where \(s\) is the sample standard deviation of the residuals, calculated as:\[s=\sqrt{\frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}{n-2}}\]

    The reason why you'll be using a critical \(t\) value instead of a critical \(z\) value is that the standard error of the slope \(\hat{\beta}_1\) is an estimate. You might not actually know the standard deviation of the sampling distribution.

    Thus, the formula for a confidence interval for the slope \(\beta_1\) is:

    \[\hat{\beta}_1- t\cdot SE_{\beta_1}\le \beta_1\le \hat{\beta}_1+ t\cdot SE_{\beta_1}\]

    or an even shorter version:

    \[\hat{\beta}_1\pm t\cdot SE_{\beta_1}\]

    This confidence interval is for any confidence level, but confidence levels that you will see most often are \(90\%\), \(95\%\), and \(99\%\). These are the values you should consider when calculating the critical value \(t\).

    Calculations for Confidence Interval for Slope of Regression Line

    From what you have read so far, the formula for a confidence interval for the slope suggests a set of steps you should follow when you want to find it.

    Step 1: Find the sample statistic \(\hat{\beta}_1\).

    You get the value of the point estimator \(\hat{\beta}_1\) by constructing the regression line for the data set you are working with.

    Step 2: Select a confidence level \(c\%\).

    The confidence level describes the uncertainty of a sampling method. You will most often be asked for a confidence level of \(90\%\), \(95\%\), or \(99\%\).

    The purpose of knowing the confidence level is to be able to find the critical value \(t\), by consulting a \(t\) table, with two bits of information:

    1. the degrees of freedom, given by:\[ \text{sample size } -2 = n-2\]where \(n\) is the sample size; and

    2. the confidence level adjusted for the table you are using.

    Depending on the table you consult, the confidence level may have to be adjusted to \(1-\tfrac{\alpha}{2}\) or to \(\tfrac{\alpha}{2} \).

    For example, for a confidence level of \(99\%\), you know that \(c=100(1-\alpha)\%\) and so:

    \[\begin{align} 99\%&=100\%(1-\alpha) \\ 0.99&=1-\alpha \\ \alpha&=0.01 .\end{align}\]

    Now, depending on the table you consult, you'll do:

    \[1-\frac{\alpha}{2}=1-\frac{0.01}{2}=0.995\]

    or

    \[\frac{\alpha}{2} = \frac{0.01}{2}=0.005\]
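
If no \(t\) table is at hand, the critical value can also be approximated numerically. The following is a rough sketch using only the Python standard library, not a replacement for a statistics package: it integrates the \(t\) density with Simpson's rule and searches by bisection for the \(t\) whose cumulative area is \(1-\tfrac{\alpha}{2}\). The function names, the number of integration steps, and the search range are all assumptions made for this illustration.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def t_cdf(t, df, steps=4000):
    """P(T <= t) for t >= 0: Simpson's rule on [0, t] plus the left half (0.5)."""
    if t == 0:
        return 0.5
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence, df):
    """Two-sided critical value: the t with cumulative area 1 - alpha/2."""
    alpha = 1 - confidence
    target = 1 - alpha / 2
    lo, hi = 0.0, 1000.0                 # assumed search range
    for _ in range(60):                  # bisection on the cumulative area
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_critical(0.95, 3), 3))    # 3.182
print(round(t_critical(0.99, 11), 3))   # 3.106
```

The two printed values are the standard table entries for \(95\%\) confidence with \(df=3\) and for \(99\%\) confidence with \(df=11\).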

    Step 3: Find the margin of error \(t\cdot SE_{\beta_1}\).

    As you already know, the margin of error is the product of the critical value \(t\) with the value of the standard error. The formula for the standard error is:

    \[SE_{\beta_1}=\frac{s}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}}\]

    where \(s\) is the sample standard deviation.

    Step 4: Find the confidence interval.

    Here you just have to replace the values you got in the previous step in the formula:

    \[\hat{\beta}_1\pm t\cdot SE_{\beta_1}\]

    Let's look at an example where you can apply the steps by hand.

    Given the data set in the table below,

    x    y
    1    3
    2    4
    2    7
    3    8
    5    9

    Table 1. Example data.

    find a confidence interval of \(95\%\) for the slope knowing that the least squares regression line of this data is:

    \[\hat{y}=2.41+1.46x\]

    the sample variance is \(s^2=2.39\) and \(t=3.182\).

    Solution:

    Step 1: Find the sample statistic \(\hat{\beta}_1\)

    You were given the equation of the regression line, so you know that \(\hat{\beta}_1=1.46\).

    Step 2: Select a confidence level \(c\%\)

    The confidence level is given: \(c=95\%\). You’re also given the critical value \(t=3.182\).

    If you had to consult a \(t\) table, you would first see that \(df=5-2=3\), second that \(95\%=100\%(1-\alpha)\) if and only if \(0.95=1-\alpha\) if and only if \(\alpha=0.05\), and then that \(1-\alpha/2=1-0.05/2=0.975\).

    Step 3: Find the margin of error \(t\cdot SE_{\beta_1}\).

    You know that:

    \[SE_{\beta_1}=\frac{s}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}}\]

    You know \(s^2=2.39\), so the sample standard deviation is \(s=1.55\).

    For the sum in the denominator, you first need the sample mean of the \(x\)-values.

    \[\bar{x}=\frac{1+2+2+3+5}{5}=2.6\]

    Now the sum:

    \[\begin{align} \sum_{i=1}^{n}(x_i-\bar{x})^2&=(1-2.6)^2+(2-2.6)^2+(2-2.6)^2\\&\quad+(3-2.6)^2+(5-2.6)^2 \\ &=9.2 \end{align}\]

    Finally, for the margin of error:

    \[\begin{align} t\cdot SE_{\beta_1}&=3.182\left( \frac{1.55}{\sqrt{9.2}}\right)\\ &=3.182(0.51)\\ &=1.62282. \end{align} \]

    Step 4: Find the confidence interval

    Now just substitute the values you determined in the previous steps into the formula:

    \[\hat{\beta}_1\pm t\cdot SE_{\beta_1}= 1.46\pm 1.62282\]

    which gives you

    \[ -0.16282\le \beta_1 \le 3.08282 \]

    If you have satisfied the conditions for doing a confidence interval for the slope of a regression model, you can say with \(95\%\) confidence that the true value of the slope \(\beta_1\) is between \(-0.16282\) and \(3.08282\).
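
The four steps can also be carried out directly from the raw data in Table 1. The code below is a minimal sketch, assuming the given critical value \(t=3.182\); the function and key names are hypothetical. Unlike the hand calculation, it computes the residual variance from the data instead of taking \(s^2\) as given, and it does not round intermediate results.

```python
import math

xs = [1, 2, 2, 3, 5]
ys = [3, 4, 7, 8, 9]
t_crit = 3.182  # given critical value for 95% confidence, df = 3

def slope_interval_steps(xs, ys, t_crit):
    """Follow Steps 1-4 and return every intermediate quantity."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx                        # Step 1: slope estimate
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s2 = sse / (n - 2)                    # residual variance
    se = math.sqrt(s2 / sxx)              # standard error of the slope
    margin = t_crit * se                  # Step 3: margin of error
    return {
        "slope": b1, "intercept": b0, "sxx": sxx, "s2": s2,
        "se": se, "margin": margin,
        "lower": b1 - margin, "upper": b1 + margin,   # Step 4
    }

result = slope_interval_steps(xs, ys, t_crit)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Run on this data, the unrounded calculation gives \(s^2\approx 2.43\) and an interval of roughly \((-0.18, 3.09)\), slightly wider than the hand calculation, which starts from the given \(s^2=2.39\) and rounds the standard error to \(0.51\).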

    Example of Confidence Interval for Slope of Regression Line

    Let's look at an example of doing the calculations necessary for finding the confidence interval for the slope of a regression line.

    Between \(2010\) and \(2022\), data was collected on the average cost of college textbooks required for a semester that year. That data is in the table below. Find the confidence interval for the slope of the regression line at a \(99\%\) confidence level.

    Year    Average Book Cost (in \(\$\))
    2010    660
    2011    678
    2012    596
    2013    550
    2014    770
    2015    790
    2016    860
    2017    1125
    2018    1100
    2019    1300
    2020    1320
    2021    1369
    2022    1400

    Table 2. Data sample.

    Solution:

    First, draw a scatter plot of the data.

    Figure: scatter plot of average book cost vs. year, showing an approximately linear, increasing relationship.

    It certainly looks reasonable to consider a linear regression model, and there are no obvious outliers. Assume year \(2010\) corresponds to \(x=1\). You can find the correlation coefficient \(r = 0.95\) and the line of best fit \(\hat{y} = 77.1x+ 423.1\). With the correlation coefficient being close to \(1\) you can see there is a strong linear relationship between the year and the average book cost.

    For a reminder of how to find the correlation coefficient and the line of best fit, see Linear Regression and Least-Squares Regression.

    In fact, if you graph the line of best fit you can see immediately that there is a strong linear relationship.

    Figure: scatter plot of average book cost vs. year with the line of best fit.

    Now let's follow the steps to find the confidence interval for the slope of the regression line.

    Step 1: Find the sample statistic \(\hat{\beta}_1\).

    The line of best fit is \(\hat{y} = 77.1x + 423.1\), so \(\hat{\beta}_1 = 77.1\). This is the point estimator for the data.

    Step 2: Select a confidence level \(c\%\).

    The confidence level for this problem is \(99\%\). There are \(13\) data points, which means the degrees of freedom are \(13-2=11\). Consulting a \(t\)-table then gives the critical value \(t = 3.11\).

    Step 3: Find the margin of error \(t\cdot SE_{\beta_1}\).

    To do this you first need to calculate \(s\). Given the equation for the line:

    \[ y_i-\hat{y}_i = y_i - (77.1x_i + 423.1 ) \]

    To make the calculations for \(s\) a little easier to follow it can help to make a table.

    \(x_i\)    \(y_i\)    \(\hat{y}_i\)    \((y_i-\hat{y}_i )^2\)
    1          660        500.2            25536.04
    2          678        577.3            10140.49
    3          596        654.4            3410.56
    4          550        731.5            32942.25
    5          770        808.6            1489.96
    6          790        885.7            9158.49
    7          860        962.8            10567.84
    8          1125       1039.9           7242.01
    9          1100       1117.0           289.00
    10         1300       1194.1           11214.81
    11         1320       1271.2           2381.44
    12         1369       1348.3           428.49
    13         1400       1425.4           645.16

    Table 3. Predicted values and squared residuals.

    Using the formula and the information in the table above:

    \[\begin{align} s &=\sqrt{\frac{\sum_{i=1}^{n}(y_i-\hat{y}_i)^2}{n-2}} \\ &= \sqrt{\frac{\sum_{i=1}^{13}(y_i-\hat{y}_i)^2}{11}} \\ &= \sqrt{\frac{115446.54 }{11}} \\ &\approx 102.4 \end{align}\]

    The \(x\)-values are \(1, 2, \dots, 13\), so \(\bar{x}=7\) and \(\sum_{i=1}^{13}(x_i-\bar{x})^2=182\). Then you have:

    \[\begin{align} SE_{\beta_1}&=\frac{s}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}} \\ &= \frac{102.4}{\sqrt{182}} \\ &\approx 7.59 \end{align} \]

    You have already found the critical value \(t = 3.11\), so:

    \[ \begin{align} \text{margin of error} &= t\cdot SE_{\beta_1} \\ &= (3.11)(7.59 ) \\ &\approx 23.6 \end{align}\]

    Step 4: Find the confidence interval

    Substituting the values you found in the previous steps into the formula:

    \[\hat{\beta}_1\pm t\cdot SE_{\beta_1}= 77.1\pm 23.6\]

    which gives you a confidence interval of \( (53.5, 100.7) \).

    If you have satisfied the conditions for doing a confidence interval for the slope of a regression model, you can say with \(99\%\) confidence that the true value of the slope \(\beta_1\) is between \(53.5 \) and \(100.7 \).
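
For a data set of this size, it is easy to slip in the arithmetic, so a short script can serve as a cross-check. The sketch below recomputes everything from the Table 2 data, assuming \(x=1\) for \(2010\) and the table value \(t=3.106\) for \(99\%\) confidence with \(df=11\); the function name is hypothetical.

```python
import math

costs = [660, 678, 596, 550, 770, 790, 860,
         1125, 1100, 1300, 1320, 1369, 1400]   # 2010 through 2022
years = list(range(1, len(costs) + 1))          # x = 1 corresponds to 2010
t_crit = 3.106                                  # t table value: 99%, df = 11

def fit_and_interval(xs, ys, t_crit):
    """Least-squares fit plus a confidence interval for the slope."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)        # correlation coefficient
    b1 = sxy / sxx                        # slope estimate
    b0 = y_bar - b1 * x_bar               # intercept estimate
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))          # standard deviation of the residuals
    se = s / math.sqrt(sxx)               # standard error of the slope
    margin = t_crit * se
    return r, b0, b1, s, se, (b1 - margin, b1 + margin)

r, b0, b1, s, se, interval = fit_and_interval(years, costs, t_crit)
print(f"r = {r:.3f}, line: y-hat = {b1:.1f}x + {b0:.1f}")
print(f"s = {s:.1f}, SE = {se:.2f}")
print(f"99% interval for the slope: ({interval[0]:.1f}, {interval[1]:.1f})")
```

The printed values can be compared with the hand calculation above; small differences in the last digit come from rounding intermediate results by hand.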

    Confidence Intervals for the Slope of a Regression Model – Key takeaways

    • By calculating a confidence interval with a high confidence level, say \(c\%\), for the slope \(\beta_1\), you get two values that define the limits of a range of values in which you can find the slope. You can say with \(c\%\) confidence that the value of the slope will be between those two values.
    • You can say that the method used to construct the interval is successful in capturing the actual slope of the linear regression model about \(c\%\) of the time.
    • The formula for the confidence interval for the slope of a regression model is \[\hat{\beta}_1\pm t\cdot SE_{\beta_1}\, ,\] where
      • \(\hat{\beta}_1\) is the estimate of the slope \(\beta_1\)
      • \(t\cdot SE_{\beta_1}\) is the margin of error
      • \(t\) is the critical value from the \(t\)-distribution with parameter \(df=n-2\) (\(n-2\) degrees of freedom)
      • \(SE_{\beta_1}\) is the standard error for the slope
    Frequently Asked Questions about Confidence Interval for Slope of Regression Line

    How to interpret confidence interval for slope of regression line?

    If you repeated the sampling many times, about c% of the intervals constructed this way would contain the true value of the slope β1 that you're estimating.

    What is the confidence interval for the slope of a regression line?

    It is a range of values, built around the estimated slope β1*, in which you have c% confidence that the true value of the slope β1 lies.

    What is an example of a confidence interval for the slope of a regression line?

    For a small data set like

    x  1  2  2  3  5

    y  3  4  7  8  9

    the confidence interval for the slope is 

    -0.16282 ≤ β1 ≤ 3.08282

    How to calculate the confidence interval for the slope of a regression line?

    To calculate the confidence interval for the slope, follow these steps:

       Step 1: Find the slope estimate, β1*

       Step 2: Select a confidence level c%

       Step 3: Find the margin of error t×SEβ1

       Step 4: Find the confidence interval

    What is the formula for the confidence interval for the slope of a regression line?

    The formula is β1* ± t×SEβ1, where β1* is the slope estimate, t is the critical value, and SEβ1 is the standard error of the slope.
