The Meaning of Inferences in Statistics
Statistics is defined as a discipline in applied mathematics concerned with the systematic study of the collection, presentation, analysis, and interpretation of data. Summarizing and describing collected data with different techniques and methods is called descriptive statistics. Once the data has been described, what comes next? That is where inferences in statistics come in. Inferential statistics is the branch of statistics that deals with drawing sound conclusions, interpretations, and predictions from the analyzed data.
Inferences in statistics are techniques employed to examine the results of data to arrive at conclusions, interpretations, and predictions. Inferences in statistics are also referred to as inferential statistics or statistical inference.
Inferences in statistics can help you make predictions and conclusions about the populations you are looking at by interpreting the results of random samples from that population. The two main applications of inferential statistics that help us to draw these conclusions are hypothesis testing and confidence intervals of the data.
Statistical inferences are dependent on three main components:
In general, you need a way to talk about the difference between the entire group you are looking at and the specific people who answered a survey or were part of a study.
The population refers to a group of units (persons, objects, or other items) enumerated in a census or from which a sample is drawn.
You can look at various sub-groups of the population. These sub-groups are referred to as samples.
A sample is defined as a subset of a population selected for measurement, observation, or questioning, to provide statistical information about the population.
To conduct statistical inference, the following conditions must be met:
The data for the experiment should be obtained through random samples or randomized experiments
The distribution of the sample means must be approximately normal
Individual observations must be independent
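The second condition can be explored with a small simulation. The sketch below is only an illustration under assumed settings (an exponential population with mean 1, samples of size 50, and 2,000 repetitions; none of these values come from the text above). It draws repeated random samples and summarizes the resulting sample means, which tend to cluster around the population mean with a spread close to the population standard deviation divided by the square root of the sample size.

```python
import random
import statistics

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` random samples of size `n` from a skewed (exponential)
    population with mean 1 and return the list of sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]

# Assumed settings for illustration only.
means = simulate_sample_means(n=50, reps=2000)
print(f"mean of sample means: {statistics.fmean(means):.3f}")
print(f"sd of sample means:   {statistics.stdev(means):.3f}")
print(f"1 / sqrt(50):         {50 ** -0.5:.3f}")
```

Even though individual observations are strongly skewed here, the sample means behave much more like a normal distribution, which is why many inference procedures focus on the distribution of the sample mean rather than on the raw data.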
Methods for Inference in Statistics
There are two main methods for making inferences in statistics: hypothesis tests and confidence intervals. Hypothesis tests assess whether the sample data provide enough evidence to reject a statement about the population. Confidence intervals involve the creation of a range within which the value of a parameter is expected to be found, with a stated degree of confidence.
These are general steps that could be followed to make statistical inferences:
Plan and design your study
Collect data
Analyze data
Interpret the results
Present the results.
Let's look at a quick example of these steps.
There are fifty states in the United States, and overall, a population of more than 300 million people. Let's say the government wants to determine the average age of the population to gain insights into changing population conditions and social and economic trends.
Planning the Study: They definitely cannot go door to door asking the age of every single person in the United States! However, they can use more strategic methods and statistical inference to arrive at values very close to those of the population, and this will form the plan and strategy of the study. One of the things the government would need to be concerned with is sources of bias in surveys, to ensure the accuracy of the data.
Collecting Data: This can involve looking at census data or taking a random sample of people in the United States and asking the ages of people in their families. Take a look at the articles Random Sampling and Survey Sampling Methods for more information.
Analyzing the Data: The goal is to estimate the average (or mean) of a population, so the appropriate analysis would be inference for a population mean, such as a confidence interval or a hypothesis test for a population mean.
Interpreting the Results: This step is especially important! Often you will see things like "the approval rating is \(54 \% \pm 3\%\)", meaning that the rating isn't exactly \(54\%\), but they can say with some degree of certainty that it is within \(3\%\) of \(54\%\). Being able to justify the claims you make is a big part of inference in statistics.
Presenting the Results: Once the average age is determined, it needs to be presented in such a way that other people (newscasters, bloggers, etc.) can understand it and explain it to other people.
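The analysis and interpretation steps for the average-age example can be sketched in a few lines. This is a minimal illustration, not the method a government agency would actually use: the ages are made-up values, the function name `mean_confidence_interval` is hypothetical, and the interval uses a normal approximation (for a sample this small, a t-based interval would usually be preferred).

```python
from statistics import NormalDist, fmean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    n = len(sample)
    center = fmean(sample)
    standard_error = stdev(sample) / n ** 0.5
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * standard_error
    return center - margin, center + margin

# Hypothetical ages from a small random sample (made-up values).
ages = [34, 51, 8, 27, 62, 45, 19, 73, 38, 29, 56, 41]
low, high = mean_confidence_interval(ages)
print(f"sample mean: {fmean(ages):.1f}")
print(f"95% interval: ({low:.1f}, {high:.1f})")
```

Reporting the interval alongside the sample mean is what allows a statement of the form "the average age is about this value, give or take that much".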
Types of Inferences in Statistics
Inferences in statistics can be done in several ways, with one of the most frequently seen being hypothesis testing.
A hypothesis is an assumption taken to be true for argument or investigation. An example of a hypothesis would be that the president's approval rating has declined since last year.
Hypothesis testing refers to the process of testing these assumptions and drawing conclusions about parameters from a sample regarding the population. It is done to assess the credibility of a certain hypothesis using data from a sample.
You can look at the article Hypothesis Testing for further information on what a hypothesis really is and how the testing is done.
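To make the approval-rating hypothesis concrete, here is a hedged sketch of one common way to test it: a one-sided two-proportion z-test. The poll counts are invented for illustration, and the function name `two_proportion_z_test` is hypothetical; the article does not specify which test should be used.

```python
from statistics import NormalDist

def two_proportion_z_test(successes_before, n_before, successes_after, n_after):
    """One-sided z-test of H0: no change vs. Ha: the proportion declined."""
    p_before = successes_before / n_before
    p_after = successes_after / n_after
    # Under H0 both polls share one proportion, so pool the counts.
    pooled = (successes_before + successes_after) / (n_before + n_after)
    standard_error = (pooled * (1 - pooled) * (1 / n_before + 1 / n_after)) ** 0.5
    z = (p_after - p_before) / standard_error
    p_value = NormalDist().cdf(z)  # lower tail: evidence of a decline
    return z, p_value

# Hypothetical polls: 540 of 1000 approved last year, 500 of 1000 this year.
z, p_value = two_proportion_z_test(540, 1000, 500, 1000)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.3f}")
```

A small p-value would suggest that a drop this large is unlikely to arise from sampling variation alone if the true approval rating had not changed.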
Another method used in inferences is making and using confidence intervals. A confidence interval is a range of values within which you can conclude, with reasonable certainty, that the real value lies. You might have seen this in political commentary when someone says something along the lines of "the candidate is leading by \(18\) points, plus or minus \(2\%\)". That would mean that they have constructed a confidence interval for the lead of the candidate, and that the lead is estimated to lie within \(2\) percentage points of \(18\%\). Depending on what you are measuring, you would do one of the following kinds of intervals:
- Confidence Intervals for a Population Proportion
- Confidence Intervals for a Population Mean
- Confidence Intervals for the Difference of Two Proportions
- Confidence Intervals for the Slope of a Regression Model
- Confidence Intervals for the Difference of Two Means
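As one instance of the intervals listed above, the following sketch computes a confidence interval for a population proportion using the normal approximation (often called the Wald interval). The poll counts are hypothetical, and `proportion_confidence_interval` is an illustrative name rather than a library function.

```python
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Wald (normal-approximation) interval for a population proportion.

    Returns the point estimate and the margin of error.
    """
    p_hat = successes / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat, margin

# Hypothetical poll: 540 of 1000 respondents approve.
p_hat, margin = proportion_confidence_interval(540, 1000)
print(f"estimate: {p_hat:.1%} plus or minus {margin:.1%}")
```

With these made-up counts the margin comes out at roughly three percentage points, which is the kind of "plus or minus" figure quoted alongside poll results.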
As you have already seen in the article Data Analysis, sometimes the data collected isn't numerical. It could be categorical, such as in surveys. If you would like to draw inferences from categorical data, then you will generally use the Chi-Square Distribution. For more information on this kind of inference, see the article Inference for Distributions of Categorical Data.
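For categorical data, the chi-square statistic compares the observed counts in a table with the counts that would be expected if the row and column categories were unrelated. The sketch below is a minimal illustration with an invented 2x2 survey table; `chi_square_statistic` is a hypothetical helper, and the result is compared with 3.841, the 5% critical value of the chi-square distribution with one degree of freedom.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

# Hypothetical survey: rows are two groups, columns are yes/no answers.
survey = [[30, 20], [20, 30]]
statistic, dof = chi_square_statistic(survey)
CRITICAL_5_PERCENT_DF1 = 3.841  # chi-square critical value, 1 degree of freedom
print(f"chi-square = {statistic:.2f}, degrees of freedom = {dof}")
print("reject independence at the 5% level:", statistic > CRITICAL_5_PERCENT_DF1)
```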
Causal Inference in Statistics Solutions
What do you think of when you hear the term causal inference?
Causal inference is the process of concluding that a particular treatment given to the independent variable was the cause of the effect observed in the dependent variable.
An academic field known as causal inference examines the presumptions, research plans, and estimating techniques that enable researchers to infer causal relationships from data. Here, the treatment given to the independent variable is known as the intervention, while the effect observed in the dependent variable is the outcome.
Causal inference is when one deduces that something is or is most likely to be the cause of another. For instance, one may assume that someone is (or was) playing piano based on the sound of piano music.
However, correlation may be mistaken for causation. When certain variables show a relationship or association, this should not be taken to mean that one directly affects the other, as a third variable may be involved. For instance, the fact that cucumbers and tomatoes both have higher production in the same year does not mean that the yield of one causes the yield of the other.
They are both associated with another variable, which is climate.
Nonetheless, if deliberately changing one variable, while other factors are held constant, consistently produces a change in the other variable, then this supports a cause-and-effect relationship between the two. There are ways to design experiments so that as many outside effects as possible are eliminated. For more information on these techniques, see Experiment Methods, Sources of Bias in Experiments, and Randomized Block Design.
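The cucumber and tomato example can be imitated with simulated data. In the sketch below, every number is made up: a "climate" variable drives both yields, and the yields never influence each other. The helper names `pearson` and `residuals` are illustrative. The raw correlation between the two yields is strong, but it largely disappears once the part explained by climate is removed from each.

```python
import random
import statistics

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def residuals(y, x):
    """What is left of y after subtracting a least-squares line fitted on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]

rng = random.Random(1)
climate = [rng.gauss(0, 1) for _ in range(2000)]
# Both yields depend on climate plus independent noise, never on each other.
cucumbers = [2 * c + rng.gauss(0, 1) for c in climate]
tomatoes = [3 * c + rng.gauss(0, 1) for c in climate]

raw = pearson(cucumbers, tomatoes)
adjusted = pearson(residuals(cucumbers, climate), residuals(tomatoes, climate))
print(f"raw correlation: {raw:.2f}")
print(f"correlation after adjusting for climate: {adjusted:.2f}")
```

In real studies the confounding variable is often unmeasured, which is one reason randomized experiments are preferred for causal claims.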
Power Function in Statistics Inference
A power function relates the true value of a parameter to the probability of rejecting a null hypothesis about the value of that parameter. See the article Errors in Hypothesis Testing for more information on the types of errors in hypothesis tests and what can cause them.
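As a hedged illustration, consider a one-sided z-test of \(H_0: \mu = \mu_0\) against \(H_a: \mu > \mu_0\) with known standard deviation \(\sigma\) and sample size \(n\). For that specific test, the power at a true mean \(\mu\) is \(1 - \Phi\left(z_{1-\alpha} - (\mu - \mu_0)\sqrt{n}/\sigma\right)\), where \(\Phi\) is the standard normal distribution function. The values of \(\mu_0\), \(\sigma\), and \(n\) below are assumptions chosen only for the example.

```python
from statistics import NormalDist

def power(true_mean, null_mean, sigma, n, alpha=0.05):
    """Probability that a one-sided z-test (Ha: mean > null_mean) rejects H0."""
    std_normal = NormalDist()
    critical_z = std_normal.inv_cdf(1 - alpha)
    # How many standard errors the true mean lies above the null value.
    shift = (true_mean - null_mean) * n ** 0.5 / sigma
    return 1 - std_normal.cdf(critical_z - shift)

# Assumed setting: null mean 100, sigma 15, sample size 36.
for mu in (100, 102, 105, 110):
    print(mu, round(power(mu, null_mean=100, sigma=15, n=36), 3))
```

When the true mean equals the null value, the power equals the significance level, and it rises toward 1 as the true mean moves further from the null value.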
Examples of Inference in Statistics
Let's take a look at an example of inference in statistics.
Suppose you are interested in seeing if there is a relationship between the number of hours of sleep someone gets and how good their grades are. To answer the question, you select random people in your class (this would be your sample) and ask them how many hours of sleep they get in a night and what their grade in the class is. You can then use this sample of the whole class (the whole class is the population) to make a hypothesis about the number of hours of sleep and the relationship this has to grades and do a hypothesis test to check the results. From there, you can make an inference about the population based on your sample.
Let's look at another example.
A drug manufacturer wants to test a new product that they hope will cure cancer. Naturally, they start by testing it on mice rather than people. They select one group of mice with cancer and a second group of mice without cancer. Some mice in each group get the new product (these are the treatment groups), and some do not (these are the control groups). They can then measure how the mice who get the drug fare and compare that with how the mice who didn't get the drug fare.
This is an example of conducting an experiment, and the manufacturer would need to do hypothesis testing with two samples to see if their drug is effective. From there, they can draw an inference and decide if they want to continue pursuing development of the drug.
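One simple way to compare a treatment group with a control group is a permutation test, sketched below. This is a swapped-in illustration rather than the specific two-sample test the manufacturer would necessarily use, and the tumor sizes are invented. The idea is that if the drug had no effect, the group labels would be interchangeable, so the observed difference in means is compared with the differences obtained after randomly reshuffling the labels.

```python
import random
from statistics import fmean

def permutation_test(group_a, group_b, reps=5000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(fmean(group_a) - fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)
        diff = abs(fmean(pooled[:n_a]) - fmean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (reps + 1)

# Hypothetical tumor sizes (arbitrary units) after the trial; made-up values.
treated = [1.1, 0.9, 1.3, 1.0, 0.8, 1.2]
control = [2.1, 1.9, 2.4, 2.0, 2.2, 1.8]
p_value = permutation_test(treated, control)
print(f"estimated p-value: {p_value:.4f}")
```

A small estimated p-value indicates that a difference as large as the observed one rarely arises from relabeling alone, which counts as evidence that the treatment made a difference.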