For example, we could collect data on outside temperature versus ice cream sales, or we could study height versus shoe size; both are examples of bivariate data. If the data showed that higher outside temperatures go with higher ice cream sales, shops could use this to stock more ice cream during hot spells in the summer.
How do we represent bivariate data?
We use scatter graphs to represent bivariate data. A scatter graph of bivariate data is a two-dimensional graph with one variable on one axis, and the other variable on the other axis. We then plot the corresponding points on the graph. We can then draw a regression line (also known as a line of best fit), and look at the correlation of the data (which direction the data goes, and how close to the line of best fit the data points are).
Drawing a scatter graph
Step 1: Draw a set of axes and choose an appropriate scale for the data.
Step 2: Label the x-axis with the explanatory / independent variable (the variable that we choose or that varies freely), and the y-axis with the response / dependent variable (the variable that we suspect changes in response to the independent variable). Also give the graph a title describing what it shows.
Step 3: Plot the data points on the graph.
Step 4: Draw the line of best fit, if required.
Here is a set of data relating the temperature on days in July to the number of ice creams sold in a corner shop.
Temperature (°C) | 14 | 16 | 15 | 16 | 23 | 12 | 21 | 22 |
Ice cream sales | 16 | 18 | 14 | 19 | 43 | 12 | 24 | 26 |
In this case, the temperature is the independent variable, and ice cream sales are the dependent variable. This means that we plot temperature on the x-axis, and ice cream sales on the y-axis. The resulting graph should look as follows.
Graph of Ice cream sales against temperature - StudySmarter Originals
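The drawing steps can be sketched in code. This is a minimal illustration, not the method used to produce the graph above: it assumes the third-party matplotlib library is available, and the names `paired_points`, `draw_scatter` and the output file `scatter.png` are hypothetical.

```python
# Data from the table above: temperature (°C) and ice creams sold.
temperatures = [14, 16, 15, 16, 23, 12, 21, 22]
ice_cream_sales = [16, 18, 14, 19, 43, 12, 24, 26]


def paired_points(xs, ys):
    """Pair each x value with its corresponding y value."""
    if len(xs) != len(ys):
        raise ValueError("bivariate data needs the same number of x and y values")
    return list(zip(xs, ys))


def draw_scatter(xs, ys, x_label, y_label, title, path="scatter.png"):
    """Follow Steps 1 to 3: axes, labels, then the plotted points."""
    # Assumption: matplotlib is installed. It is imported here so that
    # paired_points still works without it.
    import matplotlib
    matplotlib.use("Agg")  # draw to a file rather than a window
    import matplotlib.pyplot as plt

    paired_points(xs, ys)       # check that the data really is paired
    fig, ax = plt.subplots()    # Step 1: axes (matplotlib picks the scale)
    ax.set_xlabel(x_label)      # Step 2: independent variable on the x-axis
    ax.set_ylabel(y_label)      #         dependent variable on the y-axis
    ax.set_title(title)
    ax.scatter(xs, ys)          # Step 3: plot the points
    fig.savefig(path)
    plt.close(fig)
    return path
```

Calling `draw_scatter(temperatures, ice_cream_sales, "Temperature (°C)", "Ice cream sales", "Ice cream sales against temperature")` would save the graph to the file given by `path`.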
The following data records a car journey, with time and distance travelled both measured from the start of the journey:
Time (in hours) | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
Distance (km) | 12 | 17 | 18 | 29 | 35 | 51 | 53 | 60 |
In this case, time is the independent variable, and distance is the dependent variable. This means that we plot time on the x-axis, and distance on the y-axis. The resulting graph should look as follows.
Graph of distance against time - StudySmarter Originals
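For Step 4, one common way to place the line of best fit is the least-squares method. The sketch below assumes that method (the text does not say how the line should be drawn), and the function name `least_squares_line` is hypothetical.

```python
# Data from the table above: time (hours) and distance travelled (km).
hours = [1, 2, 3, 4, 5, 6, 7, 8]
distance_km = [12, 17, 18, 29, 35, 51, 53, 60]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how x varies on its own.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line(hours, distance_km)
print(f"distance ≈ {slope:.2f} * time + {intercept:.2f}")
# prints: distance ≈ 7.39 * time + 1.11
```

Under this straight-line model, the slope suggests the car covered roughly 7.4 km for each additional hour of the journey.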
What is the meaning of correlation and regression for bivariate data?
Correlation describes the relationship between two variables. We measure it with a correlation coefficient on a scale from -1 to 1. A negative value is called a negative correlation (as one variable increases, the other tends to decrease), and a positive value is called a positive correlation (both variables tend to increase together). The closer the coefficient is to either end of the scale, the stronger the relationship; the closer it is to zero, the weaker the relationship. A correlation of zero means there is no linear relationship between the two variables.
Regression is when we draw a line of best fit for the data. The line of best fit is chosen to minimize the overall distance between the data points and the line (the usual least-squares line minimizes the sum of the squared vertical distances). Correlation then measures how closely the data points follow this line. If we find a strong correlation between two variables, we can say they have a strong linear relationship. This suggests that one variable may influence the other, but correlation alone does not prove it.
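To make the -1 to 1 scale concrete, the sketch below computes a correlation coefficient for the two data sets used earlier. It assumes the Pearson product-moment coefficient, which the text does not name explicitly, and the function name `pearson_r` is hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired data, a value between -1 and 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Temperature (°C) against ice cream sales.
r_ice_cream = pearson_r([14, 16, 15, 16, 23, 12, 21, 22],
                        [16, 18, 14, 19, 43, 12, 24, 26])
# Time (hours) against distance travelled (km).
r_journey = pearson_r([1, 2, 3, 4, 5, 6, 7, 8],
                      [12, 17, 18, 29, 35, 51, 53, 60])

print(round(r_ice_cream, 2), round(r_journey, 2))
# prints: 0.88 0.98
```

Both values are close to 1, so both data sets show a strong positive correlation, with the journey data lying closer to its line of best fit.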
Bivariate data - Key takeaways
- Bivariate data is a collection of two data sets, where each value in one data set is paired with a value in the other.
- We use a scatter graph to show bivariate data.
- The correlation of bivariate data describes the direction and strength of the relationship between the two variables.