What is the Meaning of Statistical Graphs?
Data is usually gathered as numbers, words, or characters, which can be organized in tables according to the context. But taking a look at a massive table does not tell you much; you would have to pay close attention to each inquiry. You might even need to do some calculations to compare two inquiries! This is impractical.
One way of having a clearer understanding of what the data is telling you is by organizing it into statistical graphs.
A statistical graph is a graph that organizes data, allowing a clearer visualization.
This definition is rather general, as there are many ways of organizing data, so there are a lot of different statistical graphs that you can use. Depending on the context, you might want to choose one over another to display your data.
Here, you can take a look at the different types of statistical graphs, so you can pick the one that best fits your needs for data display!
Importance of Statistical Graphs
Before talking about the different types of statistical graphs, you need to understand why it is important to display data in statistical graphs. There are three main advantages that you can gain from an adequate display of your data:
- Raw data might contain hidden patterns and relationships that you cannot identify by just looking at the raw data. These will be revealed using a picture.
- A display of data will help you identify the most significant features of your data.
- You will be able to communicate the data in a simpler way.
Whenever you are given the chance of displaying data using a graph, take it. Most statistical software nowadays can display and organize data in an easy and straightforward way.
Displaying Categorical Data
Begin by recalling what categorical data is about.
Categorical data is data whose properties are described or labeled.
Some examples of categorical data are things like flavor, color, race, zip codes, names, and so on.
Within the context of statistical graphs, whenever you are dealing with categorical data, you will be counting how many inquiries fall within each category. This number you count is known as frequency, and whenever you are going to display categorical data, you first need to get your hands on a frequency table.
A frequency table is a record of the different categories (or values) along with their frequency.
Frequency tables can be used for either categorical or quantitative data.
Here is an example that will be used as a starting point for the different types of statistical graphs.
Two of your friends are excellent cooks, so they decide to start up a business to make some extra money during summer. They decide to sell artisan ice cream, but since they will be working in a small kitchen, they will not be able to sell a wide variety of ice cream flavors.
To decide which flavors they should focus on, you run a survey around your neighborhood asking for favorite ice cream flavors. You organize data into the following frequency table.
Flavor | Frequency |
Chocolate | \(15\) |
Vanilla | \(14\) |
Strawberry | \(9\) |
Mint-Chocolate | \(3\) |
Cookie Dough | \(9\) |
Table 1. ice cream flavors, statistical graphs.
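The counting step behind a frequency table can be sketched in a few lines of Python. The list `responses` is hypothetical: the article only gives the final counts, so the raw answers are reconstructed here to match Table 1.

```python
from collections import Counter

# Hypothetical raw survey answers, reconstructed so the counts match Table 1.
responses = (
    ["Chocolate"] * 15
    + ["Vanilla"] * 14
    + ["Strawberry"] * 9
    + ["Mint-Chocolate"] * 3
    + ["Cookie Dough"] * 9
)

# Counter tallies how many times each category appears in the list.
frequency_table = Counter(responses)

for flavor, count in frequency_table.items():
    print(f"{flavor:<15} {count}")

# The total number of inquiries is the sum of all the frequencies.
total_responses = sum(frequency_table.values())
print("Total:", total_responses)
```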
As you are going back with your friends to communicate your findings, you realize they might be tired because of the kitchen set-up. Because of this, you first decide to make a friendlier display of data, so they do not have to look at raw numbers.
It is time to see what options you have for displaying your ice cream flavor survey.
Bar Charts
Bar charts are pretty straightforward. You line up the different categories of your survey and draw the bars depending on the frequency of each category. The higher the frequency, the taller the bar.
There are two ways of drawing bar charts: Using vertical bars and using horizontal bars.
The most common bar charts are those that use vertical bars. To draw a vertical bar chart, you first need to write the different categories on the horizontal axis and then the range of frequencies on the vertical axis. For your ice cream flavors example, it will look like this:
Figure 1. Empty bar chart
Next, you will need to draw bars whose height goes all the way up to the frequency of each category. Usually, different colors are used, and the width of the bars is chosen such that the bars are not adjacent to each other.
Figure 2. Vertical bar chart of the favorite flavors of ice cream of your neighbors
To draw a horizontal bar chart you follow the same idea, but now the categories are aligned vertically, while the frequencies are aligned horizontally.
Figure 3. Horizontal bar chart of the favorite flavors of ice cream of your neighbors
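As a rough illustration of the idea, here is a minimal Python sketch that prints a horizontal bar chart as plain text. The function name `horizontal_bar_chart` and the use of `#` characters as bars are choices made for this example only. A plotting library would normally be used for a real figure.

```python
# Frequencies taken from Table 1.
frequencies = {
    "Chocolate": 15,
    "Vanilla": 14,
    "Strawberry": 9,
    "Mint-Chocolate": 3,
    "Cookie Dough": 9,
}


def horizontal_bar_chart(table):
    """Return one text line per category, with a bar as long as its frequency."""
    label_width = max(len(label) for label in table)
    lines = []
    for label, count in table.items():
        bar = "#" * count  # one character per counted inquiry
        lines.append(f"{label:<{label_width}} | {bar} {count}")
    return lines


chart_lines = horizontal_bar_chart(frequencies)
print("\n".join(chart_lines))
```

Each category sits on its own row and the bar length grows with the frequency, which is exactly the layout of a horizontal bar chart.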
Pie Charts
Pie charts are a very common way of displaying data. They picture the whole population as a circle, which is segmented into the different categories of your survey. The bigger the frequency of a category, the bigger the portion of the circle.
Because pie charts divide a circle into sectors, they are also known as sector charts.
To make a pie chart, you will need to make a relative frequency table, which is the same frequency table but with a column that shows the relative frequency of each category.
You can find the relative frequency by dividing the respective frequency by the total number of inquiries (which is equal to the sum of all the frequencies).
To find the relative frequency of the chocolate flavor, you first need to note that your survey consists of \(50\) inquiries. Then, you need to divide the frequency of the chocolate flavor by this number, that is
\[ \frac{15}{50} = 0.3\]
Usually, you will need to write this as a percentage, so multiply it by \(100\). This means that the relative frequency is \(30 \%\).
This relative frequency corresponds to the percentage of the population that falls within each category. Here is a table with the relative frequency of the rest of the ice cream flavors.
Flavor | Frequency | Relative Frequency |
Chocolate | \[15\] | \[30 \% \] |
Vanilla | \[14\] | \[28 \% \] |
Strawberry | \[9\] | \[ 18 \% \] |
Mint-Chocolate | \[3\] | \[ 6 \% \] |
Cookie Dough | \[9\] | \[ 18 \% \] |
Table 2. ice cream flavors, statistical graphs.
Be sure that the relative frequencies add up to \( 100 \% \).
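The calculation and the final check can be sketched in Python as follows. The variable names are illustrative. The numbers come from Table 1.

```python
# Frequencies taken from Table 1.
frequencies = {
    "Chocolate": 15,
    "Vanilla": 14,
    "Strawberry": 9,
    "Mint-Chocolate": 3,
    "Cookie Dough": 9,
}

# Total number of inquiries: the sum of all the frequencies.
total = sum(frequencies.values())

# Relative frequency as a percentage: frequency / total * 100.
relative_frequencies = {
    flavor: 100 * count / total for flavor, count in frequencies.items()
}

for flavor, percent in relative_frequencies.items():
    print(f"{flavor:<15} {percent:5.1f} %")

# Sanity check: the percentages should add up to 100 (up to rounding error).
percent_sum = sum(relative_frequencies.values())
print("Sum:", percent_sum)
```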
Now that you know the relative frequencies of each category, you can proceed to draw the pie chart. Remember that the relative frequency tells you the percentage of the circle of each category.
Figure 4. Pie chart of the favorite flavors of ice cream of your neighbors
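If you draw the pie chart by hand, it helps to turn each relative frequency into an angle. A full circle has 360 degrees, so a category takes the same fraction of 360 degrees as its share of the survey. The following Python sketch, with illustrative variable names, computes those angles from Table 1.

```python
# Frequencies taken from Table 1.
frequencies = {
    "Chocolate": 15,
    "Vanilla": 14,
    "Strawberry": 9,
    "Mint-Chocolate": 3,
    "Cookie Dough": 9,
}

total = sum(frequencies.values())

# Each sector gets the same fraction of 360 degrees as its share of the total.
sector_angles = {
    flavor: 360 * count / total for flavor, count in frequencies.items()
}

for flavor, angle in sector_angles.items():
    print(f"{flavor:<15} {angle:6.1f} degrees")
```

For instance, chocolate has a relative frequency of 30 %, so its sector spans 108 degrees.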
Segmented Bar Charts
Segmented bar charts are practically a hybrid between a bar chart and a pie chart, closer to a pie chart. Instead of using a circle and dividing it into sectors, you divide a big bar into segments, where each segment represents a category.
Segmented bar charts are typically used when you need to compare two or more data sets. In the ice cream example, suppose you want to expand your survey to the next neighborhood, so that you have a better picture of which ice cream flavors your friends should focus on. Here is a table of the survey on neighborhood \(B\).
Flavor | Frequency | Relative Frequency |
Chocolate | \[16\] | \[32 \%\] |
Vanilla | \[12\] | \[ 24\%\] |
Strawberry | \[7\] | \[ 14\%\] |
Mint-Chocolate | \[5\] | \[ 10\%\] |
Cookie Dough | \[10\] | \[ 20\%\] |
Table 3. ice cream flavors, statistical graphs.
Since the goal of segmented bar charts is to compare two data sets, a table with the relative frequency of both neighborhoods will be very useful.
Flavor | Relative Frequency \(A\) | Relative Frequency \(B\) |
Chocolate | \[30 \%\] | \[32 \%\] |
Vanilla | \[28 \%\] | \[24 \%\] |
Strawberry | \[18 \%\] | \[14 \%\] |
Mint-Chocolate | \[6 \%\] | \[10 \%\] |
Cookie Dough | \[18 \%\] | \[20 \%\] |
Table 4. ice cream flavors, statistical graphs.
You can now draw the segmented bar chart. Usually, the two data sets are put next to each other for ease of comparison.
Figure 5. Segmented bar chart of the favorite flavors of ice cream of two neighborhoods
Segmented bar charts usually display the relative frequency of the data, so you will also need a table with relative frequencies to draw a segmented bar chart. You can also use segmented bar charts to represent the actual frequencies of your data; you just need to make sure that you use an adequate scale.
If the two data sets are obtained from a different number of inquiries, you should probably stick to relative frequencies. This way both data sets will remain on the same scale.
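To see where each segment starts and ends on a bar that runs from 0 % to 100 %, you can accumulate the relative frequencies. The Python sketch below does this for both neighborhoods using the values of Table 4. The helper name `segment_bounds` is made up for this example.

```python
from itertools import accumulate

# Relative frequencies (in percent) taken from Table 4.
relative_a = {
    "Chocolate": 30,
    "Vanilla": 28,
    "Strawberry": 18,
    "Mint-Chocolate": 6,
    "Cookie Dough": 18,
}
relative_b = {
    "Chocolate": 32,
    "Vanilla": 24,
    "Strawberry": 14,
    "Mint-Chocolate": 10,
    "Cookie Dough": 20,
}


def segment_bounds(percentages):
    """Return (category, start, end) tuples that stack the segments from 0 to 100."""
    ends = list(accumulate(percentages.values()))  # running totals
    starts = [0] + ends[:-1]  # each segment starts where the previous one ended
    return list(zip(percentages, starts, ends))


for name, table in (("A", relative_a), ("B", relative_b)):
    print(f"Neighborhood {name}")
    for flavor, start, end in segment_bounds(table):
        print(f"  {flavor:<15} {start:>3} % to {end:>3} %")
```

Because both bars end at 100 %, the segments can be compared directly even if the two surveys had different numbers of inquiries.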
Displaying Quantitative Data
It is time to see what quantitative data is about.
Quantitative data is data that can be measured or counted.
Some examples of quantitative data are things like age, height, weight, length, volume, and so on.
For quantitative data, it would be impractical to display each possible value using, for example, a bar chart. Suppose you are measuring the heights of your classmates. These values will typically vary from \(64\) to about \(74\) inches (more or less). But since this is measurable data, you will deal with plenty of values, so you would need to include many bars to represent this!
Instead, you can work with ranges, that is, you can take into account people whose heights are between \(64\) and \(66\) inches and let them fall into the same place.
A typical quantitative variable is height.
Suppose you want to do a survey about the heights of your classmates. To make things easier for you, they all line up from shortest to tallest. You write down the following values, in inches:
\[ \begin{align} & 64, 65, 65, 65, 66, 66, 66, 66, 66, 66, 66, 67, 67, 67, \\ &67, 67, 67, 68, 68, 68, 68,69, 69, 69, 70, 70, 71, 72.\end{align}\]
You will use these values to address the different displays of quantitative data.
Histogram
A histogram is mostly like a bar chart. Both use bars! The difference is that the bars of the histogram are next to each other, and usually, they are all the same color.
To draw a histogram, you need to choose how to divide the range of the data. In your height example, it would be a good idea to display it in intervals of \(2\) inches. You will need to add together the frequencies accordingly and make another table.
Height Range | Frequency |
\[64 \leq h < 66\] | \[4\] |
\[ 66 \leq h < 68\] | \[13\] |
\[ 68 \leq h < 70\] | \[7\] |
\[70 \leq h < 72 \] | \[3\] |
\[ 72 \leq h < 74\] | \[1\] |
Table 5. Height frequency, statistical graphs.
Just like a bar chart, the height of each bar represents the frequency of each range of data.
Figure 6. Histogram of the heights of your classmates
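The grouping step behind Table 5 can be sketched in Python. The names `bin_width` and `lowest` are illustrative. The heights are the ones listed above, and each range includes its lower end but not its upper end.

```python
# Heights of your classmates, in inches.
heights = [
    64, 65, 65, 65, 66, 66, 66, 66, 66, 66, 66, 67, 67, 67,
    67, 67, 67, 68, 68, 68, 68, 69, 69, 69, 70, 70, 71, 72,
]

bin_width = 2  # each range covers 2 inches
lowest = 64    # lower end of the first range

# Map each height to the lower end of its range and count it there.
range_counts = {}
for h in heights:
    start = lowest + ((h - lowest) // bin_width) * bin_width
    range_counts[start] = range_counts.get(start, 0) + 1

for start in sorted(range_counts):
    print(f"{start} <= h < {start + bin_width}: {range_counts[start]}")
```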
Dot Plots
Dot plots are another simple way of displaying quantitative data. Think of a histogram, but rather than placing bars, you place a dot for each value within the respective range. The dots stack on top of each other (or to the right if you are drawing a horizontal dot plot) and make for an easy way of counting frequencies.
Figure 7. Dot plot of the height of your classmates
The above dot plot is drawn vertically, but please be aware that you might also find them drawn horizontally.
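Here is a minimal Python sketch that prints a vertical dot plot of the heights as plain text. It assumes one column per whole inch, which may differ from the grouping used in the figure, and it uses `*` as the dot.

```python
from collections import Counter

# Heights of your classmates, in inches.
heights = [
    64, 65, 65, 65, 66, 66, 66, 66, 66, 66, 66, 67, 67, 67,
    67, 67, 67, 68, 68, 68, 68, 69, 69, 69, 70, 70, 71, 72,
]

counts = Counter(heights)  # how many dots each column needs
values = range(min(heights), max(heights) + 1)
tallest_stack = max(counts.values())

rows = []
# Build the plot from the top row down to the bottom row of dots.
for level in range(tallest_stack, 0, -1):
    cells = ("*  " if counts[v] >= level else "   " for v in values)
    rows.append("".join(cells).rstrip())
# The last row holds the axis labels.
rows.append("".join(f"{v:<3}" for v in values).rstrip())

print("\n".join(rows))
```

Counting the dots in a column gives the frequency of that height directly.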
Interpretation of Statistical Graphs
As mentioned before, statistical graphs are useful because you can interpret the data depending on how it is distributed. Take for instance the segmented bar chart of the favorite flavors of ice cream of your neighbors.
Figure 8. Segmented bar chart of the favorite flavors of ice cream of two neighborhoods
From here you can easily see that, regardless of which of the two neighborhoods you are in, the most popular ice cream flavors are chocolate and vanilla, followed by cookie dough and strawberry. This suggests that your friends should work first on getting a good recipe for chocolate and vanilla!
Now consider the histogram of your classmates' heights.
Figure 9. Histogram of the heights of your classmates
You can note that the largest group of your classmates (\(13\) out of \(28\)) is between \(66\) and \( 68\) inches tall, while there are just a few that are much taller or shorter. This suggests that most of the data is clustered around the mean with just a few outliers, which is a central topic in statistics.
For more information about this, check out our article about Normal Distribution!
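As a quick numerical check of that reading of the histogram, the following Python sketch computes the mean and the median of the heights with the standard `statistics` module. The variable names are illustrative.

```python
from statistics import mean, median

# Heights of your classmates, in inches.
heights = [
    64, 65, 65, 65, 66, 66, 66, 66, 66, 66, 66, 67, 67, 67,
    67, 67, 67, 68, 68, 68, 68, 69, 69, 69, 70, 70, 71, 72,
]

average_height = mean(heights)
middle_height = median(heights)

print(f"Mean:   {average_height:.2f} inches")
print(f"Median: {middle_height:.2f} inches")
```

Both values land inside the \(66\) to \(68\) inch range, which is also the tallest bar of the histogram.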
More Examples of Statistical Graphs
Here you can take a look at more examples of statistical graphs. Let's start with categorical data.
While you were asking about the heights of your classmates you also thought about asking about their favorite sport. Here are the results of that survey.
Favorite Sport | Frequency |
Football | \[7\] |
Soccer | \[5\] |
Basketball | \[10\] |
Baseball | \[6\] |
Other | \[2\] |
Table 6. Favorite sport and frequency, statistical graphs.
You now need a nice way of displaying this data.
a. Make a bar chart of the data.
b. Make a pie chart of the data.
Solutions:
a. To make a bar chart you just need to draw a bar for each category you have in your data. The height of each bar will correspond to the frequency of each category.
Figure 10. Bar chart of the sport preferences of your classmates
b. To make a pie chart you will need to make a relative frequency table. You can find the relative frequency of each category by dividing the respective frequency by the total number of inquiries and then multiplying by \(100\).
Favorite Sport | Frequency | Relative Frequency |
Football | \[7\] | \[ 23.3 \% \] |
Soccer | \[5\] | \[ 16.7 \% \] |
Basketball | \[10\] | \[ 33.3 \% \] |
Baseball | \[6\] | \[ 20.0 \% \] |
Other | \[2\] | \[6.7 \% \] |
Table 7. Favorite sport, frequency and relative frequency, statistical graphs.
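As a worked instance of the first row, note that the survey has \(7 + 5 + 10 + 6 + 2 = 30\) inquiries in total, so the relative frequency of football is

\[ \frac{7}{30} \times 100 \approx 23.3 \% \]

The other rows follow from the same calculation, rounded to one decimal place.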
This way you can know how big the slices of the pie are! Here is the graph.
Figure 11. Pie chart of the sport preferences of your classmates
How about some graphs displaying quantitative data?
While working in a gift shop, a friend of yours asks if you could tell him more or less how much money he should spend on a souvenir for his mother.
In order to give an adequate answer, you decide to do some statistics! You go into the database of the shop and arrange the prices of the souvenirs from cheapest to most expensive. To simplify things, the prices are rounded to the nearest \(50\) cents.
\[ \begin{align} &0.5, 0.5, 1, 1, 1, 1.5, 2, 2, 2, 2, 2, 2, 2.5, 2.5, 3, 3, 3, 3, 3.5, \\ &4, 5, 5, 5, 5, 5, 5, 5, 5, 5.5, 6, 7, 7.5, 8.5, 9, 9.5, 10, 10, 10 \end{align}\]
a. Make a histogram of this data.
b. Make a dot plot of this data.
Solution:
a. To make the histogram you first need to choose an appropriate range to group the data. You can divide this into whole dollars. The first bar will represent all the souvenirs that cost less than \(1\) dollar, the second bar will be the one that pictures souvenirs that cost \(1\) dollar or more, but less than \(2\) dollars, and so on.
Figure 12. Histogram of the prices of souvenirs in a gift shop
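The grouping into whole dollars can be sketched in Python by rounding each price down. The variable names are illustrative, and the prices are the ones listed above, written as decimals.

```python
from collections import Counter
from math import floor

# Prices of the souvenirs, in dollars.
prices = [
    0.5, 0.5, 1.0, 1.0, 1.0, 1.5, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.5,
    2.5, 3.0, 3.0, 3.0, 3.0, 3.5, 4.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0,
    5.0, 5.0, 5.5, 6.0, 7.0, 7.5, 8.5, 9.0, 9.5, 10.0, 10.0, 10.0,
]

# floor(p) is the lower end of the one-dollar range that contains p.
dollar_counts = Counter(floor(p) for p in prices)

for dollars in sorted(dollar_counts):
    print(f"{dollars} <= price < {dollars + 1}: {dollar_counts[dollars]}")
```

With this grouping, the souvenirs that cost exactly \(10\) dollars fall into their own range at the right end of the histogram.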
b. This one is a simpler task because you do not need to group the prices into ranges. Here you just need to draw one dot for each souvenir above its price, stacking the dots for souvenirs that share the same price.
Figure 13. Dot plot of the prices of souvenirs in a gift shop
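Here is a minimal Python sketch of the same idea as a horizontal text dot plot, with one row per price and one `*` per souvenir. The layout is a choice made for this example and is not meant to reproduce the figure exactly.

```python
from collections import Counter

# Prices of the souvenirs, in dollars.
prices = [
    0.5, 0.5, 1.0, 1.0, 1.0, 1.5, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.5,
    2.5, 3.0, 3.0, 3.0, 3.0, 3.5, 4.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0,
    5.0, 5.0, 5.5, 6.0, 7.0, 7.5, 8.5, 9.0, 9.5, 10.0, 10.0, 10.0,
]

# No grouping into ranges: every distinct price gets its own row of dots.
dots_per_price = Counter(prices)

dot_plot_lines = [
    f"{price:>4.1f} | {'*' * count}"
    for price, count in sorted(dots_per_price.items())
]

print("\n".join(dot_plot_lines))
```

The longest rows show the most common prices, which here are \(5\) dollars and \(2\) dollars.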
Statistical Graphs - Key takeaways
- A statistical graph is a graph that organizes data, allowing a clearer visualization.
- Statistical graphs:
- Reveal hidden patterns and relationships that you cannot identify by just looking at the raw data.
- Identify the most significant features of your data.
- Communicate the data in a simpler way.
- Both categorical and quantitative data can be displayed using statistical graphs.
- Categorical data is commonly displayed using bar charts, pie charts, and segmented bar charts.
- Quantitative data is usually displayed using histograms and dot plots.
- A bar chart consists of bars of different heights representing the categorical data of your survey. The height of the bar corresponds to the frequency of each category.
- A pie chart consists of a circle divided into sectors. The area of each sector corresponds to the relative frequency of each category.
- Segmented bar charts are used to compare two sets of categorical data. These consist of two or more bars, where each bar consists of smaller bars stacked on top of each other according to the relative frequency of each category.
- Histograms are like bar charts, but the bars are adjacent and usually all of the same color. These are used to represent quantitative data divided into ranges.
- Dot plots place dots instead of bars. One dot is drawn for each value, and dots are stacked on top of each other when they fall within the same range.