Combining Random Variables


You have probably seen the tags on items you have purchased that say "inspected by".  Sometimes, like in car production, an item can be inspected by multiple people over the course of putting it together.  If you know the average time it takes for each inspector to check the car, and the standard deviation for each inspector, how do you figure out the total inspection time for a random car?  That is an application of combining random variables!

StudySmarter Editorial Team

Team Combining Random Variables Teachers

  • 18 minutes reading time
  • Checked by StudySmarter Editorial Team
  • Fact Checked Content
  • Last Updated: 08.01.2023
  • Content creation process designed by Lily Hulatt
  • Content cross-checked and quality checked by Gabriel Freitas

    Combining and Transforming Random Variables

    As you have already seen, many people inspect things like cars before they are sold. Each individual inspector has a mean inspection time, and a variance associated with their inspection time. The random variable in this case is each inspector's inspection time, and what you are looking for is the total inspection time, whose mean is the sum of the individual expected inspection times.

    If multiple random events occur which are associated with an outcome, you may want to add them to form a new distribution. The new distribution in the car example would be the total inspection time for the car.

    Combining random variables means transforming two or more random variables into one.

    On the other hand, transforming random variables involves scaling and shifting them. This would happen if you were playing a game multiple times and trying to figure out how much your total wins and losses might be. See the article Transforming Random Variables for more details and examples on that.

    One very important thing to check before you combine random variables is that they are independent, or at least that it is reasonable for you to assume that they are independent.

    Suppose you have \(3\) people who inspect a cell phone before it gets shipped off from the factory. If no two people ever inspect the phone at the same time, could you combine the random variable of their inspection times to get a new random variable for the total inspection time?

    Answer:

    Because no inspector ever interacts with a cell phone at the same time as another inspector, it is reasonable to assume that their inspections do not affect each other. That would mean that their inspection times are independent, and you can combine the random variables.

    What about in the next example?

    Suppose that your first random variable is how many hours a randomly chosen person slept yesterday, and your second random variable is how many hours that same person was awake. Can you combine those random variables?

    Answer:

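    No, not with the rules in this article. How many hours a person is awake depends on how many hours they slept, so these are not independent random variables. In fact, the two always add up to \(24\) hours, so their sum does not vary at all, and adding their variances would give the wrong answer.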
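    A small simulation can make the problem with dependent variables concrete. The sketch below uses made-up data (hours asleep drawn uniformly between \(5\) and \(9\), an assumption chosen purely for illustration) and compares the variance of the sum with the sum of the variances.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical data: hours asleep for 1,000 people, drawn uniformly from 5 to 9.
asleep = [random.uniform(5, 9) for _ in range(1000)]
# Hours awake is fully determined by hours asleep, so the two are dependent.
awake = [24 - a for a in asleep]
total = [a + w for a, w in zip(asleep, awake)]

var_asleep = statistics.pvariance(asleep)
var_awake = statistics.pvariance(awake)
var_total = statistics.pvariance(total)

print(f"sum of the two variances: {var_asleep + var_awake:.3f}")
print(f"variance of the sum:      {var_total:.3f}")
```

    The total is always \(24\) hours, so its variance is essentially zero even though each variable on its own has a clearly positive variance.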

    The notation \(T = X + Y\) can be confusing. Are you really just adding things together? Let's take a look at an example.

    Let's think about two people inspecting a cell phone, and they do separate inspections. The company keeps track of how long each person takes to do an inspection. Then you can set up:

    • \(X\) is the set of times for the first person to inspect a phone; and
    • \(Y\) is the set of times for the second person to inspect a phone.

    Rather than looking at each person inspecting a phone individually, the company wants to get an idea of the total time it takes to inspect a phone. So in this example, combining the random variables \(X\) and \(Y\) means making a random variable \(T\) with \(T = X + Y\) where you are actually adding the times in \(X\) to the times in \(Y\) to get a total time.

    Suppose the company took \(20\) measurements of each inspector, and graphed them in the histograms below.

    Fig. 1. Inspector #1 times, random variable \(X\)

    Fig. 2. Inspector #2 times, random variable \(Y\)

    It can help to look at the range of times of \(T\). If the range of times in \(X\) is \(6\) minutes to \(8\) minutes, and the range of times in \(Y\) is \(4\) minutes to \(5\) minutes, then the range of \(T = X + Y\) is \(6+4 =10\) minutes to \(8+5=13\) minutes.

    Fig. 3. Combined inspection times, random variable \(T = X + Y\)

    The mean for Inspector #1 is \(7.1\) minutes, and the mean for Inspector #2 is \(4.6\) minutes. Their times are then combined into a new random variable, \(T\), whose histogram is shown in Fig. 3.

    Notice that the times in that histogram range between \(10\) and \(13\) minutes. The mean for the combined histogram is \(11.7\) minutes, which is exactly the sum of the two individual means, \(7.1 + 4.6\).
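    To see what "adding the times" looks like in practice, here is a minimal sketch. The five pairs of times are hypothetical values picked to fall inside the ranges above; they are not the data behind the histograms.

```python
import statistics

# Hypothetical inspection times in minutes (NOT the data behind the histograms).
x_times = [6.2, 7.5, 7.1, 6.8, 7.9]   # Inspector #1
y_times = [4.1, 4.8, 4.6, 4.9, 4.4]   # Inspector #2

# T = X + Y: add the two inspectors' times phone by phone.
t_times = [x + y for x, y in zip(x_times, y_times)]

mean_x = statistics.mean(x_times)
mean_y = statistics.mean(y_times)
mean_t = statistics.mean(t_times)

print(t_times)
print(mean_x + mean_y, mean_t)
```

    Up to floating-point rounding, the two printed means agree, and every total lands between \(10\) and \(13\) minutes.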

    How does combining random variables affect the mean?

    Combining Random Variables, the Mean

    While you can combine more than two random variables as long as they are independent, for simplicity's sake the rest of this article concentrates on combining just two of them.

    Suppose \(X\) and \(Y\) are two random variables that are independent. For the mean of \(X\) write \(\mu_X\), and for the mean of \(Y\) write \(\mu_Y\). How do you combine their means?

    • The mean of the sum of two random variables is the sum of their means. In other words, if \(T = X + Y\) then\[ \mu_T = \mu_X + \mu_Y.\]

    • If you take the difference of two random variables, then the mean of the difference is the difference of their means. So if \(T = X - Y\), then\[ \mu_T = \mu_X - \mu_Y.\]

    Just like in regular subtraction, the order makes a difference. Let's look at a couple of examples.

    Jake and Anna work in the same store, but in different departments. Jake expects to sell an average of \(5\) shirts per day and Anna expects to sell an average of \(3\). What is the total expected average number of shirts sold in the store per day?

    Answer:

    Let \(X\) be the random variable representing how many shirts Jake sells, and \(Y\) be the random variable representing Anna's sales. You would hope these are independent random variables! Call \(T\) the random variable of the total sales in the store, so \(T = X + Y\).

    From the problem statement,

    \[ \mu_X = 5 \text{ and } \mu_Y = 3.\]

    Therefore, they can expect to sell \[ \begin{align} \mu_T &= \mu_X + \mu_Y \\ &= 5 + 3 \\ &= 8, \end{align}\]

    or in other words a total of \(8\) shirts.

    What if you are asked about how many more shirts Jake would expect to sell?

    Jake and Anna work in the same store, but in different departments. Jake expects to sell an average of \(5\) shirts per day and Anna expects to sell an average of \(3\). How many more shirts can Jake expect to sell per day?

    Solution:

    Just like before, let \(X\) be the random variable representing how many shirts Jake sells, and \(Y\) be the random variable representing Anna's sales, where you are reasonably assuming they are independent. Call \(T\) the random variable of the difference between Jake and Anna's sales in the store. Then since \(T = X - Y\),

    \[ \begin{align} \mu_T &= \mu_X - \mu_Y \\ &= 5 - 3 \\ &= 2. \end{align}\]

    So Jake can expect to sell \(2\) more shirts than Anna.

    Suppose you had looked at the difference between Anna and Jake's sales instead? Then you would have found a mean of \(-2\)! That can happen, and you need to look at the actual combined distribution to figure out what it implies in real life. If you find a negative number when looking at the difference in the sales, it just implies that in general Anna sells fewer shirts than Jake does.

    Combining Random Variables, Standard Deviation

    Just like with the mean, combining the variance of two independent random variables is a matter of addition. Suppose \(X\) and \(Y\) are two random variables that are independent. For the standard deviation of \(X\) write \(\sigma_X\), and for the standard deviation of \(Y\) write \(\sigma_Y\). Then:

    • The variance of the sum of two random variables is the sum of their variances. In other words, if \(T = X + Y\) then\[ \sigma^2_T = \sigma^2_X + \sigma^2_Y.\]

    • If you take the difference of two random variables, then the variance of the difference is the sum of their variances. So if \(T = X - Y\), then\[ \sigma^2_T = \sigma^2_X + \sigma^2_Y.\]

    Wait a minute, that second part doesn't look right! Why is it that when you subtract two distributions you aren't subtracting their variances? It is because the variance is a measure of how spread apart the distribution is. So if you combine two distributions, the new one is going to have a larger spread than either of the two original ones.
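    One way to see this is to write the difference as a sum, \(X - Y = X + (-1)Y\), and use the standard scaling rule \(\sigma^2_{aY} = a^2 \sigma^2_Y\). That rule belongs to transforming random variables and is assumed here rather than derived. Then, for independent \(X\) and \(Y\),

    \[ \begin{align} \sigma^2_{X-Y} &= \sigma^2_X + \sigma^2_{(-1)Y} \\ &= \sigma^2_X + (-1)^2 \sigma^2_Y \\ &= \sigma^2_X + \sigma^2_Y. \end{align}\]

    The minus sign is squared away, which is why the variances still add.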

    Does this imply that you can combine the standard deviation of two independent random variables with addition as well? Absolutely not! Remember that the standard deviation is the square root of the variance, and that

    \[ \sqrt{a + b} \ne \sqrt{a} + \sqrt{b}.\]

    So the standard deviations cannot be added in the same way that the variance can be.

    Let's look at an example to show how it works.

    Jake and Anna work in the same store, but in different departments. Jake expects to sell an average of \(5\) shirts per day and Anna expects to sell an average of \(3\). However, Jake has a standard deviation in his sales of \(1\) shirt, while Anna has a standard deviation of \(4\) shirts. Is the standard deviation of their combined shirt totals the same as the sum of the standard deviation of their individual totals?

    Solution:

    Setting up some variables:

    • \(X\) is the random variable of the number of shirts Jake sells;
    • \(Y\) is the random variable of the number of shirts Anna sells; and
    • \(T\) is the random variable of the number of shirts they sell combined.

    As you have already seen, \(\mu_T = 8\). What about the variance and standard deviation? From the statement of the problem, their individual standard deviations are

    \[ \sigma_X = 1 \mbox{ and } \sigma_Y = 4.\]

    Then for the variance,

    \[ \begin{align} \sigma^2_T &= \sigma^2_X + \sigma^2_Y \\ &= 1^2 + 4^2 \\ &= 17, \end{align} \]

    but

    \[ \sigma_T = \sqrt{17} \approx 4.1\]

    which is not the same as

    \[ \sigma_X + \sigma_Y = 1 + 4 = 5.\]

    In fact,

    \[ \sigma_T < \sigma_X + \sigma_Y.\]

    So while the mean of their combined sales is simply the sum of their individual means, the standard deviation of their combined sales (about \(4.1\) shirts) is smaller than the sum of their individual standard deviations (\(5\) shirts).
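    The rules for the mean and the variance can be wrapped in a small helper. This is only a sketch: the function name `combine_independent` and its `difference` flag are made up for illustration, and the function assumes the two variables are independent.

```python
import math

def combine_independent(mu_x, sigma_x, mu_y, sigma_y, difference=False):
    """Mean and standard deviation of X + Y (or X - Y) for independent X and Y."""
    mu_t = mu_x - mu_y if difference else mu_x + mu_y
    # Variances add for both the sum and the difference.
    var_t = sigma_x ** 2 + sigma_y ** 2
    return mu_t, math.sqrt(var_t)

# Jake: mean 5, sd 1. Anna: mean 3, sd 4.
mu_sum, sd_sum = combine_independent(5, 1, 3, 4)
mu_diff, sd_diff = combine_independent(5, 1, 3, 4, difference=True)
print(mu_sum, round(sd_sum, 2))    # 8 4.12
print(mu_diff, round(sd_diff, 2))  # 2 4.12
```

    The standard deviation is the same for the sum and the difference, and in both cases it is below \(1 + 4 = 5\).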

    Combining Normal Random Variables

    In the examples you have looked at so far, it didn't make a difference if the random variables followed a normal distribution. The only thing that mattered is that they were independent random variables.

    When you have two independent continuous random variables, both of which follow a normal distribution, so does their sum or difference.

    Let's look at an example to illustrate this.

    Suppose you have a business where you are making and delivering pizzas, where the time to make a pizza and the time to deliver it are both normally distributed, with

    • making the pizza has an average time of \(18\) minutes with a standard deviation of \(1.5\) minutes; and
    • delivering the pizzas has an average time of \(25\) minutes with a standard deviation of \(8\) minutes.

    (a) What is the probability that making and delivering a pizza takes more than an hour?

    (b) What percentage of the pizzas take longer to make than to deliver?

    Solution:

    (a) In this part of the question you are looking for the total time, in other words, the sum of two normally distributed independent random variables. First, let's define the random variables:

    • \(X\) is the random variable for the time it takes to make a pizza;
    • \(Y\) is the random variable for the time it takes to deliver a pizza; and
    • \(T\) is the random variable for the total time to make and deliver a pizza.

    You are told that both of the random variables are normal, and you would expect that making the pizza and delivering the pizza are independent of each other. So \(T\) is also normally distributed, with \(T = X + Y\).

    The average time to make and deliver a pizza would be

    \[ \begin{align} \mu_T &= \mu_X + \mu_Y \\ &= 18 + 25 \\ &= 43 \, min. \end{align}\]

    Since the times are independent,

    \[ \begin{align} \sigma^2_T &= \sigma^2_X + \sigma^2_Y \\ &= 1.5^2 + 8^2 \\ &= 66.25,\end{align} \]

    so

    \[ \sigma_T = \sqrt{66.25} \approx 8.1 \, min.\]

    In other words, \(T\) is a normal distribution with mean \(43\) and standard deviation \(8.1\).

    You want to know the probability that making and delivering a pizza takes more than an hour. The graph below shows the normal distribution for the total time, and the shaded region represents the time over \(60\) minutes.

    Fig. 4. Normal distribution showing time longer than an hour

    Then the \(z\)-score associated with \(60\) minutes is

    \[ z = \frac{60-43}{8.1} = 2.099\]

    which, using a standard normal table, gives the probability of taking more than \(60\) minutes as

    \[ P(T>60) = P(z>2.099) = 0.0179.\]

    In other words, there is only a \(1.79\%\) chance that a pizza will take longer than an hour to make and deliver!
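    If you would rather not use a table, the same tail probability can be computed directly. The sketch below uses the complementary error function from Python's standard library; the helper name `normal_tail` is made up for illustration. It keeps \(\sigma_T\) unrounded, so the result (about \(0.018\)) differs slightly from the table value, which was found with \(\sigma_T\) rounded to \(8.1\).

```python
import math

def normal_tail(x, mu, sigma):
    """P(N(mu, sigma) > x), via the complementary error function."""
    z = (x - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))

mu_t = 18 + 25                           # mean total time in minutes
sigma_t = math.sqrt(1.5 ** 2 + 8 ** 2)   # about 8.14, not rounded to 8.1
p_over_hour = normal_tail(60, mu_t, sigma_t)
print(f"{p_over_hour:.4f}")
```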

    (b) Next you want to know the percentage of the pizzas that take longer to make than to deliver. This time you want to know about the difference between \(X\) and \(Y\), so you need a new random variable, call it \(D\), to represent this. In other words \(D = X - Y\). It is still true that both \(X\) and \(Y\) are independent random variables that follow a normal distribution.

    The average time difference between making and delivering a pizza would be

    \[ \begin{align} \mu_D &= \mu_X - \mu_Y \\ &= 18 - 25 \\ &= -7 \, min. \end{align}\]

    Since the times are independent,

    \[ \begin{align} \sigma^2_D &= \sigma^2_X + \sigma^2_Y \\ &= 1.5^2 + 8^2 \\ &= 66.25,\end{align} \]

    so

    \[ \sigma_D = \sqrt{66.25} \approx 8.1 \, min.\]

    In other words, \(D\) is a normal distribution with mean \(-7\) and standard deviation \(8.1\). A pizza takes longer to make than to deliver exactly when \(D > 0\), so what you want to find is \(P(D>0)\). In the graph below, the shaded region represents when the pizza takes longer to make than to deliver.

    Fig. 5. Normal distribution showing time greater than 0

    Then the \(z\)-score associated with \(0\) minutes is

    \[ z = \frac{0-(-7)}{8.1} \approx 0.864\]

    which, using a standard normal table, gives the probability that \(D\) is greater than \(0\) as

    \[ P(D>0) = P(z>0.864) \approx 0.194.\]

    In other words, about \(19\%\) of the time, the pizza will take longer to make than to deliver.
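    A quick simulation is one way to sanity-check a result like this. The sketch below draws independent making and delivery times from the two normal distributions in the problem, for which the mean difference is \(18 - 25 = -7\) minutes, and counts how often making takes longer. The seed and the number of trials are arbitrary choices, and the fraction it prints is an estimate rather than an exact probability.

```python
import random

random.seed(1)   # arbitrary seed, only so the run is repeatable
n = 200_000      # number of simulated pizzas

longer_to_make = 0
for _ in range(n):
    make = random.gauss(18, 1.5)    # making time in minutes
    deliver = random.gauss(25, 8)   # delivery time in minutes
    if make > deliver:
        longer_to_make += 1

fraction = longer_to_make / n
print(f"{fraction:.3f}")
```

    With this many trials the estimate should land close to \(0.19\).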

    More examples are always good!

    Examples of Combining Random Variables

    Let's take a look at some more examples.

    Suppose you have two inspectors working for you. If either of them inspects an item, it takes an average of \(5.8\) minutes to do the inspection, with a standard deviation of \(8\) minutes. However, if both of them work together to inspect the same item, it takes an average of \(11.6\) minutes with a standard deviation of \(17\) minutes. Is it better for you to have the inspectors working separately or together?

    Solution:

    First, let's give the variables some names:

    • \(X\) is the variable for inspector A;
    • \(Y\) is the variable for inspector B; and
    • \(T\) is the variable for their combined times.

    Then \(T = X + Y\), so

    \[ \begin{align} \mu_T &= \mu_X + \mu_Y \\ &= 5.8 + 5.8 \\ &= 11.6 \, min. \end{align}\]

    That means it doesn't matter whether they work together or separately: in either case, their average time is going to be \(11.6\) minutes.

    In order for you to look at their combined variances, you need to know that they are independent variables. So for the rest of this example, you will need to assume that two people can inspect an item at the same time without interfering with each other, making them independent variables. Then the variance is

    \[\begin{align} \sigma^2_T &= \sigma^2_X + \sigma^2_Y \\ &= 8^2 + 8^2 \\ &= 128, \end{align} \]

    and the standard deviation is

    \[ \begin{align} \sigma_T &= \sqrt{ \sigma^2_T} \\ & = \sqrt{128} \\ &\approx 11.3 \, min. \end{align} \]

    So when the two inspectors work separately, they have a much smaller variation in their inspection time.

    What does that mean in terms of you having them work together or separately? Given that their mean inspection time is the same either way, it pays you to choose the option that gives you the least variation in inspection times. That means you want the two inspectors working separately since when they work together their standard deviation is \(17\) minutes rather than \(11.3\) minutes when they work separately.

    Let's look at one involving toys.

    A local shop sells toy cars. The probability of selling between \(0\) and \(5\) toy cars is given in the table below.

    Number of Cars    Probability
    \(0\)             \(0.03\)
    \(1\)             \(0.16\)
    \(2\)             \(0.30\)
    \(3\)             \(0.23\)
    \(4\)             \(0.17\)
    \(5\)             \(0.11\)

    Table 1. Probability of selling.

    Assume that the sale of toy cars is independent from day to day.

    (a) Find the mean and standard deviation for the number of toy cars the shop sells in a day.

    (b) If the shop is open \(5\) days a week, how many toy cars can the shop expect to sell, and what is the standard deviation?

    Solution:

    (a) First let's set up some variables. Here, \(X\) is the random variable representing the number of toy cars the shop sells in a day, with \(x_i\) being the number of cars sold with probability \(p_i\). So

    \[ \begin{align} \mu_X &= \sum\limits_{i=0}^5 x_i p_i \\ &= 0(0.03) + 1(0.16) + 2(0.30) + 3(0.23) + 4(0.17) + 5(0.11) \\ &= 2.68. \end{align}\]

    For a discrete random variable, each squared deviation from the mean is weighted by its probability, so the variance is given by

    \[ \begin{align} \sigma^2_X &= \sum\limits_{i=0}^5 (x_i - \mu_X)^2 p_i \\ &= (0-2.68)^2(0.03)+(1-2.68)^2(0.16)+(2-2.68)^2(0.30) \\ &\quad +(3-2.68)^2(0.23)+(4-2.68)^2(0.17)+(5-2.68)^2(0.11) \\ & \approx 1.72, \end{align} \]

    and the standard deviation is given by

    \[ \begin{align} \sigma_X &= \sqrt{1.72} \\ & \approx 1.31. \end{align} \]

    (b) Hopefully sales of toy cars on one day don't affect sales on another day, so you can assume that the daily numbers of toy cars sold are independent. In addition, the number of toy cars the shop expects to sell doesn't change from day to day. Then for a week, the shop can expect:

    \[\begin{align} \text{weekly expected car sale total} &= 5(\text{daily expected car sale total}) \\ &= 5(2.68) \\ &= 13.4, \end{align} \]

    so about \(13\) toy cars sold in a week.

    Remember that you can't just add to get the standard deviation! Instead, you must find the variance for the week, then take the square root. The variance for the weekly toy car sales is additive, so

    \[ \begin{align} \text{variance for weekly car sales} &= 5(1.72) \\ &= 8.6 \end{align}\]

    which gives you

    \[ \begin{align} \text{standard deviation for weekly car sales} &= \sqrt{8.6} \\ & \approx 2.93 . \end{align}\]
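    The toy car calculation can be checked with a few lines of code. This is a minimal sketch that takes the probabilities from Table 1 and uses the probability-weighted variance of a discrete random variable, \(\sigma^2_X = \sum (x_i - \mu_X)^2 p_i\). The variable names are chosen for illustration, and the five days are assumed to be independent with the same distribution.

```python
import math

# Table 1: number of toy cars sold in a day -> probability.
pmf = {0: 0.03, 1: 0.16, 2: 0.30, 3: 0.23, 4: 0.17, 5: 0.11}

mean_day = sum(x * p for x, p in pmf.items())
# Probability-weighted squared deviations from the mean.
var_day = sum((x - mean_day) ** 2 * p for x, p in pmf.items())
sd_day = math.sqrt(var_day)

days = 5  # independent days with the same distribution
mean_week = days * mean_day
var_week = days * var_day      # variances add across independent days
sd_week = math.sqrt(var_week)  # NOT days * sd_day

print(f"daily:  mean {mean_day:.2f}, variance {var_day:.2f}, sd {sd_day:.2f}")
print(f"weekly: mean {mean_week:.2f}, variance {var_week:.2f}, sd {sd_week:.2f}")
```

    Note that the weekly standard deviation is \(\sqrt{5}\) times the daily one, not \(5\) times.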

    Combining Random Variables - Key takeaways

    • Combining random variables means transforming two or more random variables into one.
    • Only combine random variables that are independent!
      • The mean of the sum of two random variables is the sum of their means. In other words, if \(T = X + Y\) then\[ \mu_T = \mu_X + \mu_Y.\]
      • If you take the difference of two random variables, then the mean of the difference is the difference of their means. So if \(T = X - Y\), then\[ \mu_T = \mu_X - \mu_Y.\]
      • The variance of the sum of two random variables is the sum of their variances. In other words, if \(T = X + Y\) then\[ \sigma^2_T = \sigma^2_X + \sigma^2_Y.\]
      • If you take the difference of two random variables, then the variance of the difference is the sum of their variances. So if \(T = X - Y\), then\[ \sigma^2_T = \sigma^2_X + \sigma^2_Y.\]
    • The sum and difference formulas do not work for the standard deviation!
    Frequently Asked Questions about Combining Random Variables

    What happens if two independent normal random variables are combined?

    The sum of two independent normally distributed random variables is itself normal, with the mean equal to the sum of the two means and the variance equal to the sum of the two variances (that is, the square of the combined standard deviation is the sum of the squares of the individual standard deviations). This lets you estimate the overall result of two experiments whose variables are normally distributed.

    Why do we combine random variables?

    Combining random variables allows us to create new distributions. We can find the mean and standard deviation of the new distribution if we have the mean and standard deviation of the source distributions. 

    How do you combine variables that are random?

    To combine independent random variables, the following steps should be followed:

    Step 1: Give the random variables meaningful names, such as \(X\) and \(Y\).

    Step 2: Identify their means, \(\mu_X\) and \(\mu_Y\), and their standard deviations, \(\sigma_X\) and \(\sigma_Y\).

    Step 3: Calculate the mean of their sum or difference by adding or subtracting the means.

    Step 4: Square the standard deviations to get the variances, add the variances, then take the square root of the result to obtain the combined standard deviation.

    What does Combining Random Variables mean?

    In simple terms, combining random variables means transforming two or more random variables into one, for example by adding or subtracting them. The rules in this article assume that the random variables are independent.
