At first, I was confident because the game looked easy, and I was sure I could grab the bear with no problem at all. The thing is that every single time the claw went loose and dropped my bear! After some weeks, I burst into tears when I was finally able to claim my prize, which I still treasure in my bedroom.
You might be wondering how this relates to probability distributions. Turns out that the claw machines are built so the prize is rarely obtained, no matter how precise your inputs are. In my stuffed bear predicament, I was doing a trial every Sunday until I got a success. In this context, the number of trials that I made until I got my success is represented by a random variable with geometric distribution.
Geometric Distribution Definition
When talking about probability distributions, you need a clear grasp of which random variable you are dealing with. Just like in the stuffed bear example, where I was counting how many times I had to play the claw machine, in a geometric distribution you count how many trials you perform until you obtain a success. It is assumed that each trial is a Bernoulli trial.
Remember that a Bernoulli trial only has two outcomes: success or failure.
It is time to properly define the geometric distribution.
The geometric distribution, also known as the geometric probability model, is a discrete probability distribution where the random variable \( X\) counts the number of trials performed until a success is obtained.
Since the smallest number of trials required to obtain a success is \(1\), the random variable \(X\) can take the values
\[ X=1,2,3, \dots\]
The geometric distribution has only one parameter, which is the probability \(p\) of success. A geometric distribution with probability \(p\) is usually denoted
\[\text{Geom}(p),\]
or sometimes it is written as
\[ G(p).\]
In my stuffed bear example, the random variable \(X\) counted how many times I performed the trial of playing the claw machine until I got my hands on the bear. The probability of success, \(p\), was not known to me, but in most cases you will be given this value.
An experiment needs to satisfy the following requirements in order to fit a geometric model:
There are only two possible outcomes for each trial, success or failure. For example, the first trial could either be a success or a failure, just like all subsequent trials. It is worth noting that the experiment stops once you get a success.
The trials are independent of each other. For example, if the second trial is a failure this will not affect the next trial, or any subsequent trials, in any way.
The success probability remains unchanged trial after trial. This means the probability of success for the first trial is the same for all subsequent trials. For example, if \(p = 0.4\) then the probability of success of the first trial is \(0.4\), the probability of success of the second trial is \(0.4\) as well, and so on.
It is worth noting that if \(p<1\), it is in theory possible that you never obtain a success, even after a large number of trials. This is easier to picture if \(p\) is a very small number.
Suppose you buy a lottery ticket every month. The chances of actually winning the lottery are astronomically small, so it is most likely that you will never win the prize. How sad!
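To put a number on the lottery example, here is a minimal Python sketch. The winning probability `p` and the number of tickets are hypothetical values chosen only for illustration; they do not come from any real lottery. The sketch uses the fact that failing \(n\) independent trials in a row has probability \((1-p)^n\).

```python
p = 1e-7              # hypothetical chance of winning with one ticket
tickets = 12 * 50     # one ticket a month for 50 years

# every one of the trials must fail, each with probability 1 - p
prob_never_winning = (1 - p) ** tickets
print(prob_never_winning)
```

Under these assumed values the result is very close to \(1\), which matches the intuition that you will most likely never win.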
Formulas Used in the Geometric Distribution
Usually, when you are given a geometric distribution, you will also be given some formulas to find certain values of interest.
Probability Mass Function
Since in a geometric distribution you are counting how many trials you take until getting a success, a natural question that arises is: What is the probability of getting the success in exactly \( x\) trials? This can be found by noting that, if you underwent \(x\) trials until you got the success, then you had \(x-1\) failures, so
\[ P(X=x) = (1-p)^{x-1}p,\]
where \(p\) is the probability of success, and \(1-p\) is the probability of failure. You might also find this formula written as
\[ P(X=x) = q^{x-1}p,\]
where \(q=1-p\).
Figure 1. Graph of the probability mass function of the geometric distribution
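As a minimal sketch, the probability mass function can be written as a short Python function. The name `geometric_pmf` and the value \(p = 0.4\) are illustrative choices, not part of any particular library.

```python
def geometric_pmf(x: int, p: float) -> float:
    """Probability that the first success occurs on trial x."""
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    if x < 1:
        return 0.0  # X only takes the values 1, 2, 3, ...
    # x - 1 failures followed by one success
    return (1 - p) ** (x - 1) * p


# Probabilities for the first few trials with an assumed p = 0.4
for x in range(1, 6):
    print(x, round(geometric_pmf(x, 0.4), 4))
```

Each printed probability is \(1-p\) times the previous one, which is why the graph of the probability mass function decays so quickly.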
Cumulative Distribution Function
In practice, you often care less about succeeding on one exact trial and more about succeeding within a given number of trials. The cumulative distribution function of the geometric distribution tells you the probability of getting a success in \(k\) trials or fewer. For the geometric distribution, this is given by
\[P(X\leq k) = 1-(1-p)^k.\]
Think of the stuffed bear example. Suppose you go to the claw machine with five spare quarters. The cumulative distribution function will tell you the probability of having at least one success with those five quarters, that is
\[ P(X \leq 5) = 1-(1-p)^5.\]
Figure 2. Graph of the cumulative distribution function of the geometric distribution
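The closed form of the cumulative distribution function comes from the complement: needing more than \(k\) trials means the first \(k\) trials all failed, which has probability \((1-p)^k\). The sketch below checks this against a direct sum of the probability mass function. The function name `geometric_cdf` and the values of `p` and `k` are assumptions made for illustration.

```python
def geometric_cdf(k: int, p: float) -> float:
    """Probability that the first success occurs on or before trial k."""
    if k < 1:
        return 0.0
    # complement of "the first k trials all fail"
    return 1 - (1 - p) ** k


p = 0.3  # assumed value, for illustration only
k = 5

# adding up P(X = 1) + ... + P(X = k) should give the same number
pmf_sum = sum((1 - p) ** (x - 1) * p for x in range(1, k + 1))
print(geometric_cdf(k, p), pmf_sum)
```

The two printed numbers should agree up to floating-point rounding.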
Expected Value
The expected value (also known as the mean) of the geometric distribution tells you how many trials you will need, on average, until you get a success. It is given by
\[ \mu = \frac{1}{p}.\]
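One way to build intuition for this formula is to simulate the experiment many times and average the number of trials. The following is a rough sketch using only Python's standard `random` module; the helper name `trials_until_success`, the seed, the sample size, and \(p = 0.2\) are all assumptions chosen for illustration.

```python
import random


def trials_until_success(p: float, rng: random.Random) -> int:
    """Run Bernoulli trials until the first success; return the trial count."""
    count = 1
    while rng.random() >= p:  # a failure happens with probability 1 - p
        count += 1
    return count


rng = random.Random(0)  # fixed seed so the run is repeatable
p = 0.2                 # assumed value
samples = [trials_until_success(p, rng) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)
print(sample_mean)      # should be close to 1 / p = 5
```

The sample mean will not equal \(\frac{1}{p}\) exactly, but it should get closer as the number of simulated experiments grows.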
Standard Deviation
The standard deviation, in general, tells you how much a variable tends to spread around its expected value. For a geometric distribution with a small standard deviation, the number of trials tends to be close to the mean. It is given by
\[\sigma = \sqrt{\frac{1-p}{p^2}}.\]
Variance of the Geometric Distribution
Sometimes you will be asked to find the variance of an experiment modeled by a geometric distribution. Since the standard deviation is the square root of the variance, you can obtain the variance by squaring the standard deviation. That is, if the standard deviation is given by
\[ \sigma = \sqrt{\frac{1-p}{p^2}}\]
then, the variance is given by
\[ \sigma^2 = \frac{1-p}{p^2}.\]
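The three summary formulas fit naturally into one small helper. This is only a sketch: the function name `geometric_summary` and the value \(p = 0.5\) are illustrative assumptions.

```python
import math


def geometric_summary(p: float) -> tuple:
    """Return (mean, variance, standard deviation) of Geom(p)."""
    mean = 1 / p
    variance = (1 - p) / p ** 2
    return mean, variance, math.sqrt(variance)


mean, variance, sd = geometric_summary(0.5)  # assumed p = 0.5
print(mean, variance, sd)
```

Computing the variance first and taking its square root mirrors the relationship between the two formulas above.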
The Geometric Distribution vs. the Exponential Distribution
Because the graph of a geometric distribution looks like a decreasing exponential function, you might associate a geometric distribution with an exponential distribution.
Figure 3. An exponential function that passes through the points of the graph of the probability mass function of a geometric distribution
The exponential distribution is quite similar to the geometric distribution in the sense that it models the waiting time until a success occurs. However, because time is considered a continuous quantity, the exponential distribution is a continuous probability distribution, while the geometric distribution is discrete.
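To see why the points of the probability mass function lie on a decreasing exponential curve, note that \((1-p)^{x-1}p = \frac{p}{1-p}e^{x\ln(1-p)}\). The sketch below builds that curve and compares it with the probability mass function at whole numbers. The value \(p = 0.3\) and the name `curve` are assumptions, and this is only one plausible way a curve like the one in Figure 3 could be obtained.

```python
import math

p = 0.3                  # assumed value
rate = -math.log(1 - p)  # decay rate of the matching exponential curve


def curve(t: float) -> float:
    """Continuous curve that agrees with the geometric PMF at t = 1, 2, 3, ..."""
    return p / (1 - p) * math.exp(-rate * t)


for x in range(1, 6):
    pmf = (1 - p) ** (x - 1) * p
    print(x, round(pmf, 6), round(curve(x), 6))
```

The curve is defined for every positive \(t\), while the probability mass function only makes sense at \(x = 1, 2, 3, \dots\), which is the discrete versus continuous difference described above.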
Geometric Distribution Examples
Here you can solve some problems that can be modeled using the geometric distribution.
A patient suffers kidney failure and requires a transplant from a suitable donor. The probability that a random donor will match this patient’s requirements is \(0.2\).
- Suppose that no donor matches the patient's requirements until a fifth donor comes in. What is the probability of this scenario?
- Find the probability of the patient requiring \(10\) or fewer donors until a match is found.
- What is the expected number of donors required to get a match?
- Find the standard deviation of this scenario.
Solution:
- Whenever you need to find the probability that the experiment requires an exact number of trials to succeed, you should start by writing its probability mass function. In this case, since \(p=0.2\) then\[ \begin{align} P(X=x) &= (1-p)^{x-1}p \\ &= (1-0.2)^{x-1}(0.2) \\ &= (0.8)^{x-1}(0.2). \end{align}\]Now, you can evaluate the above function when \(x=5\), giving you\[ \begin{align} P(X=5) &= (0.8)^{5-1}(0.2) \\ &= (0.8)^4(0.2) \\ &= 0.08192, \end{align}\]which means that the probability that this scenario happens is \( 8.192 \%\).
- This time you will need the cumulative distribution function, which in this case is given by\[ P(X\leq k) = 1-(1-p)^k.\]Since you are looking for the case where ten or fewer donors are required, you need to plug \(k=10\) into the above formula (and \(p=0.2\) as well), which will give you\[ \begin{align} P(X\leq 10) &= 1-(1-0.2)^{10} \\ &= 1-(0.8)^{10} \\ &= 0.892626, \end{align}\]so the probability of finding a suitable kidney from ten random donors is about \(89.26 \%\).
- This is a rather straightforward task. For the expected number of donors you should use the formula for the expected value, so\[ \mu = \frac{1}{p}.\]By substituting \(p=0.2\) you will obtain\[ \begin{align} \mu &= \frac{1}{0.2} \\ &=5. \end{align}\]
- Finally, you can find the standard deviation by using the formula\[ \sigma = \sqrt{\frac{1-p}{p^2}}.\]Substituting \(p=0.2\) will give you\[ \begin{align} \sigma &= \sqrt{ \frac{1-0.2}{0.2^2} } \\ &= \sqrt{20} \\ &\approx 4.472136. \end{align}\]
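If you want to double-check the arithmetic of the donor example, a short script such as the following could be used. It is a sketch that simply applies the four formulas with \(p = 0.2\); the variable names are illustrative.

```python
import math

p = 0.2  # probability that a random donor matches

prob_fifth = (1 - p) ** 4 * p       # first match is the fifth donor
prob_within_10 = 1 - (1 - p) ** 10  # match within ten donors
expected_donors = 1 / p
std_dev = math.sqrt((1 - p) / p ** 2)

print(round(prob_fifth, 5))
print(round(prob_within_10, 6))
print(expected_donors)
print(round(std_dev, 6))
```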
You are likely to find the geometric distribution when playing board games!
Suppose you roll a fair die until you get a three as a result.
- What is the probability that you don't roll a three until your fourth roll?
- Find the probability of getting the three you need in fewer than \(10\) rolls.
- What is the expected number of rolls required to get your desired outcome?
- Find the variance of this experiment.
Solution:
- In this case you first need to find the probability of success. Since you are using a fair die, every face is equally likely, so \[ p = \frac{1}{6}\]for obtaining any specific number, which includes getting three as a result. Now that you know \(p\), you can write the probability mass function for this geometric experiment, that is\[ \begin{align} P(X=x) &= (1-p)^{x-1}p \\ &= \left( 1- \frac{1}{6} \right)^{x-1} \left( \frac{1}{6} \right) \\ &= \left( \frac{5}{6} \right) ^{x-1} \left( \frac{1}{6} \right). \end{align} \] Finally, evaluate the above expression when \(x=4\), obtaining\[ \begin{align} P(X=4) &= \left( \frac{5}{6} \right) ^{4-1} \left(\frac{1}{6} \right) \\&= 0.0964506. \end{align}\]This means the probability that you don't get a three until your fourth roll is \( 9.645 \% \).
- For this case you will need the cumulative distribution function, which is\[ P(X\leq k)=1-(1-p)^k.\]Here you are asked to find the probability of getting the success in fewer than \(10\) rolls, which means \(9\) rolls or fewer, so \( k=9\). Knowing this, you can substitute \(k\) and \(p\) to find the requested probability, so\[ \begin{align} P(X\leq 9) &= 1-\left(1-\frac{1}{6} \right)^9 \\ &= 1-\left(\frac{5}{6}\right)^9 \\ &= 0.806193. \end{align} \]So the probability of getting your desired result in fewer than \(10\) rolls is \( 80.6193 \% \).
- You can use the formula\[ \mu = \frac{1}{p}\]to find the expected value, so\[ \mu = \frac{1}{\frac{1}{6}}, \] which you can simplify with the properties of fractions, giving you\[ \mu = 6.\]
- This time you can use the variance formula,\[ \sigma^2 = \frac{1-p}{p^2},\]so\[ \begin{align} \sigma^2 &= \frac{1-\frac{1}{6}}{\left(\frac{1}{6}\right)^2} \\ &=\frac{\frac{5}{6}}{\frac{1}{36}} \\ &= 30. \end{align} \]
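Because \(p = \frac{1}{6}\) is a fraction, the answers to this example can also be checked with exact arithmetic. The sketch below uses Python's standard `fractions` module; the variable names are illustrative.

```python
from fractions import Fraction

p = Fraction(1, 6)  # probability of rolling a three with a fair die
q = 1 - p           # probability of failure

prob_fourth = q ** 3 * p    # first three on the fourth roll
prob_under_10 = 1 - q ** 9  # a three within nine rolls
expected_rolls = 1 / p
variance = q / p ** 2

print(prob_fourth, float(prob_fourth))
print(prob_under_10, float(prob_under_10))
print(expected_rolls, variance)
```

Working with `Fraction` avoids rounding until the final conversion to a decimal, which makes it easier to compare against the hand calculation.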
Let's assign a number to the probability of succeeding in the claw machine game.
Suppose that the probability of winning an item from a claw machine is \( 0.05\).
- What is the probability of winning an item on your first try?
- What is the probability of winning an item in fewer than \(20\) tries?
- Suppose you need to use a quarter for each try. What is the expected amount of money spent for getting a prize?
Solution:
- This is a tricky question! You can try building the probability mass function and using \(x=1\), but you are already told that the probability of winning an item from the claw machine is \(0.05\), or \( 5\%\), so this is the answer.
- As usual, build the cumulative distribution function, so\[ P(X\leq k) = 1-(1-p)^k.\]You need to find the probability of winning an item in fewer than \(20\) tries, which means \(19\) tries or fewer. So \(k=19\). Knowing this, evaluate the cumulative distribution function, that is\[ \begin{align} P(X\leq 19) &= 1-(1-0.05)^{19} \\ &= 1-(0.95)^{19} \\&=0.622646.\end{align}\] So, the probability of winning a prize in fewer than \(20\) tries is \(62.2646\%\).
- Whenever you are asked about expectations, you should begin by finding the expected value. In this case this means that\[ \begin{align} \mu &= \frac{1}{p} \\ &= \frac{1}{0.05} \\ &= 20. \end{align}\]This means that you can expect to play the claw machine about \(20\) times. Since each time you play costs you a quarter, you need \(20\) quarters, so\[20(0.25) = 5\]means that you can expect to spend \(\$5\) on the claw machine.
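The claw machine answers can be reproduced with a few lines of Python. This is a sketch under the stated assumptions of the example (\(p = 0.05\) and a quarter per try); the variable names are illustrative.

```python
p = 0.05             # probability of winning on any single try
cost_per_try = 0.25  # dollars, one quarter per try

prob_first_try = p                 # the PMF at x = 1 is just p
prob_under_20 = 1 - (1 - p) ** 19  # a win within 19 tries
expected_tries = 1 / p
expected_cost = expected_tries * cost_per_try

print(prob_first_try)
print(round(prob_under_20, 6))
print(expected_tries, expected_cost)
```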
Geometric Distribution - Key takeaways
- The geometric distribution, also known as the geometric probability model, is a discrete probability distribution where the random variable \( X\) counts the number of trials performed until a success is obtained.
- Since the smallest number of trials required to obtain a success is \(1\), the random variable \(X\) can take the values \( X=1,2,3, \dots\).
- In order to model a situation using a geometric distribution, you need to make some assumptions:
  - There are only two possible outcomes of a trial, a success or a failure.
  - The trials are independent of each other.
  - The success probability remains unchanged trial after trial.
- The formulas used in geometric distributions are the following:
  - The probability mass function is given by\[ P(X=x) = (1-p)^{x-1}p.\]
  - The cumulative distribution function is\[ P(X \leq k) = 1-(1-p)^k.\]
  - The expected value can be found as\[ \mu = \frac{1}{p}.\]
  - The standard deviation is\[ \sigma = \sqrt{\frac{1-p}{p^2}}.\]
- The exponential distribution is similar to the geometric distribution in the sense that both describe situations in which you are waiting for the first success. However, the exponential distribution is a continuous distribution, while the geometric distribution is a discrete distribution.