Suppose your teacher provided a list of \(300\) exercises in preparation for the final exam. The teacher assures you that the exam will have \(10\) questions, and they will be taken from the list provided.
Although you prepared well in advance, you only managed to solve \(200\) exercises. What is the probability that the teacher will choose \(10\) questions that you have solved?
Questions like this one, which ask for the probability of a certain number of successes, are closely related to the binomial distribution, and in this article you will learn more about it.
What is a binomial distribution?
A binomial distribution is a discrete probability distribution used to calculate the probability of observing a certain number of successes in a finite number of Bernoulli trials. A Bernoulli trial is a random experiment where you can only have two possible outcomes that are mutually exclusive, one of which is called success and the other failure.
If \(X\) is a binomial random variable with \(X\sim \text{B}(n,p)\), then the probability of getting exactly \(x\) successes in \(n\) independent Bernoulli trials is given by the probability mass function:
\[P(X=x)={n\choose{x}}p^x(1-p)^{n-x}\]
for \(x=0,1,2,\dots , n\), where
\[\displaystyle {n\choose{x}}=\frac{n!}{x!(n-x)!}\]
is known as the binomial coefficient.
Visit our article Binomial Distribution for more details about this distribution.
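As a rough illustration, the probability mass function above can be written as a short Python function. This is only a sketch: the name `binomial_pmf` is made up for this example, and it relies on `math.comb` from the standard library for the binomial coefficient.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Return P(X = x) for X ~ B(n, p)."""
    if not 0 <= x <= n:
        return 0.0  # outside x = 0, 1, ..., n the probability is zero
    # comb(n, x) is the binomial coefficient n! / (x! (n - x)!)
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# The probabilities over x = 0, 1, ..., n should add up to 1 (up to rounding).
total = sum(binomial_pmf(x, 5, 0.5) for x in range(6))
print(total)
```

Summing the function over all possible values of \(x\) is a quick sanity check that it really describes a probability distribution.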
Let's look at an example to see how to calculate the probabilities in a binomial distribution.
Suppose you are going to take a multiple-choice test with \(10\) questions, where each question has \(5\) possible answers, but only \(1\) option is correct. Suppose also that you guess randomly on each question.
a) What is the probability that you would guess exactly \(4\) correctly?
b) What is the probability that you would guess \(2\) or fewer correctly?
c) What is the probability that you would guess \(8\) or more correctly?
Solution:
First, let's note that there are \(10\) questions, so \(n=10\). Now, since each question has \(5\) choices and only \(1\) is correct, the probability of getting the correct one is \(\dfrac{1}{5}\), so \(p=\dfrac{1}{5}\). Therefore,
\[1-p=1-\dfrac{1}{5}=\frac{4}{5} .\]
a) The probability of getting exactly \(4\) correct is given by
\[\begin{align} P(X=4)&={10\choose{4}}\left(\frac{1}{5}\right)^4\left(\frac{4}{5}\right)^{6} \\ &\approx 0.088. \end{align}\]
b) The probability of getting \(2\) or fewer correct is given by
\[\begin{align} P(X\leq 2)&=P(X=0)+P(X=1)+P(X=2) \\ &= {10\choose{0}} \left(\frac{1}{5}\right)^0\left(\frac{4}{5}\right)^{10}+{10\choose{1}}\left(\frac{1}{5}\right)^1\left(\frac{4}{5}\right)^{9}\\ &\quad +{10\choose{2}}\left(\frac{1}{5}\right)^2\left(\frac{4}{5}\right)^{8} \\ &\approx 0.678.\end{align}\]
c) The probability of getting \(8\) or more correct is given by
\[\begin{align} P(X\geq 8)&=P(X=8)+P(X=9)+P(X=10) \\ &= {10\choose{8}} \left(\frac{1}{5}\right)^8\left(\frac{4}{5}\right)^{2}+{10\choose{9}}\left(\frac{1}{5}\right)^9\left(\frac{4}{5}\right)^{1} \\ & \quad+{10\choose{10}}\left(\frac{1}{5}\right)^{10}\left(\frac{4}{5}\right)^{0} \\ &\approx 0.00008.\end{align}\]
In other words, guessing the answers is a very bad test strategy if that is all you are going to do!
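If you want to check these three answers numerically, a minimal Python sketch could look like the one below. The helper `binomial_pmf` and the variable names are assumptions made for this illustration, not part of any particular library.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Return P(X = x) for X ~ B(n, p), with 0 <= x <= n."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 1 / 5  # 10 questions, 1 correct option out of 5

prob_exactly_4 = binomial_pmf(4, n, p)                                  # part a)
prob_at_most_2 = sum(binomial_pmf(x, n, p) for x in range(0, 3))        # part b)
prob_at_least_8 = sum(binomial_pmf(x, n, p) for x in range(8, n + 1))   # part c)

print(f"P(X = 4)  = {prob_exactly_4:.3f}")
print(f"P(X <= 2) = {prob_at_most_2:.3f}")
print(f"P(X >= 8) = {prob_at_least_8:.5f}")
```

The printed values should agree with the rounded results in parts a), b) and c).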
Derivation of mean and variance of binomial distribution
Note that a binomial variable \(X\) is the sum of \(n\) independent Bernoulli variables with the same probability of success \(p\). That is, \(X=X_1+X_2+\ldots+X_n\), where each \(X_i\) is a Bernoulli variable that takes the value \(1\) on a success and \(0\) on a failure. Using this, let's see how to derive the formulas for the mean and variance.
Derivation of mean of binomial distribution
To calculate the expected value of \(X\), from the above you have
\[\text{E}(X)=\text{E}(X_1+X_2+\ldots+X_n),\]
and, since the expected value is linear,
\[\text{E}(X_1+X_2+\ldots+X_n)=\text{E}(X_1)+\text{E}(X_2)+\ldots+\text{E}(X_n).\]
Finally, recall that for a Bernoulli variable \(Y\) with probability of success \(q\), the expected value is \(q\). Thus,
\[\text{E}(X_1)+\text{E}(X_2)+\ldots+\text{E}(X_n)=\underbrace{p+p+\ldots+p}_{n\text{ times}}=np.\]
Putting everything together, you have the formula
\[\text{E}(X)=np.\]
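The result can also be checked directly from the definition of the expected value, \(\text{E}(X)=\sum_x x\,P(X=x)\). The sketch below does this for one choice of parameters, using exact fractions to avoid rounding; the function name and the values \(n=10\), \(p=\frac{1}{5}\) are just illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def binomial_mean_from_pmf(n, p):
    """Compute E(X) as the sum of x * P(X = x) over x = 0, ..., n."""
    return sum(x * comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1))

n, p = 10, Fraction(1, 5)  # exact arithmetic, so no rounding error
mean_by_definition = binomial_mean_from_pmf(n, p)
mean_by_formula = n * p

print(mean_by_definition, mean_by_formula)
```

Both quantities come out equal, which is what the derivation predicts. This checks a single case only; the derivation above is what establishes the formula in general.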
Derivation of variance of binomial distribution
To calculate the variance of \(X\), you have
\[\text{Var}(X)=\text{Var}(X_1+X_2+\ldots+X_n),\]
and, since the variance is additive for independent variables,
\[\begin{align} \text{Var}(X_1+X_2+\ldots+X_n)&=\text{Var}(X_1)+\text{Var}(X_2) \\ &\quad +\ldots+\text{Var}(X_n). \end{align}\]
Again, recall that for a Bernoulli variable \(Y\), with probability of success \(q\), the variance is \(q(1-q)\). Then,
\[\begin{align} \text{Var}(X) &= \text{Var}(X_1)+\text{Var}(X_2)+\ldots+\text{Var}(X_n)\\ &= \underbrace{p(1-p)+p(1-p)+\ldots+p(1-p)}_{n\text{ times}} \\ & =np(1-p).\end{align}\]
Putting it all together,
\[\text{Var}(X)=np(1-p).\]
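To see the idea behind the derivation in action, you could simulate \(X\) as a sum of independent Bernoulli variables and compare the sample mean and variance with \(np\) and \(np(1-p)\). The following is a sketch under stated assumptions: the seed, the number of trials, and the parameters \(n=10\), \(p=0.3\) are arbitrary choices for illustration.

```python
import random
from statistics import fmean

def simulate_binomial(n, p, rng):
    """Draw one value of X as the sum of n independent Bernoulli(p) variables."""
    return sum(1 if rng.random() < p else 0 for _ in range(n))

rng = random.Random(12345)  # fixed seed (arbitrary) so the run is reproducible
n, p, trials = 10, 0.3, 100_000
samples = [simulate_binomial(n, p, rng) for _ in range(trials)]

sample_mean = fmean(samples)
# Population-style variance of the sample: average squared deviation from the mean.
sample_variance = sum((s - sample_mean) ** 2 for s in samples) / trials

print(f"sample mean     = {sample_mean:.3f}  (formula: {n * p:.3f})")
print(f"sample variance = {sample_variance:.3f}  (formula: {n * p * (1 - p):.3f})")
```

Because this is a simulation, the sample values will only be close to the formulas, not exactly equal to them.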
Mean and standard deviation for a binomial distribution
In the previous section you saw that the mean of the binomial distribution is
\[\text{E}(X)=np,\]
and the variance is
\[\text{Var}(X)=np(1-p).\]
To obtain the standard deviation, \(\sigma\), of the binomial distribution, just take the square root of the variance, so
\[\sigma = \sqrt{np(1-p) }.\]
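The three formulas fit naturally into one small helper. This is a minimal sketch; the name `binomial_summary` and the example values \(n=100\), \(p=0.5\) are assumptions chosen for illustration.

```python
from math import sqrt

def binomial_summary(n, p):
    """Return (mean, variance, standard deviation) for X ~ B(n, p)."""
    mean = n * p
    variance = n * p * (1 - p)
    return mean, variance, sqrt(variance)

mean, variance, sd = binomial_summary(100, 0.5)
print(mean, variance, sd)
```

Note that the standard deviation is measured in the same units as \(X\) (a number of successes), which is why it is often easier to interpret than the variance.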
Formula for mean of binomial distribution
The mean of a variable is the average value expected to be observed when an experiment is performed multiple times.
If \(X\) is a binomial random variable with \(X\sim \text{B}(n,p)\), then the expected value or mean of \(X\) is given by \[\text{E}(X)=\mu=np.\]
Formula for variance of a binomial distribution
The variance of a variable is a measure of how spread out its values are around the mean.
If \(X\) is a binomial random variable with \(X\sim \text{B}(n,p)\), then the variance of \(X\) is given by \[\text{Var}(X)=\sigma^2=np(1-p).\]
For a more detailed explanation of these concepts, please review our article Mean and Variance of Discrete Probability Distributions.
Examples of mean and variance of binomial distribution
Let's look at some examples, starting with a classic one.
Let \(X\) be a random variable such that \(X\sim \text{B}(10,0.3)\). Find the mean \(\text{E}(X)\) and the variance \(\text{Var}(X)\).
Solution:
Using the formula for the mean, you have
\[\text{E}(X)=np=(10)(0.3)=3.\]
For the variance you have
\[\text{Var}(X)=np(1-p) =(10)(0.3)(0.7)=2.1.\]
Let's take another example.
Let \(X\) be a random variable such that \(X\sim \text{B}(12,p)\) and \(\text{Var}(X)=2.88\). Find the two possible values of \(p\).
Solution:
From the variance formula, you have
\[\text{Var}(X)=np(1-p)=2.88.\]
Since you know \(n=12\), substituting it in the above equation gives
\[12p(1-p)=2.88,\]
which is the same as
\[p(1-p)=0.24\]
or
\[p^2-p+0.24=0.\]
Note that you now have a quadratic equation, so using the quadratic formula you get that the solutions are \(p=0.4\) and \(p=0.6\).
The previous example shows that you can have two different binomial distributions with the same variance!
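The same steps can be sketched in Python: rewrite \(np(1-p)=\text{Var}(X)\) as the quadratic \(p^2-p+\frac{\text{Var}(X)}{n}=0\) and apply the quadratic formula. The function name below is hypothetical and introduced only for this illustration.

```python
from math import sqrt

def success_probabilities_from_variance(n, variance):
    """Solve n*p*(1 - p) = variance for p, i.e. p^2 - p + variance/n = 0."""
    discriminant = 1 - 4 * variance / n
    if discriminant < 0:
        # n*p*(1 - p) is at most n/4, reached at p = 0.5
        raise ValueError("no real solution: the variance cannot exceed n/4")
    root = sqrt(discriminant)
    return (1 - root) / 2, (1 + root) / 2

p_small, p_large = success_probabilities_from_variance(12, 2.88)
print(p_small, p_large)
```

The two solutions always add up to \(1\), because swapping \(p\) and \(1-p\) leaves \(np(1-p)\) unchanged. Floating-point arithmetic may print values that differ from \(0.4\) and \(0.6\) in the last decimal places.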
Finally, note that by using the mean and variance of a binomial variable, you can recover its parameters \(n\) and \(p\), and hence its distribution.
Let \(X\) be a random variable such that \(X\sim \text{B}(n,p)\), with \(\text{E}(X)=3.6\) and \(\text{Var}(X)=2.88\).
Find the values of \(n\) and \(p\).
Solution:
Recall that by the formulas of the mean and variance
\[\text{E}(X)=np=3.6\]
and
\[\text{Var}(X)=np(1-p)=2.88.\]
From here, substituting you have
\[3.6(1-p)=2.88,\]
which implies that
\[1-p=\frac{2.88}{3.6}=0.8.\]
Therefore, \(p=0.2\) and again, from the formula of the mean, you have
\[n=\frac{3.6}{0.2}=18.\]
So the original distribution is \(X\sim \text{B}(18,0.2)\).
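The two-step calculation above (first \(p\) from the ratio of variance to mean, then \(n\) from the mean) can be sketched as a small function. The name `binomial_parameters` is an assumption for this example, and rounding \(n\) to the nearest integer is a choice made here to absorb floating-point error.

```python
def binomial_parameters(mean, variance):
    """Recover (n, p) from E(X) = n*p and Var(X) = n*p*(1 - p), assuming 0 < p < 1."""
    if not 0 < variance < mean:
        raise ValueError("need 0 < variance < mean for a binomial with 0 < p < 1")
    p = 1 - variance / mean  # because Var(X) / E(X) = 1 - p
    n = round(mean / p)      # n must be a whole number; round away float error
    return n, p

n, p = binomial_parameters(3.6, 2.88)
print(n, p)
```

For the values in this example, the function gives \(n=18\) and a value of \(p\) equal to \(0.2\) up to rounding.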
Mean and Variance of Binomial Distribution - Key takeaways
If \(X\) is a binomial random variable with \(X\sim \text{B}(n,p)\), then \[P(X=x)={n\choose{x}}p^x(1-p)^{n-x}\] for \(x=0,1,2,\dots,n\), where \[\displaystyle {n\choose{x}}=\frac{n!}{x!(n-x)!}.\]
If \(X\sim \text{B}(n,p)\), then the expected value or mean of \(X\) is \(\text{E}(X)=\mu=np\).
If \(X\sim \text{B}(n,p)\), then the variance is \(\text{Var}(X)=\sigma^2=np(1-p) \) and the standard deviation is \(\sigma=\sqrt{np(1-p)}\).