
How do you use a probability mass function to calculate the mean and variance of a discrete distribution?

1 Answer
Mar 4, 2017

Answer:

PMF for discrete random variable #X:" "# #p_X(x)" "# or #" "p(x)#.
Mean: #" "mu=E[X]=sum_x x*p(x)#.
Variance: #" "sigma^2 = "Var"[X]=sum_x [x^2*p(x)] - [sum_x x*p(x)]^2#.

Explanation:

The probability mass function (or pmf, for short) is a mapping that takes each possible discrete value a random variable could take on and maps it to its probability. Quick example: if #X# is the result of a single die roll, then #X# could take on the values #{1,2,3,4,5,6},# each with equal probability #1/6#. The pmf for #X# would be:

#p_X(x)={(1/6",", x in {1,2,3,4,5,6}),(0",","otherwise"):}#

If we're only working with one random variable, the subscript #X# is often left out, so we write the pmf as #p(x)#.

In short: #p(x)# is equal to #P(X=x)#.
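The pmf above can be sketched as a small Python function (an illustrative helper, not part of the original answer); exact fractions avoid floating-point rounding:

```python
from fractions import Fraction

# Sketch of the die-roll pmf: p(x) = 1/6 for x in {1,...,6}, 0 otherwise.
def p(x):
    return Fraction(1, 6) if x in {1, 2, 3, 4, 5, 6} else Fraction(0)

# A pmf must assign probabilities that sum to 1 over all possible values.
total = sum(p(x) for x in range(1, 7))
```

Summing `p(x)` over the six outcomes gives exactly 1, confirming this is a valid pmf.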

The mean #mu# (or expected value #E[X]#) of a random variable #X# is the sum of the weighted possible values for #X#; weighted, that is, by their respective probabilities. If #S# is the set of all possible values for #X#, then the formula for the mean is:

#mu =sum_(x in S) x*p(x)#.

In our example from above, this works out to be

#mu = sum_(x=1)^6 x*p(x)#
#color(white)mu = 1(1/6)+2(1/6)+3(1/6)+...+6(1/6)#
#color(white)mu = 1/6(1+2+3+4+5+6)#
#color(white)mu = 1/6(21)#

#color(white)mu = 3.5#
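The same computation can be checked numerically. This is a minimal sketch assuming the fair-die pmf from the example, with the pmf stored as a hypothetical dictionary named `pmf`:

```python
from fractions import Fraction

# Fair-die pmf from the example: each value 1..6 has probability 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Mean: mu = sum over x of x * p(x).
mu = sum(x * px for x, px in pmf.items())
```

The sum works out to #7/2 = 3.5,# matching the hand calculation above.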

The variance #sigma^2# (or #"Var"[X]#) of a random variable #X# is a measure of the spread of the possible values. By definition, it is the expected value of the squared distance between #X# and #mu#:

#sigma^2 = E[(X-mu)^2]#

With some simple algebra and the linearity of expectation, this becomes a more convenient formula. Expanding the square and using #E[X]=mu#:

#sigma^2 = E[X^2-2muX+mu^2] = E[X^2]-2mu*E[X]+mu^2#

#color(white)(sigma^2) = E[X^2] - mu^2#

We already have a formula for #mu" "(E[X]),# so now we just need a formula for #E[X^2].# This is the expected value of the squared random variable, so our formula for this is the sum of the squared possible values for #X#, again, weighted by the probabilities of the #x#-values:

#E[X^2]=sum_(x in S) x^2*p(x)#

Using this, our formula for the variance of #X# becomes

#sigma^2 =sum_(x in S) [x^2*p(x)] - mu^2#
#color(white)(sigma^2) =sum_(x in S) [x^2*p(x)] - [sum_(x in S) x*p(x)]^2#

For our example, #mu# was calculated to be #3.5,# so we use that for our last term to get

#sigma^2 =sum_(x=1)^6 [x^2*p(x)] - mu^2#
#color(white)(sigma^2) =[1^2(1/6)+2^2(1/6)+...+6^2(1/6)] - (3.5)^2#
#color(white)(sigma^2) =1/6(1+4+9+16+25+36)" "-" "(3.5)^2#
#color(white)(sigma^2) =1/6(91)" "-" "12.25#
#color(white)(sigma^2) ~~ 15.167-12.25#
#color(white)(sigma^2) ~~ 2.917#
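The whole variance calculation can be verified with exact fractions. This is a sketch assuming the fair-die pmf from the example (dictionary name `pmf` is illustrative):

```python
from fractions import Fraction

# Fair-die pmf from the example: each value 1..6 has probability 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Mean: mu = sum of x * p(x).
mu = sum(x * px for x, px in pmf.items())

# E[X^2]: sum of x^2 * p(x).
e_x2 = sum(x**2 * px for x, px in pmf.items())

# Variance via the shortcut formula: sigma^2 = E[X^2] - mu^2.
var = e_x2 - mu**2
```

The exact value is #91/6 - 49/4 = 35/12 ~~ 2.917,# agreeing with the rounded result above.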