What is the difference between the population standard deviation and the sample standard deviation?

1 Answer
Jul 22, 2015

In the formula for a population standard deviation, you divide by the population size #N#, whereas in the formula for the sample standard deviation, you divide by #n-1# (the sample size minus one).

Explanation:

If #mu# is the mean of the population, the population standard deviation of the data #x_{1},x_{2},x_{3},\ldots, x_{N}# is

#sigma=sqrt{\frac{sum_{k=1}^{N}(x_{k}-mu)^{2}}{N}}#.

If #bar{x}# is the mean of a sample, the formula for the sample standard deviation of the sample data #x_{1},x_{2},x_{3},\ldots, x_{n}# is

#s=sqrt{\frac{sum_{k=1}^{n}(x_{k}-bar{x})^{2}}{n-1}}#.
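As a quick check of the two formulas, here is a minimal Python sketch. The function names `population_sd` and `sample_sd` and the data list are made up for illustration; the standard library functions `statistics.pstdev` and `statistics.stdev` compute the same two quantities.

```python
import math
import statistics


def population_sd(data):
    """Population standard deviation: divide the sum of squared deviations by N."""
    N = len(data)
    mu = sum(data) / N
    return math.sqrt(sum((x - mu) ** 2 for x in data) / N)


def sample_sd(data):
    """Sample standard deviation: divide the sum of squared deviations by n - 1."""
    n = len(data)
    if n < 2:
        raise ValueError("sample standard deviation needs at least two values")
    x_bar = sum(data) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))


# Hypothetical data set, chosen so the arithmetic is easy to follow:
# the mean is 5 and the sum of squared deviations is 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

sigma = population_sd(data)  # sqrt(32 / 8) = 2.0
s = sample_sd(data)          # sqrt(32 / 7), slightly larger than sigma

print(sigma, statistics.pstdev(data))
print(s, statistics.stdev(data))
```

For the same data, dividing by #n-1# always gives a slightly larger value than dividing by #N#, and the gap shrinks as the number of data values grows.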

The reason for dividing by #n-1# is somewhat technical. Doing so makes the sample variance #s^{2}# a so-called unbiased estimator of the population variance #sigma^{2}#. In effect, if the population is very large and you draw many, many random samples of the same size #n# from it, the average of the resulting values of #s^{2}# will be very close to #sigma^{2}# (and, from a theoretical perspective, the mean of #s^{2}# as a "random variable" is exactly #sigma^{2}#).
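The "many, many random samples" idea can be tried out with a small simulation. This is only a sketch under assumptions that are not part of the answer above: the population is 10,000 made-up values drawn from a normal distribution, the sample size is 5, and samples are drawn with replacement using `random.choices`. All names and numbers are illustrative.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical large population (an assumption made for this illustration).
population = [random.gauss(50, 10) for _ in range(10_000)]
N = len(population)
mu = sum(population) / N
sigma_sq = sum((x - mu) ** 2 for x in population) / N  # population variance


def sum_sq_dev(sample):
    """Sum of squared deviations of the sample values from the sample mean."""
    x_bar = sum(sample) / len(sample)
    return sum((x - x_bar) ** 2 for x in sample)


n = 5            # sample size
trials = 20_000  # number of random samples drawn

total_n_minus_1 = 0.0  # running total of variances computed with divisor n - 1
total_n = 0.0          # running total of variances computed with divisor n

for _ in range(trials):
    sample = random.choices(population, k=n)  # sampling with replacement
    ss = sum_sq_dev(sample)
    total_n_minus_1 += ss / (n - 1)
    total_n += ss / n

mean_s_sq = total_n_minus_1 / trials  # average of s^2 (divisor n - 1)
mean_biased = total_n / trials        # average when dividing by n instead

print("population variance:      ", sigma_sq)
print("average with divisor n-1: ", mean_s_sq)
print("average with divisor n:   ", mean_biased)
```

In a run like this, the average computed with divisor #n-1# should land close to the population variance, while the average computed with divisor #n# should come out noticeably too small, by a factor of roughly #(n-1)/n#.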

The technicalities of why this is true involve a lot of algebra with summations and are usually not worth the time for beginning students.