Measures of Variability

Key Questions

  • SD: it gives you a numerical value for the variation of the data around the mean.
    Range: it gives you the difference between the maximum and minimum values of the data.

    Mean: a single value that represents the average of the data. It does not represent the data well in asymmetrical distributions, and it is influenced by outliers.
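A small sketch of how one outlier moves each of these measures. The data values are hypothetical and chosen only for illustration, and `summarize` is a helper name introduced here.

```python
from statistics import mean, pstdev

# Hypothetical data, chosen only for illustration.
values = [4, 5, 5, 6, 5]
with_outlier = values + [35]  # same data plus one extreme value

def summarize(data):
    """Return the mean, the range (max minus min) and the population SD."""
    return {
        "mean": mean(data),
        "range": max(data) - min(data),
        "sd": pstdev(data),
    }

print(summarize(values))        # mean 5, range 2
print(summarize(with_outlier))  # mean 10, range 31
```

A single extreme value doubles the mean here and stretches the range from 2 to 31, which is why the mean can misrepresent skewed data.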

  • Answer:

    In the formula for a population standard deviation, you divide by the population size #N#, whereas in the formula for the sample standard deviation, you divide by #n-1# (the sample size minus one).

    Explanation:

    If #mu# is the mean of the population, the formula for the population standard deviation of the population data #x_{1},x_{2},x_{3},\ldots, x_{N}# is

    #sigma=sqrt{\frac{sum_{k=1}^{N}(x_{k}-mu)^{2}}{N}}#.

    If #bar{x}# is the mean of a sample, the formula for the sample standard deviation of the sample data #x_{1},x_{2},x_{3},\ldots, x_{n}# is

    #s=sqrt{\frac{sum_{k=1}^{n}(x_{k}-bar{x})^{2}}{n-1}}#.

    The reason for this is somewhat technical. Dividing by #n-1# makes the sample variance #s^{2}# a so-called unbiased estimator of the population variance #sigma^{2}#. In effect, if the population is very large and you draw many random samples of the same size #n# from it, the average of the resulting values of #s^{2}# will be very close to #sigma^{2}# (and, from a theoretical perspective, the mean of #s^{2}# as a "random variable" is exactly #sigma^{2}#).

    The technicalities of why this is true involve a lot of algebra with summations and are usually not worth the time for beginning students.
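The two formulas and the unbiasedness claim can be checked numerically. The sketch below is only an illustration: the data set and the six-value population are made up, `variance` is a helper name introduced here, and sampling is done with replacement as a stand-in for drawing from a very large population.

```python
import math
from fractions import Fraction
from itertools import product

def variance(values, sample=False):
    """Sum of squared deviations divided by n - 1 (sample) or N (population)."""
    n = len(values)
    mean = Fraction(sum(values), n)
    squared_devs = sum((x - mean) ** 2 for x in values)
    return squared_devs / (n - 1 if sample else n)

# Hypothetical data set, used only to compare the two divisors.
data = [2, 4, 4, 4, 5, 5, 7, 9]
sigma = math.sqrt(variance(data))           # population formula, divide by N
s = math.sqrt(variance(data, sample=True))  # sample formula, divide by n - 1
print(sigma, s)

# Unbiasedness check on a tiny hypothetical population: list every ordered
# sample of size 3 drawn with replacement and average the variances exactly.
population = [1, 2, 3, 4, 5, 6]
samples = list(product(population, repeat=3))
mean_s2 = sum(variance(smp, sample=True) for smp in samples) / len(samples)
mean_biased = sum(variance(smp) for smp in samples) / len(samples)

print(mean_s2 == variance(population))     # dividing by n - 1 hits sigma^2
print(mean_biased < variance(population))  # dividing by n falls short
```

Exact fractions are used so that the comparison with the population variance is not blurred by rounding.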

  • The standard deviation is the most widely used.

    The range simply gives the difference between the lowest and highest values, so a few extreme values will alter it excessively.

    The standard deviation #sigma# tells you where most of the values will be: in a normal distribution, about 68% of all values lie within one standard deviation of the mean #mu#, and about 95% lie within two standard deviations of the mean.

    Example:
    You have a filling machine that fills kilogram bags of sugar. It will not fill exactly #1000g# every time; the standard deviation is #10g#.
    Then you know that about #68%# of the bags weigh between #990g# and #1010g#, and about #95%# between #980g# and #1020g#, a total span of #20g# or #40g# respectively.

    Every now and again a bag will be far over-filled (say #1100g#) and sometimes a bag will end up empty (#0g#), so the range will be a total of #1100g#.

    You may decide which of the two gives a better idea of the spread in this distribution.
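The numbers in this example can be reproduced with a short sketch. It assumes the fill weights follow a normal distribution with the stated mean and standard deviation; `band` is a helper name introduced here for illustration.

```python
from statistics import NormalDist

mu, sigma = 1000, 10          # grams, taken from the example
fill = NormalDist(mu, sigma)  # assumption: fills are normally distributed

def band(k):
    """Return the interval mu +/- k*sigma and the share of bags inside it."""
    low, high = mu - k * sigma, mu + k * sigma
    share = fill.cdf(high) - fill.cdf(low)
    return low, high, share

for k in (1, 2):
    low, high, share = band(k)
    print(f"within {k} SD: {low} g to {high} g, span {high - low} g, share {share:.1%}")

# The range depends only on the two most extreme bags from the example.
extremes = [0, 1100]
value_range = max(extremes) - min(extremes)
print(f"range: {value_range} g")
```

The two rare bags set the range at 1100 g, while the bands that hold roughly 68% and 95% of the bags span only 20 g and 40 g.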

Questions