Which measure will be affected by an outlier the most?

a) mean
b) median
c) range
d) mode

1 Answer
Sep 27, 2017

Range

Explanation:

An outlier is a data point that lies far from the other observations. For instance, in the data set #{1,2,2,3,26}#, 26 is an outlier. There are rules of thumb for working out the interval in which non-outliers are expected to fall (a common one is based on the interquartile range), but a value falling outside that interval doesn't necessarily make it an outlier, as there may be other factors to consider.

The #color(red)(median)# is the middle number of a set of numerically ordered numbers. If the number of values in the set is odd, then the #color(red)(median)# is the central number, with the same number of values on its left and on its right. If the set has an even number of values, then the #color(red)(median)# is the average of the two central numbers. For example, the set #{1,2,3,4,5,6,7,8}# has an even number of values, so we must find the mean of the two central numbers, 4 and 5, which gives #(4+5)/2=4.5#, the #color(red)(median)#.
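The two cases above (odd and even number of values) can be sketched in a few lines of Python. This is only an illustration; the function name `median` is my own choice, and Python's standard `statistics.median` does the same job.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)          # the median is defined on ordered data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single central value
        return ordered[mid]
    # Even count: the mean of the two central values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([1, 2, 3, 4, 5, 6, 7, 8]))  # 4.5
print(median([1, 2, 2, 3, 26]))          # 2
```

Note that the outlier 26 in the second set has no influence on the result, because only the central value is used.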

The #color(green)("range")# #r# is the distance from the highest value to the lowest value, and is calculated as #r=h-l#, where #h# is the highest value, and #l# is the lowest value. So if we have a set of #{52,54,56,58,60}#, we get #r=60-52=8#, so the #color(green)("range")# is 8.
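As a quick check of #r=h-l#, here is a minimal Python sketch. The name `value_range` is an assumption of mine (chosen to avoid clashing with Python's built-in `range`).

```python
def value_range(values):
    """Return the range r = h - l of a non-empty list of numbers."""
    highest = max(values)   # h
    lowest = min(values)    # l
    return highest - lowest


print(value_range([52, 54, 56, 58, 60]))  # 8
```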

Given what we now know, it is correct to say that an outlier will affect the #color(green)("range")# the most. This is because the #color(red)(median)# depends only on the values in the centre of the ordered data, while the #color(green)("range")# depends only on the two values at the ends. Since an outlier is always an extreme value, it becomes the highest or lowest value and changes the #color(green)("range")# directly, whereas it barely shifts the centre that determines the #color(red)(median)#.

For example, take the set #{1,2,3,4,100}#, with 100 as the outlier. The #color(green)("range")# of this set is #r=100-1=99#, while the #color(red)(median)# is 3. If we take the outlier 100 out, so the set is now #{1,2,3,4}#, the #color(green)("range")# becomes #4-1=3#, while the #color(red)(median)# becomes #(2+3)/2=2.5#. The #color(green)("range")# changed by 96, but the #color(red)(median)# changed by only 0.5. Evidently, it was the #color(green)("range")# which was affected the most.
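The same comparison can be reproduced with a short Python sketch using the standard `statistics` module. The helper name `summarize` is hypothetical, and I have also included the mean, since it is one of the options in the question even though the explanation above does not work it out.

```python
from statistics import mean, median


def summarize(values):
    """Return the mean, median and range of a non-empty list of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        "range": max(values) - min(values),
    }


with_outlier = summarize([1, 2, 3, 4, 100])
without_outlier = summarize([1, 2, 3, 4])

# How much each measure moves when the outlier 100 is removed
changes = {name: with_outlier[name] - without_outlier[name] for name in with_outlier}

for name, change in changes.items():
    print(f"{name}: {with_outlier[name]} -> {without_outlier[name]} (change {change})")
```

For this particular data set the range moves by 96, the mean by 19.5 and the median by 0.5, which agrees with the answer above.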

https://mathspace.co/learn/world-of-maths/univariate-data/effects-of-outliers-12017/things-out-of-the-norm-601/

I hope I helped!