# Which measure will be affected by an outlier the most?

## a) mean b) median c) range d) mode

Sep 27, 2017

Range

#### Explanation:

An outlier is a data point that lies far from the other observations. For instance, in the data set $\left\{1 , 2 , 2 , 3 , 26\right\}$, 26 is an outlier. There are formulas for the interval in which values are not considered outliers (a common one is the $1.5 \times \text{IQR}$ rule), but a number falling outside that interval is not necessarily an outlier, as there may be other factors to consider.

The $\textcolor{red}{\text{median}}$ is the middle number of a numerically ordered set. If the number of values in the set is odd, then the $\textcolor{red}{\text{median}}$ is the central number, with equally many values on its left and on its right. If the set has an even number of values, then the $\textcolor{red}{\text{median}}$ is the average of the two central numbers. For example, the set $\left\{1 , 2 , 3 , 4 , 5 , 6 , 7 , 8\right\}$ has an even number of values, so we must find the mean of the two central numbers, 4 and 5, which gives $\frac{4 + 5}{2} = 4.5$, the $\textcolor{red}{\text{median}}$.
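The odd and even cases described above can be sketched in a few lines of Python. This is only an illustration: the function name `median` is my own choice, and it assumes a non-empty list of numbers.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)   # the median is defined on ordered data
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single central value.
        return ordered[mid]
    # Even count: the average of the two central values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([1, 2, 3, 4, 5, 6, 7, 8]))  # 4.5
print(median([1, 2, 2, 3, 26]))          # 2
```

Note that the outlier 26 in the second set does not pull the median away from the bulk of the data.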

The $\textcolor{green}{\text{range}}$ $r$ is the distance from the highest value to the lowest value, and is calculated as $r = h - l$, where $h$ is the highest value and $l$ is the lowest value. So if we have the set $\left\{52 , 54 , 56 , 58 , 60\right\}$, we get $r = 60 - 52 = 8$, so the $\textcolor{green}{\text{range}}$ is 8.
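As a quick check of $r = h - l$, here is a minimal Python sketch. The name `value_range` is hypothetical (chosen to avoid clashing with Python's built-in `range`), and it assumes a non-empty list.

```python
def value_range(values):
    """Return the range r = h - l of a non-empty list of numbers."""
    highest = max(values)  # h
    lowest = min(values)   # l
    return highest - lowest


print(value_range([52, 54, 56, 58, 60]))  # 8
```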

Given what we now know, it is correct to say that an outlier will affect the $\textcolor{green}{\text{range}}$ the most. This is because the $\textcolor{red}{\text{median}}$ is determined by the centre of the ordered data, while the $\textcolor{green}{\text{range}}$ is determined by its two ends. Since an outlier is always an extreme value, it sits at one of those ends and changes the $\textcolor{green}{\text{range}}$ directly, whereas it barely moves the $\textcolor{red}{\text{median}}$.

For example, take the set $\left\{1 , 2 , 3 , 4 , 100\right\}$, with 100 as the outlier. The $\textcolor{green}{\text{range}}$ of this set is $r = 100 - 1 = 99$, while the $\textcolor{red}{\text{median}}$ is 3. If we take the outlier 100 out, so the set is now $\left\{1 , 2 , 3 , 4\right\}$, the $\textcolor{green}{\text{range}}$ becomes $4 - 1 = 3$, while the $\textcolor{red}{\text{median}}$ becomes $\frac{2 + 3}{2} = 2.5$. The $\textcolor{green}{\text{range}}$ changed by 96 and the $\textcolor{red}{\text{median}}$ by only 0.5, so evidently it was the $\textcolor{green}{\text{range}}$ which was affected the most.
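The same comparison can be reproduced with Python's standard `statistics` module. This is a sketch rather than a general proof: the helper `summarize` and the variable names are my own. The mean is included because it is one of the options in the question. The mode is left out because no value repeats in this set.

```python
from statistics import mean, median


def summarize(values):
    """Return the mean, median and range of a non-empty list of numbers."""
    return {
        "mean": mean(values),
        "median": median(values),
        "range": max(values) - min(values),
    }


with_outlier = summarize([1, 2, 3, 4, 100])
without_outlier = summarize([1, 2, 3, 4])

# Absolute change in each measure when the outlier 100 is removed.
shift = {
    name: abs(with_outlier[name] - without_outlier[name])
    for name in with_outlier
}

print(shift)  # {'mean': 19.5, 'median': 0.5, 'range': 96}
```

For this particular set the range moves the most, the mean moves noticeably, and the median hardly moves at all.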

https://mathspace.co/learn/world-of-maths/univariate-data/effects-of-outliers-12017/things-out-of-the-norm-601/

I hope I helped!