# Boxplots

## Key Questions

• The 5-number summary gives you the points on the boxplot.

First you put your cases in order of value/score.

Minimum (MIN) -- the lowest value
First Quartile (Q1) -- the value you reach at 25% of your cases
Median (MED or Q2) -- the value of the middle case (50% of your cases)
Third Quartile (Q3) -- the value you reach at 75% of your cases
Maximum (MAX) -- the highest value

The 'box' runs from Q1 to Q3, so it contains the middle 50% of all cases.
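As a rough sketch of the steps above, the five numbers can be computed by sorting the cases and taking the median of each half. The function name `five_number_summary` and the sample `scores` are illustrative assumptions. Textbooks and software use several slightly different quartile conventions, so other tools may report slightly different values for Q1 and Q3.

```python
from statistics import median

def five_number_summary(values):
    """Return (MIN, Q1, MED, Q3, MAX) using the median-of-halves convention.

    Assumes at least two values. Other quartile conventions exist and
    can give slightly different Q1 and Q3.
    """
    ordered = sorted(values)            # put the cases in order of value
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]              # cases below the middle
    upper = ordered[half + n % 2:]      # cases above the middle (skips the middle case when n is odd)
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

# Hypothetical sample, used only for illustration
scores = [4, 6, 5, 7, 2, 6, 4, 8]
print(five_number_summary(scores))      # (2, 4.0, 5.5, 6.5, 8)
```

Here the box would run from 4.0 to 6.5, with the median line at 5.5.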

The picture is in Dutch (Wikipedia), but it should be clear. On the right side it distinguishes between the "largest non-extreme" value (the end of the line) and the maximum (the x), because it is sometimes more realistic to leave some of the extremes out of the picture.
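The text above does not say which rule the picture uses to decide what counts as "extreme". A common convention (often attributed to Tukey) treats anything more than 1.5 times the IQR beyond the quartiles as extreme. The sketch below assumes that convention, and the sample values and variable names are illustrative only.

```python
from statistics import quantiles

# Illustrative sample with one very large value
data = [4, 6, 5, 7, 2, 6, 4, 8, 2956]

# Quartiles with Python's default ('exclusive') method; other methods may differ slightly
q1, med, q3 = quantiles(data, n=4)
iqr = q3 - q1

# Assumed convention: values beyond 1.5 * IQR from the box count as extreme
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

non_extreme = [x for x in data if lower_fence <= x <= upper_fence]
extremes = [x for x in data if x < lower_fence or x > upper_fence]

print(max(non_extreme))   # end of the upper line: 8
print(max(data))          # the maximum: 2956
print(extremes)           # points drawn separately: [2956]
```

With this rule the upper line stops at 8, while the maximum of 2956 is drawn as a separate point.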

• One would often look at the IQR (interquartile range, $Q3 - Q1$) to get a more "realistic" look at the data, because it is based only on the middle 50% of the cases and so is not affected by outliers.

Thus if you had a data set such as
$4 , 6 , 5 , 7 , 2 , 6 , 4 , 8 , 2956$

If we take the mean of only the values that lie within the interquartile range (between Q1 and Q3), the result is more "realistic" for this data set than the ordinary mean, because the single value of $2956$ distorts the ordinary mean quite a bit.

An outlier like this could come from something as simple as a typo, which shows how useful it can be to check the IQR.
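To make the comparison concrete, here is a small sketch using the data set above. It relies on Python's default quartile method, and the variable names are illustrative. A different quartile convention could shift Q1 and Q3 slightly.

```python
from statistics import mean, median, quantiles

data = [4, 6, 5, 7, 2, 6, 4, 8, 2956]

# Quartiles with the default ('exclusive') method
q1, med, q3 = quantiles(data, n=4)
iqr = q3 - q1

# Keep only the cases inside the box, i.e. between Q1 and Q3
middle = [x for x in data if q1 <= x <= q3]

print(round(mean(data), 2))     # ordinary mean, pulled up by 2956: 333.11
print(median(data))             # 6
print(iqr)                      # 3.5
print(round(mean(middle), 2))   # mean of the middle cases: 5.33
```

The ordinary mean of about 333 describes none of the actual cases well, whereas the median of 6 and the middle-cases mean of about 5.33 sit where most of the data is.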

