Today we’re going to do something a little bit different by talking about the **median absolute deviation test for identifying outliers**!

**When Would You Use It?**

The median absolute deviation test for identifying outliers is used to determine whether or not a specific score in a sample of n observations should be classified as an outlier.

**What Type of Data?**

The median absolute deviation test for identifying outliers requires interval or ratio data.

**Test Assumptions**

None listed.

**Test Process**

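The equation employed in this test is as follows, where X is the score being tested, M is the sample median, and MAD is the median absolute deviation:

$$\frac{|X - M|}{\text{MAD}} > \text{Max}$$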

Step 1: Compute the median **M** of the dataset.

Step 2: Compute the median absolute deviation, or **MAD**. To do so:

a) Calculate the absolute values of the difference between each score and the median.

b) Arrange these absolute deviations in order from lowest to highest.

c) Find the median of these absolute deviations; this is the MAD value.
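Steps 1 and 2 can be sketched in R as follows. The vector `scores` is made-up illustration data, not the song data used in the example. The last line compares the result with R's built-in `mad()`, which multiplies by 1.4826 by default, so `constant = 1` is needed to get the raw median absolute deviation.

```r
# Made-up scores, for illustration only
scores <- c(3, 5, 6, 7, 9, 10, 40)

M <- median(scores)            # Step 1: median of the data
abs_dev <- abs(scores - M)     # Step 2a: absolute deviations from the median
sorted_dev <- sort(abs_dev)    # Step 2b: order them from lowest to highest
MAD <- median(sorted_dev)      # Step 2c: their median is the MAD

# Built-in equivalent; constant = 1 turns off the default 1.4826 scaling
all.equal(MAD, mad(scores, constant = 1))
```

Sorting in Step 2b only matters when working by hand; `median()` does not need its input sorted.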

Step 3: Choose the **Max** value. The choice is somewhat arbitrary, but a recommended value is Max = 5: if the data are assumed to come from an approximately normal distribution, this value is very likely to identify extreme or outlier scores.

Step 4: Plug each X value into the equation to determine whether it is an outlier. X is an outlier if the left-hand side of the equation exceeds the Max value. If you are doing this test by hand, the best approach is to start with the X that deviates the most from the median and work down from there; if you are using a program, it’s easy enough to test them all at once.
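Testing every value at once can be sketched as a small vectorized function. The name `mad_outliers` and its arguments are hypothetical, and the sample vector is made up. The sketch assumes the MAD is non-zero; if more than half of the scores equal the median, the MAD is 0 and the ratio is undefined.

```r
# Hypothetical helper: returns the values of x flagged as outliers
mad_outliers <- function(x, Max = 5) {
  M <- median(x)                     # Step 1
  MAD <- median(abs(x - M))          # Step 2
  if (MAD == 0) stop("MAD is zero; the test statistic is undefined")
  x[abs(x - M) / MAD > Max]          # Step 4: keep values whose ratio exceeds Max
}

scores <- c(3, 5, 6, 7, 9, 10, 40)   # made-up data
mad_outliers(scores)                 # returns 40
```

Here M = 7 and MAD = 2, so the score 40 gives a ratio of 33 / 2 = 16.5, which exceeds 5; every other score has a ratio of 2 or less.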

**Example**

Today’s data is from my 2013 music. I have the lengths (in seconds) of all n = 365 songs from that year, and I want to determine which values are outliers.

Computations:

M = 226

MAD = 36
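With these two values and the recommended Max = 5 from Step 3, the outlier condition reduces to a simple cutoff on song length. A worked version of the arithmetic:

$$\frac{|X - 226|}{36} > 5 \iff |X - 226| > 180 \iff X > 406 \ \text{ or } \ X < 46$$

So any song longer than 406 seconds, or shorter than 46 seconds, is flagged.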

With Max = 5, I found that the songs with the following lengths are outliers:

891, 564, 636, 516, 580, 534, 597, 574, 537, 595, 486

This was done using R; the code is below.

**Example in R**

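x = read.table('clipboard', header=T)[[1]] # data; assumes the lengths are in the first column
M = median(x) # Step 1: median
absdev = abs(x - M) # Step 2a: absolute deviations from the median
MAD = median(absdev) # Step 2c: median absolute deviation
Max = 5 # Step 3: cutoff
for (i in 1:length(x)) { # Step 4: test each x value in turn
  dev = abs(x[i] - M) / MAD
  if (dev > Max) { print(x[i]) } # print the value if it is an outlier
}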
