I am trying to calculate the mean absolute deviation of a sample ("s") of numbers. The result I get from the mad() function differs from the result I get when I work through the mean absolute deviation calculation one step at a time. Why?

 s<- c(100,110,114,121,130,130,160)

Using the mad() function, I get:

> mad(s)
[1] 13.3434

When breaking down the formula and doing the same operation one step at a time, I get:

> sum(abs(s-mean(s)))/length(s)
[1] 14.08163

Why do these results differ?

Am I making an error when entering my formula? (This would not be surprising; I am just starting to learn R.) What is wrong with my formula?

Or is the formula that R uses to calculate the mean absolute deviation different from the following one, given on Wikipedia?

MAD = (sum of (absolute values of (each value minus average value for sample))) divided by (the number of values in the sample)

starball

3 Answers

"MAD" is unfortunately a term with multiple meanings: mean absolute deviation from the mean (sometimes just called the MD or mean deviation), median absolute deviation from the median, mean absolute deviation from the median (which arises when estimating the scale of a Laplace distribution), etc. Wikipedia, while often useful, is not the arbiter of usage; it can sometimes be a little idiosyncratic in its use of terms (that's not particularly a criticism of Wikipedia; it's partly inherent in the nature of the thing). [Personally, in the absence of further clues, I'd usually interpret MAD as median absolute deviation from the median, and I'd expect mean absolute deviation from the mean, if not written in full, to be written as either "mean deviation"/"MD" or "mean absolute deviation".]

The question of which R is computing is resolved by the simple expedient of ?mad:

 mad {stats}    R Documentation

 Median Absolute Deviation

 Description

 Compute the median absolute deviation, i.e., the (lo-/hi-) median of the 
 absolute deviations from the median, and (by default) adjust by a factor 
 for asymptotically normal consistency.

Just as a general suggestion, when using a function for the first time, don't assume you know what it's doing. For example, before I read the help for mad for the first time, I wouldn't have expected it to multiply by that constant by default. (I think that's a bad idea, since it means that by default it doesn't actually compute anything called MAD, but instead a robust estimate of σ for a population where the uncontaminated part is Gaussian. But that's how it works.)
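To make the effect of that default constant concrete, here is a minimal sketch using the sample from the question. The name raw_mad is only illustrative; the relation to 1/qnorm(3/4) is the one given in the mad help page.

```r
s <- c(100, 110, 114, 121, 130, 130, 160)

# Unscaled median absolute deviation from the median
raw_mad <- median(abs(s - median(s)))
raw_mad            # 9

# By default mad() multiplies this by constant = 1.4826
1.4826 * raw_mad   # 13.3434, the value mad(s) returns
mad(s)

# The default constant is approximately 1 / qnorm(3/4)
1 / qnorm(3/4)
```

So the 13.3434 in the question is the unscaled value 9 multiplied by 1.4826, not a mean of absolute deviations.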

Most functions will do what you think they do, but a few may surprise you. Check the definitions in the help, look at how the inputs and outputs are defined, and try the examples.

Incidentally, if you want the median (absolute) deviation from the mean, you can get it with mad(x, mean(x), 1), i.e. with center = mean(x) and constant = 1. But if you want the mean deviation from the mean, I don't know if there's anything simpler to write than mean(abs(x - mean(x))); it has at least the advantage of being utterly explicit.
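As a small sketch of those two options on the sample from the question: the helper mean_abs_dev below is a hypothetical convenience wrapper, not a base R function.

```r
s <- c(100, 110, 114, 121, 130, 130, 160)

# Hypothetical helper (not part of base R):
# mean absolute deviation from the mean
mean_abs_dev <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  mean(abs(x - mean(x)))
}

mean_abs_dev(s)      # 14.08163, matches the step-by-step result in the question

# Median absolute deviation from the mean, with no scaling constant
mad(s, mean(s), 1)   # 9.571429
```

Wrapping the one-liner in a function is optional; it only gives the quantity an unambiguous name.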

Glen_b
  • Thank you for your answer! I was trying to arrive at the mean deviation from the mean (for a sample/vector (X<-), and didn't realize that mad(X) – Larix.laricina Jun 28 '15 at 02:49

As @Glen_b suggested, mad does more than apply a formula: it includes a "correction" for consistency with normality.

Look at the examples:

#with mad
mad(s)
mad(s,center= mean(s))

# using formulas
sum(abs(s-median(s)))/length(s)
sum(abs(s-mean(s)))/length(s)

> mad(s)
[1] 13.3434
> mad(s,center= mean(s))
[1] 14.1906
> 
> sum(abs(s-median(s)))/length(s)  
[1] 13.71429
> sum(abs(s-mean(s)))/length(s)
[1] 14.08163
Robert
  • Thank you! I didn't realize that the mad() did anything beyond what the basic mean average deviation formula calls for. You and glen_b have completely answered my question. – Larix.laricina Jun 28 '15 at 03:20

As an extra, if you want the plain median absolute deviation from the median, without the scaling constant, type

mad(s,constant=1)