Outlier detection in small sets

Question

Is there a good algorithm for detecting outliers in small sets of decimal numbers? The best idea I have come up with so far is a kind of recursive, standard-deviation-based approach, but it seems computationally expensive.

I'm using C++, so any existing functionality in, say, Boost or other maths helper libraries is welcome in your answers.

Thanks.

it seems you got the wrong stack* site... are you looking for math...? http://math.stackexchange.com/ — elcuco, Dec 04 '13 at 21:04
@elcuco I think it's on topic for SO, since the OP mentioned computational efficiency. — ApproachingDarknessFish, Dec 04 '13 at 21:07
just how "small" are these sets? 1/5/10 - which one's the outlier? — Marc B, Dec 04 '13 at 21:08
@ValekHalfHeart while I think that this is a great question... I do think that he will get better answers in a dedicated site with math people. — elcuco, Dec 04 '13 at 21:08
According to WIKI "There is no rigid mathematical definition of what constitutes an outlier; determining whether or not an observation is an outlier is ultimately a subjective exercise." So you probably need to define criteria and then ask for implementation. — Slava, Dec 04 '13 at 21:12
@elcuco http://stats.stackexchange.com/ would be a good site also. — Geobits, Dec 04 '13 at 21:12
You can do it in O(n) time with an online variance algorithm (http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online_algorithm) and then a second pass to mark outliers. — IdeaHat, Dec 04 '13 at 21:13

score 1 · Answer 1 · answered Dec 05 '13 at 19:32

You can do it in O(n) time: compute the mean and variance in one pass with an online variance algorithm (http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online_algorithm), and then make a second pass to mark outliers.
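A minimal sketch of that two-pass idea in C++, assuming "outlier" means a value more than `k` sample standard deviations from the mean. The name `find_outliers`, the threshold `k`, and its default of 2.0 are assumptions for illustration and do not come from the answer. The first pass uses the Welford update described on the linked Wikipedia page.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the answer): returns the indices of values
// lying more than k sample standard deviations away from the mean.
std::vector<std::size_t> find_outliers(const std::vector<double>& data,
                                       double k = 2.0)
{
    // Pass 1: Welford's online update of the running mean and of m2,
    // the running sum of squared deviations from the mean.
    double mean = 0.0;
    double m2 = 0.0;
    std::size_t n = 0;
    for (double x : data) {
        ++n;
        const double delta = x - mean;
        mean += delta / static_cast<double>(n);
        m2 += delta * (x - mean);
    }

    std::vector<std::size_t> outliers;
    if (n < 2) {
        return outliers;  // the sample variance is undefined for fewer than two values
    }
    const double stddev = std::sqrt(m2 / static_cast<double>(n - 1));

    // Pass 2: mark every value farther than k standard deviations from the mean.
    for (std::size_t i = 0; i < n; ++i) {
        if (std::fabs(data[i] - mean) > k * stddev) {
            outliers.push_back(i);
        }
    }
    return outliers;
}
```

For example, calling `find_outliers` on `{10, 11, 9, 10, 12, 10, 11, 9, 10, 50}` with `k = 2.0` flags only index 9 (the value 50). One caveat for very small sets: with the sample standard deviation, no value can lie more than (n - 1) / sqrt(n) standard deviations from the mean, so with `k = 2.0` nothing is ever flagged when n is 5 or less. A smaller `k`, or a criterion based on the median, may suit such sets better.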

IdeaHat
