Is there a function to detect individual outliers in longitudinal data in R?

Question

I have a dataset of 5,000 records, each consisting of a series of continuous measurements collected at various times over a decade. Each measurement was originally entered manually and, as might be expected, there are a number of errors that need to be corrected.

Typically, an incorrect value differs by more than 50% from the neighboring points, while correct values change by at most 10% between consecutive points. If I plot a single record with time on the x-axis, these errors are very obvious.

It's not feasible to graph each of the records individually, so I'm trying to figure out whether there's a way to automatically flag the values that are obviously in error and need to be corrected or removed.

Does anyone have any experience with a problem like this?

If you need recommendations for statistical methods to identify outliers, you should ask over at [stats.se]. This isn't a very specific programming question that's appropriate for Stack Overflow. — MrFlick, Jun 21 '17 at 00:58
I should clarify that the "outliers" are not real, they are incorrectly entered data... I need a way to quickly visualize each of the records or automate determining which records have incorrect data... I agree the question is not very specific... I will try to revise and make more specific - thanks for your comment — Vance L Albaugh, Jun 21 '17 at 01:01
You could try some type of dummy variable with `dplyr::mutate()` and a logical condition, using `case_when()` or `if_else()`. So, if the value is above a certain threshold, this variable will be 1, let's say, otherwise 0. Then remove the 1s with `filter()`, assuming you want to take them out. — RobertMyles, Jun 21 '17 at 01:02
No function, but this answer might help you, https://stackoverflow.com/questions/21947091/how-to-winsorize-or-remove-univariate-outliers-in-a-longitudinal-dataset — AMS, Jul 22 '19 at 02:15
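
The dummy-variable idea from the comments could be sketched roughly as follows. This is only an illustration under stated assumptions: the data are taken to be in long format with hypothetical columns `id`, `time`, and `value` (none of these names come from the question), the toy values are invented, and the 50% threshold is the one described in the question. A point is flagged only when it jumps away from both of its neighbors, so a genuine level shift is less likely to be marked.

```r
library(dplyr)

# Hypothetical long-format data: one row per measurement.
# Column names and values are assumptions made for illustration only.
measurements <- tibble(
  id    = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
  time  = c(1, 2, 3, 4, 5, 1, 2, 3, 4),
  value = c(100, 104, 210, 108, 110, 50, 52, 51, 53)
)

threshold <- 0.5  # relative change above 50% is treated as suspicious

flagged <- measurements %>%
  group_by(id) %>%
  arrange(time, .by_group = TRUE) %>%
  mutate(
    # relative change versus the previous and the next point in the same record
    big_prev = abs(value - lag(value)) / abs(lag(value)) > threshold,
    big_next = abs(value - lead(value)) / abs(lead(value)) > threshold,
    # interior points must differ from both neighbors; at the ends of a
    # record only one neighbor exists, so that single comparison decides
    suspect = if_else(
      coalesce(big_prev, big_next, FALSE) & coalesce(big_next, big_prev, FALSE),
      1L, 0L
    )
  ) %>%
  ungroup()

# Rows to review by hand
to_review <- filter(flagged, suspect == 1L)

# Data with the flagged rows dropped
cleaned <- flagged %>%
  filter(suspect == 0L) %>%
  select(id, time, value)
```

Reviewing `to_review` before dropping anything is probably safer than filtering blindly, since two adjacent bad entries, or a record with only two points, would not be handled well by this simple rule.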


0 Answers