My sample is a panel data set consisting of several variables and several time periods. I detect and treat outliers with the IQR method. That is, outliers are observations above the third quartile plus 1.5 times the IQR, or below the first quartile minus 1.5 times the IQR. This rule can be visualised in boxplots.
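
For concreteness, the detection rule can be sketched as follows. This is only an illustration on simulated data: the data frame `panel` and its columns `year` and `value` are assumed names, not taken from the actual sample. Note that base R's `boxplot` computes hinges via `fivenum`, which can differ slightly from the `quantile` values used here.

```r
# Hypothetical toy panel; `year` and `value` are assumed column names.
set.seed(1)
panel <- data.frame(
  year  = rep(2008:2011, each = 50),
  value = rnorm(200)
)

# Quartiles and IQR over the whole sample (all periods pooled).
q     <- quantile(panel$value, probs = c(0.25, 0.75), names = FALSE)
iqr   <- q[2] - q[1]
lower <- q[1] - 1.5 * iqr   # lower fence
upper <- q[2] + 1.5 * iqr   # upper fence

# Flag observations that fall outside the pooled fences.
panel$outlier <- panel$value < lower | panel$value > upper

# Number of flagged observations per year.
table(panel$year, panel$outlier)
```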

For conceptual reasons, the whole sample is taken into account when detecting and treating outliers: the quartiles and the IQR of a variable are based on the total time series. When visualising outliers with the simple boxplot function, however, the quartiles and IQR of each period are plotted (see attached plot). I would like to create a plot that correctly illustrates my outlier detection and treatment method, i.e., the median, boxes, and whiskers should be constant over time. I don't want to summarise the data into one plot, because then the year to which an outlier belongs is no longer observable.

[Image: attached plot, boxplots computed separately for each period]

I guess I have to create the boxplot with ggplot2?
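
One possible direction, sketched under assumptions rather than offered as a definitive solution: compute the box statistics once from the pooled data, repeat them for every year, and draw them with `geom_boxplot(stat = "identity")`, then overlay each year's own outliers with `geom_point`. The data frame `panel`, its columns `year` and `value`, and the two injected extreme values are hypothetical.

```r
library(ggplot2)

# Hypothetical toy panel with two artificial outliers (2008 and 2010).
set.seed(1)
panel <- data.frame(year = rep(2008:2011, each = 50), value = rnorm(200))
panel$value[c(10, 120)] <- c(6, -5)

# Pooled quartiles and fences (whole sample, all years).
q       <- quantile(panel$value, probs = c(0.25, 0.5, 0.75), names = FALSE)
iqr     <- q[3] - q[1]
lower_f <- q[1] - 1.5 * iqr
upper_f <- q[3] + 1.5 * iqr
inside  <- panel$value >= lower_f & panel$value <= upper_f

# One identical set of box statistics per year; whiskers end at the
# most extreme pooled observations that lie within the fences.
box <- data.frame(
  year   = factor(sort(unique(panel$year))),
  ymin   = min(panel$value[inside]),
  lower  = q[1],
  middle = q[2],
  upper  = q[3],
  ymax   = max(panel$value[inside])
)

# Outliers keep their year, so they are drawn above/below the right box.
outliers      <- panel[!inside, ]
outliers$year <- factor(outliers$year, levels = levels(box$year))

p <- ggplot(box, aes(x = year)) +
  geom_boxplot(
    aes(ymin = ymin, lower = lower, middle = middle,
        upper = upper, ymax = ymax),
    stat = "identity"
  ) +
  geom_point(data = outliers, aes(y = value)) +
  labs(x = "Year", y = "Value")

print(p)
```

With `stat = "identity"`, ggplot2 draws exactly the supplied statistics instead of recomputing them per group, so the boxes are the same in every year and only the points differ.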

Mimi
  • Does this "the median, boxes, and whiskers should be constant over time" mean that you want one box plot replicated and plotted by the number of years? – Vitali Avagyan Sep 21 '19 at 19:27
  • Yes, I want one boxplot replicated and plotted once per year. The only difference from one year to another is the outliers. In the provided figure, I would expect the boxes to look approximately like the box in 2011. Then, for each year, the outliers at the top remain as they are, and in 2009 an additional outlier is displayed. Thank you! :) – Mimi Sep 23 '19 at 07:30

0 Answers