I am comparing several values in R: 8 variables, each stored as a vector of length 1000. Together they form a 1000 × 8 matrix whose 8 columns represent the 8 variables.

Then I call `boxplot(test)`.

The result shows that the mean values of the 8 variables are very close to each other, which makes comparison and interpretation very hard. Can I include all the outliers in my plot, so that the whole range is easier to compare? Or are there other suggestions for distinguishing these variables?

GeekCat
  • It is hard to understand exactly what you are asking. Boxplots will by default plot outliers, so what are you lacking? What are the variables you are trying to compare? – Stephan Kolassa Mar 14 '14 at 21:01
  • I haven't got enough reputation to post a picture, but the variables are the same, drawn from different simulations with different parameters. Can I simply include all the outliers? I don't know how to change the default. – GeekCat Mar 14 '14 at 21:16
  • Sorry, but that doesn't really help matters. `boxplot()` will already plot all your outliers, so what exactly is the problem? And how do you want to compare your variables? – Stephan Kolassa Mar 14 '14 at 21:19
  • https://drive.google.com/file/d/0B1HubY3HsuxaeHZHRi1Ud0V4dm8/edit?usp=sharing @StephanKolassa Would you mind looking at the picture in the link? It should make the problem clear: how do I compare and interpret the 8 variables? – GeekCat Mar 14 '14 at 21:23
  • The plot is broken, can you please re-upload? – Stephan Kolassa Mar 14 '14 at 21:27
  • https://drive.google.com/file/d/0B1HubY3HsuxaeHZHRi1Ud0V4dm8/edit?usp=sharing @StephanKolassa Can you see it now? It is on Google Drive. – GeekCat Mar 14 '14 at 21:31

1 Answer

Here is the boxplot in question (since the OP doesn't have the rep to post pictures):

[Image: boxplots]

It looks like the medians (and likely also the means) are pretty much identical, but the variances differ between the eight categories, with category 1 having the lowest and 8 the highest variance. Depending on the real question involved, these two pieces of information (similar median/mean, different variance) may already be enough.
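
To put numbers on what the plot suggests, the per-column location and spread can be tabulated directly. This is only a sketch: the matrix `test` below is simulated as a hypothetical stand-in for the OP's data (same mean, increasing standard deviation), and the column names `V1` to `V8` are made up for illustration.

```r
# Hypothetical stand-in for the OP's 1000 x 8 matrix `test`:
# eight normal samples with equal means and increasing spread.
set.seed(1)
test <- sapply(1:8, function(s) rnorm(1000, mean = 0, sd = s))
colnames(test) <- paste0("V", 1:8)

# One row per variable: two measures of location, two of spread.
summary_table <- data.frame(
  median = apply(test, 2, median),
  mean   = colMeans(test),
  sd     = apply(test, 2, sd),
  iqr    = apply(test, 2, IQR)
)
print(round(summary_table, 3))
```

If the medians and means agree closely across rows while `sd` and `iqr` grow, the table tells the same story as the boxplots, just in a form that is easier to quote.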

If you want a formal significance test of whether the variances are equal, you can use Hartley's or Bartlett's test. If you want to formally test equality of means with unequal variances (so ANOVA is not appropriate), look here.
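
As a rough sketch of how such tests might be run in base R: `bartlett.test()` accepts a list of samples, so the matrix columns can be passed as a data frame. For the means, the example uses Welch's one-way test via `oneway.test(..., var.equal = FALSE)`; that choice is my assumption about a suitable method and may not be the one the link above points to. Hartley's test is not shown because it is not part of base R. The data are again simulated as a hypothetical stand-in for `test`.

```r
# Hypothetical stand-in for the OP's 1000 x 8 matrix `test`.
set.seed(1)
test <- sapply(1:8, function(s) rnorm(1000, mean = 0, sd = s))
colnames(test) <- paste0("V", 1:8)

# Bartlett's test of equal variances across the 8 columns.
# A data frame is a list of columns, which bartlett.test() accepts.
bt <- bartlett.test(as.data.frame(test))
print(bt)

# Welch's one-way test of equal means, not assuming equal variances.
# stack() reshapes to long format with columns `values` and `ind`.
long <- stack(as.data.frame(test))
wt <- oneway.test(values ~ ind, data = long, var.equal = FALSE)
print(wt)
```

Keep in mind that Bartlett's test is sensitive to departures from normality, so with heavy-tailed simulation output a small p-value may partly reflect the tails rather than the variances alone.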

Stephan Kolassa
  • Kolassa, thanks for your answer, but do you see the outliers? Can the boxplot include all the outliers? Then the mean and median would possibly be different? – GeekCat Mar 14 '14 at 22:04
  • Oh, I think I am starting to understand what you mean. You want the boxes to extend over the entire range of your data? You can do that using `rect()`, with the `ybottom` and `ytop` parameters set to the min and max of each column, instead of `boxplot()`, adding horizontal `lines()` for the median. But I don't see how that would be more helpful than the boxplots. And of course the mean & median wouldn't change if you only *plot* your data differently. – Stephan Kolassa Mar 15 '14 at 08:14
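
A minimal sketch of the `rect()` idea from the last comment, using simulated data as a hypothetical stand-in for `test`; the box half-width of 0.3 and the axis labels are arbitrary choices. The final call shows a shortcut that may be closer to what was asked: `boxplot()` has a `range` argument, and `range = 0` makes the whiskers extend to the data extremes, so no points are drawn separately as outliers.

```r
# Hypothetical stand-in for the OP's 1000 x 8 matrix `test`.
set.seed(1)
test <- sapply(1:8, function(s) rnorm(1000, mean = 0, sd = s))

k   <- ncol(test)
lo  <- apply(test, 2, min)     # bottom of each box
hi  <- apply(test, 2, max)     # top of each box
med <- apply(test, 2, median)  # horizontal line inside each box

# Empty plotting region, then one min-to-max rectangle per column.
plot(NA, xlim = c(0.5, k + 0.5), ylim = range(test),
     xlab = "Variable", ylab = "Value", xaxt = "n")
axis(1, at = 1:k)
rect(xleft = 1:k - 0.3, ybottom = lo, xright = 1:k + 0.3, ytop = hi)
for (j in 1:k) {
  lines(c(j - 0.3, j + 0.3), rep(med[j], 2), lwd = 2)
}

# Alternative: keep the usual boxes but stretch the whiskers to the extremes.
bp <- boxplot(test, range = 0)
```

Either way, only the drawing changes; the medians and means computed from `test` are the same as before.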