
I am working on a data distribution that has the following values.

input<-read.table("infile",header=TRUE,sep="\t")

table(input)

0.786333        1  1.04453  1.06159  1.33277  1.53607  2.25893 
  49      938        1        1       36       16      166

If I draw a box plot of it, I get a single line for the lowest datum, the highest datum and the median.

boxplot(input)

[Figure: boxplot of input]

Is there any way to spread the points out by normalization so that I get a better boxplot, with distinct boundaries for the lowest datum, the highest datum and the median?

Manish
  • A boxplot is completely the wrong approach given your data, which have two clear peaks and only a few other values. What do you really want to show with your figure? – Jouni Helske Mar 08 '13 at 05:29
  • I'm planning to do this: http://stackoverflow.com/questions/13927473/how-to-plot-bar-plot-in-parallel-to-horizontal-to-box-plot-with-fraction-of-area – Manish Mar 08 '13 at 06:43
  • Well, as you can see from your figure, your data are not suited to that kind of plot. – Jouni Helske Mar 08 '13 at 07:40

1 Answer


You clearly have a bimodal distribution; I don't think a boxplot is a useful summary here.
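To see why the box collapses to a single line, it may help to look at the quartiles directly. This is only a sketch: it assumes the data can be rebuilt from the counts printed by table(input) in the question, and the names vals, counts and q are made up for the illustration.

```r
# Rebuild the data from the frequency table in the question
# (assumption: these seven values and counts are the whole data set).
vals   <- c(0.786333, 1, 1.04453, 1.06159, 1.33277, 1.53607, 2.25893)
counts <- c(49, 938, 1, 1, 36, 16, 166)
zz <- rep(vals, times = counts)

# Lower quartile, median and upper quartile
q <- quantile(zz, probs = c(0.25, 0.5, 0.75))
print(q)

# Interquartile range: the height of the box in a boxplot
print(IQR(zz))
```

Because 938 of the 1207 observations equal 1, all three quartiles fall on that value, so the box has zero height and the remaining points end up outside it.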

A density plot is more useful

plot(density(zz))

[Figure: density plot of zz]
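The vector zz is not defined above. A minimal sketch, assuming zz simply holds the values from the question's frequency table (vals and counts are illustrative names, not from the original post):

```r
# Assumption: zz is the question's data, rebuilt from the table(input) output.
vals   <- c(0.786333, 1, 1.04453, 1.06159, 1.33277, 1.53607, 2.25893)
counts <- c(49, 938, 1, 1, 36, 16, 166)
zz <- rep(vals, times = counts)

# Kernel density estimate with the default Gaussian kernel and bandwidth
d <- density(zz)
plot(d, main = "Kernel density of zz")

# Mark the observed values along the x axis
rug(unique(zz))
```

With so few distinct values the shape of the curve depends a lot on the bandwidth, so it may be worth trying the adjust argument of density() as well.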

You could also consider a violin plot which is a bit of a mix between a kernel density plot and boxplot.

Using the vioplot package

 library(vioplot)
 vioplot(zz)
mnel
  • But I need to classify them into categories, like the 25%, 50% and 25% we have for a boxplot. Can I do the same with a density plot? – Manish Mar 08 '13 at 03:55
  • @user15662 you only have 7 unique values out of 1207, so use those to create meaningful categories. That being said, the density plot clearly shows you could use a cutoff near 1.5 for two categories. – mnel Mar 08 '13 at 04:01
  • Actually I need to plot a barplot parallel to the boxplot with something like (high, medium, low), e.g. zz<-rnorm(1:1000); boxplot(zz), where the box width (50%) is medium and the two sides are low (25%) and high (25%). I don't think I can do this with vioplot. – Manish Mar 08 '13 at 04:07
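The cutoff idea from the comments could be sketched roughly as follows. It assumes the data are rebuilt from the question's frequency table, and the labels "low" and "high" and the names vals, counts and grp are only illustrative.

```r
# Assumption: zz is the question's data, rebuilt from the table(input) output.
vals   <- c(0.786333, 1, 1.04453, 1.06159, 1.33277, 1.53607, 2.25893)
counts <- c(49, 938, 1, 1, 36, 16, 166)
zz <- rep(vals, times = counts)

# Two categories split at the suggested cutoff of 1.5
grp <- cut(zz, breaks = c(-Inf, 1.5, Inf), labels = c("low", "high"))

# Number of observations in each category
print(table(grp))

# The same counts drawn as a bar plot
barplot(table(grp), main = "Observations per category")
```

A three-way split into low, medium and high would work the same way with one more break point, although where to put it is a judgment call with only seven distinct values.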