I have a data set of 15,000 real values ranging from 0 to 100. My data set is HEAVILY skewed to the left. I'm trying to get the following bins: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, >10. What I have done so far is create the following:

  breakvector = c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100) 

and have run:

  hist(datavector, breaks=breakvector, xlim=c(0, 13))

However, this seems to produce a histogram in which data greater than 13 aren't included. Does anyone have an idea of how to get R to put all the rest of the data in the last bin? Thanks in advance.

MrFlick
mt88

1 Answer

How about this:

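# Example data: 40 values in 1..9 and 20 values in 10..100
datavector <- c(sample(1:9, 40, replace=TRUE), sample(10:100, 20, replace=TRUE))
breakvector <- 0:11
# Recode every value above 10 as 11 so it falls in the last bin
hist(ifelse(datavector>10, 11, datavector), breaks=breakvector, xlim=c(0, 13), xaxt="n")
# Label each bin at its midpoint
axis(1, at=1:11-.5, labels=c(1:10, ">10"))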

Rather than adjusting the breaks, I just put all the values >10 into a bin for 11. Then I update the axis labels accordingly.
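For what it's worth, the values above 13 in the original attempt are probably still being binned. With unequal break widths, hist() plots densities rather than counts, and the last bar spans 10 to 100, so xlim=c(0, 13) shows only a thin slice of a wide, low bar. Below is a minimal sketch of the same recoding idea using pmin() instead of ifelse(), with a check that the last bin really holds every value above 10. The sample data, the seed, and the names capped and h are made up for illustration.

```r
# Hypothetical sample data standing in for the real `datavector`
set.seed(1)
datavector <- c(runif(40, 0, 10), runif(20, 10, 100))

# Cap every value at 11 so anything above 10 lands in the final (10, 11] bin
capped <- pmin(datavector, 11)

# Equal-width breaks, so hist() plots counts; suppress the default x axis
h <- hist(capped, breaks = 0:11, xaxt = "n",
          main = "Values above 10 pooled into the last bin", xlab = "value")

# Label each bin at its midpoint, with ">10" for the pooled bin
axis(1, at = 1:11 - 0.5, labels = c(1:10, ">10"))

# The last bin's count should equal the number of values above 10
stopifnot(h$counts[11] == sum(datavector > 10))
```

Because hist() uses right-closed intervals by default, a value of exactly 10 stays in the (9, 10] bin, and only values strictly above 10 end up in the last one.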

MrFlick
  • This worked perfectly. Thank you very much. I didn't know you could use an ifelse statement like that. – mt88 May 29 '14 at 23:26
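On the ifelse point: ifelse() is vectorized, so it tests every element of the vector and picks the replacement element by element. A tiny sketch with a made-up vector x:

```r
# Hypothetical values, just to show the element-wise behaviour
x <- c(3, 10, 11.5, 42)

# Elements above 10 are replaced by 11; the others are kept as they are
recoded <- ifelse(x > 10, 11, x)
print(recoded)  # 3 10 11 11
```

Only elements for which the test is TRUE get replaced, which is why the values of 10 or less pass through unchanged.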