
I use cut and classIntervals to group data in R, which I later plot with ggplot2. A basic operation, cutting by quantiles with n=3, would look like this:

library(classInt)

a<-c(1,10,100,1000,100000,1000000)
b<-cut(a, 
breaks=data.frame(
  classIntervals(
    a,n=3,method="quantile")[2])[,1],
include.lowest=T)

where b would be:

[1] [1,70]          [1,70]          (70,3.4e+04]    (70,3.4e+04]    (3.4e+04,1e+06] (3.4e+04,1e+06]
Levels: [1,70] (70,3.4e+04] (3.4e+04,1e+06]

The first line of this output is the vector of grouped data, which I can use in ggplot2. But rather than having the labels in scientific notation, I would like them to be [1,70] (70,34000] (34000,1000000].

How can I achieve that? Any help would be appreciated, including methods other than cut and classInt that achieve the same result.

Joschi
    If anybody uses similar functions to group data, feel free to check out `cut2` from the `Hmisc` package, which actually does the cutting better than my function described above. See also: https://stat.ethz.ch/pipermail/r-help/2007-December/148468.html. In this case, use `digits=10` to avoid scientific notation. – Joschi Mar 19 '13 at 11:58

1 Answer


Use the dig.lab argument of the cut function:

library(classInt)

a<-c(1,10,100,1000,100000,1000000)
b<-cut(a, 
breaks=data.frame(
  classIntervals(
    a,n=3,method="quantile")[2])[,1],
include.lowest=T,dig.lab=10) ## dig.lab: number of digits used to format the break labels (default is 3)
b
[1] [1,70]          [1,70]          (70,34000]      (70,34000]     
[5] (34000,1000000] (34000,1000000]
Levels: [1,70] (70,34000] (34000,1000000]
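
To see the effect of dig.lab in isolation, here is a minimal sketch that compares the default labels with the dig.lab=10 labels. It assumes the breaks can be taken from base R's quantile() instead of classIntervals; for this particular data that gives the same break values as shown above (1, 70, 34000, 1e+06), but that equivalence is an assumption of the sketch, not something guaranteed for other data or other classInt methods. The names brks, b_default and b_plain are made up for illustration.

```r
# Same data as in the question
a <- c(1, 10, 100, 1000, 100000, 1000000)

# Tercile breaks from base R (assumption: stands in for
# classIntervals(a, n = 3, method = "quantile") on this data)
brks <- quantile(a, probs = (0:3) / 3, names = FALSE)

# Default dig.lab = 3: breaks that need more than 3 significant digits
# can end up in scientific notation
b_default <- cut(a, breaks = brks, include.lowest = TRUE)

# dig.lab = 10: up to 10 significant digits are used for the labels,
# so these breaks are written out in full
b_plain <- cut(a, breaks = brks, include.lowest = TRUE, dig.lab = 10)

levels(b_default)
levels(b_plain)
```

Only the labels change between the two calls; the assignment of values to groups is the same.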
Jouni Helske
  • @Jouni Helske -- What would you propose if number is something like 10^-17? – novice Jul 19 '16 at 18:53
  • I'm in a somewhat similar situation as the OP. I'm running different quantities through a function that uses cut, and I occasionally see scientific notation in the labels, even for quantities that are integer. If I never want to see scientific notation, regardless of the quantities involved, is it safe for me to just set `dig.lab=50` (the maximum value allowed)? Thx! – sparc_spread Sep 17 '19 at 14:31
  • why is the input to `breaks` wrapped in a `data.frame`? `b<-cut(a, breaks= classIntervals(a,n=3,method="quantile")[[2]], include.lowest=T,dig.lab=10)` would do. – Ratnanil Mar 09 '22 at 06:30
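
A possible alternative for the cases raised in the comments (very small breaks, or never wanting scientific notation at all) is to build the labels yourself and pass them to cut through its labels argument. The following is only a sketch: cut_fixed is a hypothetical helper, not part of base R or classInt, and it assumes right-closed intervals with the lowest value included, as in the code above.

```r
# Hypothetical helper (not from any package): builds interval labels in
# fixed notation and hands them to cut() via `labels`
cut_fixed <- function(x, breaks) {
  breaks <- sort(unique(breaks))
  # Format each break separately so the strings are not padded to a
  # common width; scientific = FALSE requests fixed notation
  txt <- vapply(breaks, format, character(1),
                scientific = FALSE, trim = TRUE)
  n <- length(txt)
  # Right-closed intervals "(lo,hi]"; the first one is closed on both
  # sides to match include.lowest = TRUE
  labs <- paste0("(", txt[-n], ",", txt[-1], "]")
  labs[1] <- sub("(", "[", labs[1], fixed = TRUE)
  cut(x, breaks = breaks, labels = labs, include.lowest = TRUE)
}

a <- c(1, 10, 100, 1000, 100000, 1000000)
b <- cut_fixed(a, breaks = c(1, 70, 34000, 1000000))
levels(b)
# [1] "[1,70]"          "(70,34000]"      "(34000,1000000]"
```

Note that format() still rounds to its default number of significant digits (7), so breaks that differ only beyond that precision would need a larger digits value passed to format().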