
Given my data

ID     Income
<dbl>  <dbl>
 1         0
 2      2000
 3      5000
 4      2500
 5      5000
 6      2000
 7      1000
 8     16000
 9         0
10      1000
...

There are about 100 rows; this is a small sample. I tried to build a frequency table in order to plot a histogram. Based on previous questions on this website, my code is

options(scipen = 999)
Htable <- table(cut(df$Income, breaks=c(0,1,500,1000,2000,3000,5000,Inf),include.lowest=T))

But it returns

       [0,1]       (1,500]   (500,1e+03] (1e+03,2e+03] (2e+03,3e+03] 
           11             6            29            42             12 
(3e+03,5e+03]   (5e+03,Inf] 
            6             1

Do you know why this happens, and how I can change the labels to standard notation?

Thank you very much!

xxx
  • FYI - the "e" in 5e+03 is not the Euler number. It stands for exponent. See https://en.wikipedia.org/wiki/Scientific_notation#E_notation – dww Jul 19 '21 at 02:58

1 Answer


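This is not related to scipen. cut has its own argument, dig.lab, which sets the number of digits used when the break values are formatted into labels. It is 3 by default, so any break that needs more than 3 digits is written in scientific notation. We could increase it: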

table(cut(df$Income, breaks=c(0,1,500,1000,2000,3000,5000,Inf),
       include.lowest=TRUE, dig.lab = 6))

-output

      [0,1]     (1,500]  (500,1000] (1000,2000] (2000,3000] (3000,5000]  (5000,Inf] 
          2           0           2           2           1           2           1 

-compared with the OP's option

> table(cut(df$Income, breaks=c(0,1,500,1000,2000,3000,5000,Inf),include.lowest=T))

        [0,1]       (1,500]   (500,1e+03] (1e+03,2e+03] (2e+03,3e+03] (3e+03,5e+03]   (5e+03,Inf] 
            2             0             2             2             1             2             1 
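To see where the labels come from, here is a minimal sketch. It assumes that cut builds its labels by formatting the breaks roughly like formatC(breaks, digits = dig.lab, width = 1); that matches the output above, but treat the exact internal call as an assumption rather than a documented guarantee.

```r
# Breaks taken from the question
breaks <- c(0, 1, 500, 1000, 2000, 3000, 5000, Inf)

# Default dig.lab = 3: at most 3 significant digits, so 1000 and up
# are written in scientific notation
lab3 <- formatC(breaks, digits = 3, width = 1)
print(lab3)

# dig.lab = 6: enough digits to write every break in full
lab6 <- formatC(breaks, digits = 6, width = 1)
print(lab6)
```

Since the largest finite break here is 5000, any dig.lab of 4 or more should be enough for this data.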

data

df <- structure(list(ID = 1:10, Income = c(0L, 2000L, 5000L, 2500L, 
5000L, 2000L, 1000L, 16000L, 0L, 1000L)), class = "data.frame", row.names = c(NA, 
-10L))
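As an alternative sketch, the labels can be supplied explicitly through the labels argument of cut, which bypasses dig.lab altogether. The "lower-upper" label style and the names labs, bins and Htable2 are only illustrative choices, not something from the question.

```r
# Same sample data as above
df <- data.frame(ID = 1:10,
                 Income = c(0, 2000, 5000, 2500, 5000, 2000, 1000, 16000, 0, 1000))

breaks <- c(0, 1, 500, 1000, 2000, 3000, 5000, Inf)

# One label per interval: pair each break with the next one
labs <- paste(head(breaks, -1), tail(breaks, -1), sep = "-")

# Explicit labels replace the automatically generated "(a,b]" ones
bins <- cut(df$Income, breaks = breaks, include.lowest = TRUE, labels = labs)
Htable2 <- table(bins)
print(Htable2)
```

The intervals are still closed on the right, as in the original call; the custom labels just no longer show that, so dig.lab remains the simpler fix when the bracket notation is wanted.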
akrun