As a simple example, suppose we want to "bin" 1000 continuous data points into 10 bins (categories), with 100 data points in each bin:
x <- rnorm(1000, mean=0, sd=50)
# Next, let's say we want to create ten bins
# with an equal number of observations (100) in each bin:
bins <- 10
cutpoints <- quantile(x,(0:bins)/bins)
# The cutpoints variable
# holds a vector of the cutpoints used to bin the data.
# Finally we perform the binning to form the categories variable:
binned <- cut(x,cutpoints,include.lowest=TRUE)
summary(binned)
   [-152,-61]     (-61,-40]   (-40,-23.9]
          100           100           100
(-23.9,-10.2]  (-10.2,2.86]   (2.86,15.4]
          100           100           100
  (15.4,25.9]   (25.9,44.1]   (44.1,64.7]
          100           100           100
   (64.7,186]
          100
As you can see, the final summary() call gives the number of x-values in each bin (i.e., 100 values per bin).
My question:
How do you display the actual 100 x-values
inside every bin, plus the row number (or rowname) of each value in x?
What is the actual R code
to get a 3-column data frame (columns: Bin, Rowname and Values)
structured like this?
Bin          Rowname   Values
[-152,-61]   [25]      -78.2
             [28]      -82.1
             [75]      -99.7   etc.
(-61,-40]    [18]      -45.0
             [26]      -68.4   etc.
Thanks!
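
One possible way to get that layout, offered as a minimal sketch rather than a definitive answer: build a data frame with one row per observation, then sort it by bin. The names result and by_bin and the set.seed() call are assumptions added for illustration. Because x has no names here, seq_along(x) stands in for the rowname; if x were a named vector, names(x) could be used instead.

```r
# Sketch only: rebuild the example data, then list every value with its bin.
set.seed(1)  # assumption: a fixed seed so the run is reproducible
x <- rnorm(1000, mean = 0, sd = 50)
bins <- 10
cutpoints <- quantile(x, (0:bins) / bins)
binned <- cut(x, cutpoints, include.lowest = TRUE)

# One row per observation: bin label, original position in x, and the value.
result <- data.frame(
  Bin     = binned,
  Rowname = seq_along(x),
  Values  = x
)

# Sort by bin (in factor-level order), keeping the original order within a bin.
result <- result[order(result$Bin, result$Rowname), ]
rownames(result) <- NULL

# Show the first few rows of the first bin.
head(result)

# Alternative view: a list holding one vector of values per bin.
by_bin <- split(x, binned)
```

Unlike the mock-up above, the bin label is repeated on every row; that keeps the data frame easy to filter, for example result[result$Bin == levels(binned)[1], ] for the first bin.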