Questions tagged [binning]


Binning is the process of grouping data into "bins"; it is used in statistics and data analysis. For details, see Data binning on Wikipedia.

684 questions
0
votes
1 answer

How to create a for loop to calculate the Gini function for binned data in R?

I'm having some difficulties trying to calculate the Gini coefficient using binned census data, and would really appreciate any help. My data looks something like this (but with 14,000 observations of 13 variables). location <-…
Sarlo
  • 3
  • 4
0
votes
1 answer

How to bin values in the mapper?

I'm new to Hadoop MapReduce and I've recently encountered a problem with how to bin output values in the mapper. My mapper creates a Text, IntWritable output with a dataset ID as the key and the length of a metadata description as the value. My…
simtim
  • 231
  • 2
  • 14
0
votes
1 answer

How to control the cut points (while performing supervised binning) in R

I am using the 'discretization' package in R. While finding the cut points I get the following result. Command: discretization::cutPoints(data3$Dist_to_Stream, data3$Malaria_w3) where Dist_to_Stream is a variable of numeric values and…
silk_route11
  • 324
  • 3
  • 17
0
votes
2 answers

In Python, what would be a clear, efficient way to count things in regions?

I am looping over objects called events. Each event has a particular object in it. I am calculating the fraction of objects that have a particular characteristic. Imagine the approach as being something like the following: for event in events: …
d3pd
  • 7,935
  • 24
  • 76
  • 127
0
votes
1 answer

ggplot axis ticks fall at center of bin value rather than at the bin limits

I have data (x, y) for dates (x) and values (y). I split y into bins by value and am plotting one point for each bin on each day, with the size of the point proportional to the number of values that fall into each bin for that day. Each day now has 5…
epi_bio
  • 95
  • 1
  • 9
0
votes
1 answer

Binning values in a vector

I'm trying to 'bin' numbers which are beats per minute (BPM) into heart rate; the number of BPMs per time. I'm trying to keep the most similar consecutive numbers together as 1 heart rate. For example, if the BPM was x <- c(15.1, 15.2, 15.3, 20.1,…
Quigg
  • 11
  • 1
  • 2
0
votes
1 answer

Binning time series data and removing multiples within bin

Two of the many columns in SQL Server Management Studio are ID and DateTime. Some rows are repeated, and some of the datetime values have varying frequencies, i.e., for 1 ID there may be 3 repeated datetime values, and then the next two for that…
0
votes
1 answer

R code to show the actual continuous values stored in each bin?

For a simple example, to "bin" 1000 (continuous value) datapoints in 10 bins (categories), with 100 datapoints in each bin: x <- rnorm(1000, mean=0, sd=50) # Next, let's say we want to create ten bins # with equal number of observations (100), in…
user39150
  • 103
  • 8
0
votes
1 answer

Binning longitude/latitude labeled data by census block ID

I have two data sets, one for crime in Chicago, labeled with longitude and latitude coords and a shapefile of census blocks also in Chicago. Is it possible in R to aggregate crimes within census blocks, given these two files? The purpose is to be…
Patrick Williams
  • 694
  • 8
  • 22
0
votes
2 answers

Reducing the size of an array by averaging points within the array (IDL)

While I am sure there is an answer, and this question is very low-level (but it's always the easy things that trip you up), my main issue is trying to word the question. Say I have the following arrays: time=[0,1,2,3,4,5,6,7,8,9,10,11] ;in…
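For readers skimming this listing, the idea in the title (shrinking an array by averaging groups of neighbouring points) can be sketched as follows. This is a Python illustration, not IDL, and not the asker's code: the function name block_average and the group size of 3 are assumptions, since the excerpt is cut off before it says how the points should be grouped.

```python
def block_average(values, size):
    """Average consecutive groups of `size` points (hypothetical helper).

    The last group may be shorter if len(values) is not a multiple of size;
    it is averaged over the points it actually contains.
    """
    if size <= 0:
        raise ValueError("size must be a positive integer")
    averaged = []
    for start in range(0, len(values), size):
        chunk = values[start:start + size]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


# The time array from the excerpt; the group size of 3 is an assumption.
time = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
print(block_average(time, 3))  # [1.0, 4.0, 7.0, 10.0]
```

Averaging the trailing partial group over its own length is one possible choice; dropping it instead would be equally reasonable depending on the data.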
0
votes
2 answers

Python - binning

I'm generating a series of values and would like to bin them. I'd rather not use numpy or the like. Is there something more pythonic than: bins = [20,30,40] results = [0,0,0,0] for _ in range(iterations): x = somefunction() for n, bin in…
foosion
  • 7,619
  • 25
  • 65
  • 102
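One standard-library way to do this kind of counting without numpy is the bisect module. The sketch below is only an illustration, not the asker's loop (which is truncated above): the helper name count_into_bins and the sample values are assumptions, and it assumes a value equal to an edge belongs to the upper bin.

```python
from bisect import bisect_right


def count_into_bins(values, edges):
    """Count values into len(edges) + 1 bins split at the sorted `edges`.

    Hypothetical helper. A value equal to an edge is counted in the
    upper bin (an assumed convention).
    """
    counts = [0] * (len(edges) + 1)
    for x in values:
        # bisect_right returns how many edges are <= x, which is the bin index.
        counts[bisect_right(edges, x)] += 1
    return counts


bins = [20, 30, 40]                   # edges from the excerpt
sample = [5, 20, 25, 39.9, 40, 100]   # made-up values standing in for somefunction()
print(count_into_bins(sample, bins))  # [1, 2, 1, 2]
```

Using bisect_left instead would place a value equal to an edge in the lower bin.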
0
votes
1 answer

Producing a contour plot from binned data

I'm looking to produce a contour plot from data that I've binned. I have two columns: one represents the mass of a compound and the other is its Pearson correlation coefficient value. This is a small example of what I've done so far: column1 <-…
user2062207
  • 955
  • 4
  • 18
  • 34
0
votes
1 answer

Importing SQLite file into R

I'm trying to import a file back into R which contains four different tables. I basically carried out this solution to a previous question and managed to get the connection established. However, I just want to use one of the tables from the SQL file…
user2062207
  • 955
  • 4
  • 18
  • 34
0
votes
2 answers

How do I add ranges to a vector in R?

I'm trying to create a bin that looks like "<18, 18-24, 25-34, 35-44, 45-54, 55-64, 65+". I'm able to create evenly spaced ranges (25-34, 35-44...65+) but I can't figure out how to add the first two ranges (<18, 18-24). Here's the code that I…
enson
  • 1
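The unevenly spaced ranges quoted in this question can be expressed as a sorted list of lower bounds plus one label per interval. The sketch below is in Python rather than R and is not the asker's code (which is truncated above): the names AGE_EDGES, AGE_LABELS and age_label are assumptions, and it assumes whole-number ages.

```python
from bisect import bisect_right

# Lower bounds of every range after the first one (assumes integer ages).
AGE_EDGES = [18, 25, 35, 45, 55, 65]
# One label per interval: len(AGE_EDGES) + 1 labels in total.
AGE_LABELS = ["<18", "18-24", "25-34", "35-44", "45-54", "55-64", "65+"]


def age_label(age):
    """Return the range label for an age (hypothetical helper)."""
    # bisect_right counts how many lower bounds are <= age.
    return AGE_LABELS[bisect_right(AGE_EDGES, age)]


for age in (17, 18, 24, 25, 64, 65):
    print(age, age_label(age))
```

Because the edges are listed explicitly, the first two ranges need no special handling compared with the evenly spaced ones.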
0
votes
1 answer

Binning Pattern - Hadoop MapReduce

I am new to Hadoop MapReduce concepts. I tried to implement the binning pattern using MapReduce but could not get the desired output. Here is my Binning Mapper code: public class BinningMapper extends Mapper { …