I'm creating a histogram algorithm, following the solution offered here.

I want to simply count the number of times each value has occurred.

However, I can't quite get the algorithm right. My code is:

var values = [2, 4, 6, 3, 3];

var val_max = 6;
var val_min = 2;

var num_bins = parseInt(val_max - val_min + 1);
console.log('num_bins is ', num_bins);

var bin_width = (val_max-val_min)/num_bins;
console.log('bin_width is ', bin_width);

var to_plot = [];

for (var i = 0; i < num_bins; i++) {
  to_plot.push(0);
}

for (var x = 0; x < values.length; x++) {

    var bin_idx = parseInt((values[x] - val_min) / bin_width); 

    to_plot[bin_idx] = to_plot[bin_idx] + 1; 
}

console.log('to_plot is ', to_plot);

If you look at the console logs, you'll see:

to_plot is  [1, 2, 1, 0, 0, NaN]

I want the last bin to hold "1" and the array to have only five entries. The problem is that for values close to the maximum value, bin_idx is out of range. How can I tweak this so that I get the following result?

to_plot is  [1, 2, 1, 0, 1] 

The jsfiddle is here.

Mark
  • Does this answer your question? [Binning an array in javascript for a histogram](https://stackoverflow.com/questions/37445495/binning-an-array-in-javascript-for-a-histogram) – Liam May 14 '20 at 07:39

4 Answers

Here's what I would do:

const data = [2, 4, 6, 3, 3];

print(histogram(data, 1)); // [1, 2, 1, 0, 1]
print(histogram(data, 2)); // [3, 1, 1]
print(histogram(data, 3)); // [4, 1]
print(histogram(data, 4)); // [4, 1]
print(histogram(data, 5)); // [5]

function histogram(data, size) {
    let min = Infinity;
    let max = -Infinity;

    for (const item of data) {
        // Check both bounds independently so that max is also set for
        // single-element or descending input.
        if (item < min) min = item;
        if (item > max) max = item;
    }

    const bins = Math.ceil((max - min + 1) / size);

    const histogram = new Array(bins).fill(0);

    for (const item of data) {
        histogram[Math.floor((item - min) / size)]++;
    }

    return histogram;
}

function print(x) {
    console.log(JSON.stringify(x));
}

This works for non-integer values too.
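
As a quick check of the non-integer case, here is a minimal sketch. It restates the histogram function from this answer (with the two bound checks made independent so that max is always set) and runs it on made-up float data; the sample values and the bin size 0.5 are assumptions chosen so the arithmetic is exact in binary floating point.

```javascript
// Restated from the answer above; `counts` replaces the inner `histogram` name.
function histogram(data, size) {
    let min = Infinity;
    let max = -Infinity;

    for (const item of data) {
        if (item < min) min = item;
        if (item > max) max = item;
    }

    // The "+ 1" matches the answer; for floats it can leave empty trailing bins.
    const bins = Math.ceil((max - min + 1) / size);
    const counts = new Array(bins).fill(0);

    for (const item of data) {
        counts[Math.floor((item - min) / size)]++;
    }

    return counts;
}

// Hypothetical sample data, not from the question.
const floats = [0.5, 1.25, 1.75, 3.0];
console.log(JSON.stringify(histogram(floats, 0.5))); // [1,1,1,0,0,1,0]
```

Note the empty last bin: because the bin count is derived from max - min + 1, float input with a small bin size can produce a few trailing bins that never receive a value.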

Aadit M Shah
  • Fantastic solution. This works very well. It looks like I'll have to use a logarithmic scale as some values are far too large, so this works very well as I'll be working with floats. Thanks! – Mark Mar 28 '16 at 20:27

I think your bin_width is wrong. Try this calculation instead:

var bin_width = (val_max - val_min) / (num_bins - 1);

That makes bin_width equal to 1, which lets the rest of your code work.
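
To see the effect of that change, the question's code can be wrapped in a small function with the corrected width. This is only a sketch: binCounts is a hypothetical helper name, and the guard for a single bin is an added assumption to avoid dividing by zero when val_min equals val_max.

```javascript
// Hypothetical helper: the question's algorithm with the corrected bin_width.
function binCounts(values, val_min, val_max) {
    var num_bins = val_max - val_min + 1;

    // Corrected width; fall back to 1 when there is only one bin (assumption).
    var bin_width = num_bins > 1 ? (val_max - val_min) / (num_bins - 1) : 1;

    var to_plot = new Array(num_bins).fill(0);

    for (var x = 0; x < values.length; x++) {
        var bin_idx = Math.floor((values[x] - val_min) / bin_width);
        to_plot[bin_idx] += 1;
    }

    return to_plot;
}

console.log('to_plot is ', binCounts([2, 4, 6, 3, 3], 2, 6)); // [1, 2, 1, 0, 1]
```

With the original width of 0.8, the value 6 maps to index (6 - 2) / 0.8 = 5, one past the last bin; with a width of 1 it maps to index 4.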

David

Since the number of bins equals the number of integers from val_min to val_max inclusive, the bin_width is 1, not the 0.8 your code currently calculates. You're basically counting integers here. Use this loop to generate the histogram:

for (var x = 0; x < values.length; x++) {
    to_plot[values[x] - val_min]++;
}
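
For completeness, a self-contained sketch of that loop, including the zero-filled array it relies on. countIntegers is a hypothetical name, and the function assumes every value is an integer; a non-integer value would index a slot that does not exist.

```javascript
// Hypothetical wrapper around the loop above; assumes integer input only.
function countIntegers(values) {
    var val_min = Math.min(...values);
    var val_max = Math.max(...values);

    // One slot per integer from val_min to val_max, inclusive.
    var to_plot = new Array(val_max - val_min + 1).fill(0);

    for (var x = 0; x < values.length; x++) {
        to_plot[values[x] - val_min]++;
    }

    return to_plot;
}

console.log('to_plot is ', countIntegers([2, 4, 6, 3, 3])); // [1, 2, 1, 0, 1]
```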
Brent Washburne

For those interested in histogram outputs rather than in the implementation, the d3-array library provides a bin() method that is useful for building histograms.

The following is a thin wrapper on top of bin(), in TypeScript, that adds some features and provides an output similar to Python's numpy.histogram.

import * as d3Array from "d3-array";

/**
 * Computes histogram of an array of numbers.
 *
 * @remarks
 * Requires library d3-array.
 *
 * @param arr - The input array of numbers.
 * @param nBins - Optional number of bins desired. Auto if not passed.
 * @param domain - Optional minimum and maximum bin edge values.
 * @param clamp - Whether to clamp values in arr to the domain.
 *    Only relevant if domain is passed. If clamp is false (default),
 *    data outside the domain will be lost. If clamp is true, the data will
 *    be kept in the first or last bin.
 * @returns An object with keys hist, containing the count of values in
 *    each bin, and binEdges, containing the edges of each bin.
 */
export const computeHistogram = (
  arr: number[],
  nBins?: number,
  domain?: [number, number],
  clamp: boolean = false
) => {
  // Clamp into a copy so that the caller's array is not modified.
  let data = arr;
  if (domain && clamp) {
    const [domainMin, domainMax] = domain;
    data = arr.map((val) => Math.min(Math.max(val, domainMin), domainMax));
  }
  let bins = d3Array.bin();
  if (domain) bins = bins.domain(domain);
  if (nBins) bins = bins.thresholds(nBins);
  const d3Hist = bins(data);
  const hist: number[] = d3Hist.map((item) => item.length);
  const binEdges = d3Hist.map((item) => item.x0);
  if (d3Hist.length) binEdges.push(d3Hist.at(-1)!.x1);

  return { hist, binEdges };
};
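
To illustrate the { hist, binEdges } output shape without installing d3-array, here is a dependency-free sketch in plain JavaScript. It is not how bin() works internally: d3 picks "nice" threshold values, so its bin count is only approximate, whereas this sketch uses exactly nBins equal-width bins and closes the last bin on the right. equalWidthHistogram is a hypothetical name, and the sketch assumes the upper bound is strictly greater than the lower bound.

```javascript
// Hypothetical, dependency-free sketch; not part of d3-array.
// Assumes hi > lo, otherwise the bin width would be zero.
function equalWidthHistogram(arr, nBins, domain) {
    const lo = domain ? domain[0] : Math.min(...arr);
    const hi = domain ? domain[1] : Math.max(...arr);
    const width = (hi - lo) / nBins;

    const hist = new Array(nBins).fill(0);
    for (const val of arr) {
        if (val < lo || val > hi) continue; // outside the domain: dropped

        // val === hi would map to index nBins, so clamp it into the last bin.
        const idx = Math.min(Math.floor((val - lo) / width), nBins - 1);
        hist[idx]++;
    }

    // nBins + 1 edges: the left edge of every bin plus the final right edge.
    const binEdges = [];
    for (let i = 0; i <= nBins; i++) binEdges.push(lo + i * width);

    return { hist, binEdges };
}

const result = equalWidthHistogram([2, 4, 6, 3, 3], 4);
console.log(JSON.stringify(result)); // {"hist":[1,2,1,1],"binEdges":[2,3,4,5,6]}
```

Clamping the index with Math.min is also a general way to handle the out-of-range index for the maximum value described in the question.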
swimmer