I'm trying to downsample an array that has ~10,000,000 points based on the current zoom state.

The downsampling is done with the `downsample` module's `LTTB` function; below is a simple example of how it's done:

import { LTTB } from 'downsample';

const ARRAY_LENGTH = 1e7;
const MAX_VAL = 40;
const randomArray = [];
for (let i = 0; i < ARRAY_LENGTH; i += 1) {
  randomArray.push({ x: i, y: Math.floor(Math.random() * MAX_VAL) });
}
export { randomArray };

const chartWidth = 10000;
export const downsampledData = LTTB(randomArray, chartWidth);
export const downsampledDataArray = downsampledData as Array<{
  x: number;
  y: number;
}>;

Now I'm trying to figure out how to implement LTTB using crossfilter, so I can use the zoom callback to render only the 10,000 points within the zoom window.
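
To make the goal concrete, here is a minimal sketch of downsampling only the visible window. It assumes the raw array is sorted by `x`, as `randomArray` is. The names `Downsampler`, `firstIndexWhere`, `downsampleWindow` and `everyNth` are hypothetical and not part of dc.js, crossfilter or `downsample`; `everyNth` is only a stand-in so the sketch runs on its own, and a wrapper around `LTTB` (with the same cast used above) is assumed to fit the `Downsampler` shape.

```typescript
type Point = { x: number; y: number };

// Assumed shape of a downsampler; everyNth below is a stand-in, not LTTB.
type Downsampler = (points: Point[], threshold: number) => Point[];

// Binary search: index of the first point for which pred is true.
// Assumes pred is false for a prefix of the array and true for the rest.
function firstIndexWhere(data: Point[], pred: (p: Point) => boolean): number {
  let lo = 0;
  let hi = data.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (pred(data[mid])) hi = mid;
    else lo = mid + 1;
  }
  return lo;
}

// Hypothetical helper: keep only points with xMin <= x <= xMax,
// then downsample that slice if it still exceeds the threshold.
function downsampleWindow(
  data: Point[],
  xMin: number,
  xMax: number,
  threshold: number,
  downsampler: Downsampler,
): Point[] {
  const start = firstIndexWhere(data, (p) => p.x >= xMin);
  const end = firstIndexWhere(data, (p) => p.x > xMax);
  const visible = data.slice(start, end);
  return visible.length <= threshold ? visible : downsampler(visible, threshold);
}

// Stand-in downsampler (plain striding) so the sketch has no dependencies.
const everyNth: Downsampler = (points, threshold) => {
  const step = Math.ceil(points.length / threshold);
  return points.filter((_, i) => i % step === 0);
};
```

The binary search keeps the slicing cost logarithmic in the array length, so the expensive part on each zoom is the downsampler running over the visible slice only.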

The simplified dc.js code is as follows:

  const xy = crossfilter(downsampledDataArray);
  const dimension = xy.dimension((d) => [d.x, d.y]);

  const group = dimension.group();

  const chart = dc
    // eslint-disable-next-line @typescript-eslint/ban-ts-comment
    // @ts-ignore
    .lineChart(divRef) // @type error
    .margins({ top: 10, right: 50, bottom: 50, left: 60 })
    .x(d3.scaleLinear().domain([0, 1e7]))
    .yAxisLabel('Photons/s')
    .xAxisLabel('Time')
    .keyAccessor((d) => {
      return d.key[0];
    })
    .valueAccessor((d) => {
      return d.key[1];
    })
    .clipPadding(10)
    .renderArea(false)
    .dimension(dimension)
    .mouseZoomable(true)
    // .excludedColor('#ddd')
    .group(group);

Here's a stackblitz with the chart rendering using the above pre-downsampled data.

  • [This example](https://dc-js.github.io/dc.js/examples/focus-dynamic-interval.html) demonstrates switching in and out groups at different levels of aggregation based on a threshold, to display an appropriate number of points. Might get you partway! – Gordon Feb 22 '22 at 15:46
  • @Gordon thanks for the link, definitely super helpful! I'm getting hung up on `function make_group (interval) { return dimension.group(interval)...` as this seems to be where the custom group (and data density) is defined. I'm wondering if you have insight into how to provide LTTB as a grouping function? Is it even possible considering this algorithm uses local data context for redacting points? – joshp Feb 23 '22 at 00:53
  • You might use a [fake group](https://github.com/dc-js/dc.js/wiki/FAQ#fake-groups) for that, basically an object with an `.all()` method that returns an array of key-value objects. You might run into performance problems trying to stuff that much data into crossfilter, so if you don't need dimensional filtering on this data, the fake group would be enough to feed the data into dc.js. – Gordon Feb 23 '22 at 12:50
  • @Gordon I updated the stackblitz, adapting what you said. However, it's quite slow to render; am I doing something wrong with the caching you described in your other answer? https://stackoverflow.com/a/53636757/12728698 – joshp Feb 24 '22 at 00:16
  • You've got 9,999 points in that plot, so yes, that is going to be slow. The idea of the example is to downsample until you are at dozens of points. – Gordon Feb 24 '22 at 14:35
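
The fake-group suggestion from the comments can be sketched roughly as follows. This is a minimal illustration, not code from the thread or from dc.js: `makeWindowedGroup`, `getDomain` and `computeWindow` are hypothetical names. `computeWindow` stands for any function that returns already-downsampled points for an x range, and the result is cached so that repeated `.all()` calls for the same domain do not recompute it.

```typescript
type Point = { x: number; y: number };
type KeyValue = { key: number; value: number };

// Hypothetical factory for a dc.js-style "fake group": an object whose
// all() method returns an array of { key, value } objects.
function makeWindowedGroup(
  getDomain: () => [number, number],
  computeWindow: (xMin: number, xMax: number) => Point[],
) {
  let cachedKey = ''; // empty string never matches a real "min:max" key
  let cached: KeyValue[] = [];
  return {
    all(): KeyValue[] {
      const [xMin, xMax] = getDomain();
      const key = `${xMin}:${xMax}`;
      // Recompute only when the visible domain has changed.
      if (key !== cachedKey) {
        cached = computeWindow(xMin, xMax).map((p) => ({ key: p.x, value: p.y }));
        cachedKey = key;
      }
      return cached;
    },
  };
}
```

In a chart, `getDomain` would presumably read the current x scale (for example `() => chart.x().domain()`, an assumption about how the chart is wired), and the accessors would return `d.key` and `d.value` rather than `d.key[0]` and `d.key[1]`. How many points `computeWindow` returns is up to the caller; per the last comment, fewer points render faster.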

0 Answers