I've got some crossfilter data with dates (d) and values (v):
[
  {d: "2013-07-26T00:00:00.000Z", v: 2.5},
  {d: "2013-07-25T00:00:00.000Z", v: 2.64},
  // ...and many more
]
I've created a group for the months in Crossfilter (crossfilter2@1.4.5):
months = cf.dimension((d) => {
  const dateObj = new Date(d.d);
  // use 1-12 instead of 0-11
  return dateObj.getMonth() + 1;
});
monthsGroup = months.group();
So monthsGroup.all() returns an array of 12 objects, aggregated by month. I want those objects to include the min, max, and median, as well as the 25th and 75th percentile. Reductio (reductio@0.6.3) helps with the min, max, and median out of the box, so I've added a custom aggregator to add the 75th and 25th percentiles.
The following code works, but it's very slow:
const monthReducer = reductio()
  .valueList(d => d.v)
  .min(true)
  .max(true)
  .median(true)
  .count(true)
  .custom({
    add(p) {
      const valueList = p.valueList;
      p.p75 = getPercentile(valueList, 75);
      p.p25 = getPercentile(valueList, 25);
      return p;
    },
    remove(p) {
      const valueList = p.valueList;
      p.p75 = getPercentile(valueList, 75);
      p.p25 = getPercentile(valueList, 25);
      return p;
    },
    initial(p) {
      p.p75 = undefined;
      p.p25 = undefined;
      return p;
    },
  });
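getPercentile is not shown in the snippet above. As a minimal sketch, assuming it takes an array of numbers and a percentile between 0 and 100 and interpolates linearly between neighbouring values, it might look like the following. The interpolation rule is an assumption; other percentile definitions exist and give slightly different results.

```javascript
// Hypothetical helper (not part of reductio or crossfilter):
// returns the p-th percentile (0-100) of a numeric array, using
// linear interpolation between the two nearest ranks.
function getPercentile(values, p) {
  if (!values || values.length === 0) return undefined;
  // Sort a copy so the caller's array is left untouched.
  const sorted = [...values].sort((a, b) => a - b);
  // Fractional position of the requested percentile in the sorted array.
  const rank = (p / 100) * (sorted.length - 1);
  const lower = Math.floor(rank);
  const upper = Math.ceil(rank);
  const weight = rank - lower;
  return sorted[lower] + (sorted[upper] - sorted[lower]) * weight;
}
```

Each call sorts a copy of the list, so calling it twice from every add and remove callback repeats that work once per record.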
If I remove the .custom block, it's much faster. The custom add and remove callbacks run for every record in the data, which is unnecessary because the percentiles only need to be computed from the final valueList. Reductio has a barely-documented .post() hook that I think would do the trick here, but I can't get it working.
UPDATE: I got the post-processing hook callback to run, but it doesn't work the way I expected.
I tried registering a new post processor with an undocumented method I saw in the source:
// register post-processing function to add percentiles
reductio.registerPostProcessor('addPercentiles', (prior) => {
  const all = prior();
  return () => {
    const updated = all.map((e) => {
      const valueList = e.value.valueList;
      e.value.p75 = getPercentile(valueList, 75);
      e.value.p25 = getPercentile(valueList, 25);
      return e;
    });
    return updated;
  };
});
and adding it to the post() hook:
// run post-processing to add the 25th & 75th %iles
this.monthsGroup.post().addPercentiles()();
This appears to do what I want, but only once. It doesn't re-run the post hooks when a filter is applied to another dimension.
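The "only once" behaviour would be consistent with the percentiles being computed at the moment the chain is invoked rather than each time the group is read. To illustrate the read-time alternative without relying on reductio's post() API at all, here is a rough sketch. withPercentiles and nearestRank are hypothetical names, fakeGroup is a stand-in for monthsGroup, and nearestRank assumes the list is already sorted.

```javascript
// Hypothetical percentile function: nearest-rank method on a sorted list.
const nearestRank = (sorted, p) =>
  sorted[Math.ceil((p / 100) * sorted.length) - 1];

// Hypothetical wrapper: returns a function that reads group.all() and
// attaches p25/p75 to a shallow copy of each value every time it is called.
function withPercentiles(group, percentileFn) {
  return () =>
    group.all().map(({ key, value }) => ({
      key,
      value: {
        ...value,
        p25: percentileFn(value.valueList, 25),
        p75: percentileFn(value.valueList, 75),
      },
    }));
}

// Stand-in for a crossfilter group; only all() is used.
const data = [{ key: 7, value: { valueList: [1, 2, 3, 4] } }];
const fakeGroup = { all: () => data };

const readMonths = withPercentiles(fakeGroup, nearestRank);
const before = readMonths();

// Simulate a filter on another dimension changing the group's contents.
data[0].value.valueList = [10, 20];
const after = readMonths();
```

Because the percentiles are derived inside the returned function, calling readMonths() again after a filter change picks up the new valueList, at the cost of recomputing for every key on each read.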
If median is just the 50th percentile, it should be trivial to also get the 25th and 75th. I feel like I'm close, but I'm obviously doing something wrong. How can I add these aggregations to the reductio reducer?