I have a situation similar to the one in this question. Using the same dataset, how can I get this functionality with crossfilter? I am new to dc.js and crossfilter, and I am trying to implement the bar and area plot as in this example. However, that example uses only one date column. I can get it working with the startdate alone, but my requirement is to filter the dataset based on both startdate and enddate. I could not find many resources that discuss this issue.

Any assistance and suggestions will be highly appreciated.

user3050590
  • You'll need to create a dimension including both start and end date for each record and then implement a custom filter function. Please put together an example using just startdate or enddate on jsFiddle or a similar editable platform, and I or someone else here will be able to show you how to do this. – Ethan Jewett Feb 05 '16 at 16:11
  • @EthanJewett, is it a `groupAll`? :-) – Gordon Feb 05 '16 at 16:26
  • @Gordon LOL :-) It depends. Usually not. Usually just a filterFunction that checks the start/end dates and determines if the record is "active" during the selected period. But for this particular question, groupAll is also required because we want to then bucket those records into months and a record could fall into multiple months. – Ethan Jewett Feb 05 '16 at 16:30

2 Answers

You might expect this to be simple, but actually keeping track of intervals is one of the classic tricky problems of computer science and it requires a specialized data structure called an interval tree to do it properly.

It's a pretty common request, so out of curiosity, I looked for a JavaScript library for interval trees and found one by Mikola Lysenko.

I've incorporated it into a new example here. (source)

The important parts of the example are, first, to use groupAll to populate the interval tree:

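      // Keep one interval tree holding every record that passes the current filters.
      projectsPerMonthTree = ndx.groupAll().reduce(
          function(v, d) {
              // add: a record entered the filtered set
              v.insert(d.interval);
              return v;
          },
          function(v, d) {
              // remove: a record left the filtered set
              v.remove(d.interval);
              return v;
          },
          function() {
              // initial value: an empty interval tree
              return lysenkoIntervalTree(null);
          }
      );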

Next we populate a fake group using the start and end dates, counting all the intervals which intersect with each month:

  function intervalTreeGroup(tree, firstDate, lastDate) {
      return {
          all: function() {
              var begin = d3.time.month(firstDate), end = d3.time.month(lastDate);
              var i = new Date(begin);
              var ret = [], count, next; // declare next so it does not leak as a global
              do {
                  next = new Date(i);
                  next.setMonth(next.getMonth()+1);
                  count = 0;
                  tree.queryInterval(i.getTime(), next.getTime(), function() {
                      ++count;
                  });
                  ret.push({key: i, value: count});
                  i = next;
              }
              while(i.getTime() <= end.getTime());
              return ret;
          }
      };
  }

  projectsPerMonthGroup = intervalTreeGroup(projectsPerMonthTree.value(), firstDate, lastDate);

(This could probably be simpler and cheaper if we used lower-level access to the interval tree, or if it had a richer API that allowed walking the intervals in order. But this should be fast enough.)
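To make clear what the fake group computes, here is a minimal brute-force sketch in plain JavaScript, without the interval tree or d3. The helper name countIntervalsPerMonth and the sample data are hypothetical, intervals are assumed to be [start, end] pairs in epoch milliseconds, and each month is treated as half-open, so boundary cases may differ from what the tree query in the example returns.

```javascript
// Hypothetical helper: for each calendar month between firstDate and lastDate,
// count how many [start, end] intervals (epoch milliseconds) touch that month.
function countIntervalsPerMonth(intervals, firstDate, lastDate) {
    var i = new Date(firstDate.getFullYear(), firstDate.getMonth(), 1);
    var end = new Date(lastDate.getFullYear(), lastDate.getMonth(), 1);
    var ret = [];
    while (i.getTime() <= end.getTime()) {
        var next = new Date(i.getFullYear(), i.getMonth() + 1, 1);
        var lo = i.getTime(), hi = next.getTime();
        // Assumption: the month is [lo, hi), so an interval that starts
        // exactly at hi belongs to the following month only.
        var count = intervals.filter(function(iv) {
            return iv[0] < hi && iv[1] >= lo;
        }).length;
        ret.push({key: i, value: count});
        i = next;
    }
    return ret;
}

// Made-up sample projects, for illustration only.
var sample = [
    [new Date(2016, 0, 15).getTime(), new Date(2016, 2, 10).getTime()], // Jan 15 to Mar 10
    [new Date(2016, 1, 1).getTime(), new Date(2016, 1, 20).getTime()],  // Feb 1 to Feb 20
    [new Date(2016, 3, 5).getTime(), new Date(2016, 3, 6).getTime()]    // Apr 5 to Apr 6
];
var perMonth = countIntervalsPerMonth(sample, new Date(2016, 0, 1), new Date(2016, 3, 30));
// perMonth values for January to April: 1, 2, 1, 1
```

This scans every interval for every month, which is the cost the interval tree avoids on larger datasets.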

Finally, we set a filterHandler that applies a filterFunction, so that we choose the intervals which intersect with a given date range:

  monthChart.filterHandler(function(dim, filters) {
      if(filters && filters.length) {
          if(filters.length !== 1)
              throw new Error('not expecting more than one range filter');
          var range = filters[0];
          dim.filterFunction(function(i) {
              return !(i[1] < range[0].getTime() || i[0] > range[1].getTime());
          })
      }
      else dim.filterAll();
      return filters;
  });
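
The overlap test inside the filterFunction can be checked in isolation. This is a minimal sketch, assuming closed intervals given as [start, end] pairs of numbers; intervalsOverlap and the sample values are hypothetical and not part of the linked example.

```javascript
// Hypothetical helper mirroring the test used in the filterFunction:
// two closed intervals [a0, a1] and [b0, b1] overlap unless one of them
// ends before the other begins.
function intervalsOverlap(a, b) {
    return !(a[1] < b[0] || a[0] > b[1]);
}

var brush = [10, 20]; // stands in for the selected date range
var results = [
    intervalsOverlap([0, 5], brush),   // entirely before the range: false
    intervalsOverlap([5, 12], brush),  // straddles the start: true
    intervalsOverlap([12, 18], brush), // entirely inside: true
    intervalsOverlap([20, 30], brush), // touches the end point: true
    intervalsOverlap([25, 30], brush)  // entirely after the range: false
];
```

Because the comparison is written for closed intervals, an interval that only touches an end point of the range still counts as intersecting.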

I've set it up so that it filters the month chart to show all projects which intersect with its own date range. If this is not desired, the groupAll can be put on the intervalDimension instead.

Gordon
  • Thank you for your solution, and sorry for taking so long to reply. However, I have a concern about the condition ret.length > 100, which throws an error in intervalTreeGroup(). Does it have any significance? My dataset has around 1500 records, so it only works when I comment that condition out. Thanks again. – user3050590 Mar 07 '16 at 09:52
  • Ah, sorry, I had put that in for debugging, will remove it. Thanks for reporting it. – Gordon Mar 07 '16 at 15:24

The solution is quite simple really. Create two dimensions:

  1. by start time - startTimeDim
  2. by end time - endTimeDim

Now, to select the intervals that intersect a given range, from rangeStart to rangeEnd, apply the following:

  1. startTimeDim.filter([-Infinity, rangeEnd])
  2. endTimeDim.filter([rangeStart, Infinity])

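This keeps only the intervals that start before the range ends and end on or after the range start. Note that crossfilter range filters are half-open (lower bound inclusive, upper bound exclusive), so an interval that starts exactly at rangeEnd is excluded.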
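
The effect of combining the two filters can be sketched in plain JavaScript without crossfilter. Here inRange is a hypothetical stand-in for a range filter on a dimension, assuming the half-open behaviour crossfilter documents for range filters, and the records are made up for illustration.

```javascript
// Hypothetical stand-in for dimension.filter([lo, hi]): keeps values with
// lo <= value < hi, which is how crossfilter documents range filters.
function inRange(value, range) {
    return range[0] <= value && value < range[1];
}

// Made-up records; start and end are plain numbers for brevity.
var records = [
    {name: 'a', start: 0, end: 5},
    {name: 'b', start: 5, end: 12},
    {name: 'c', start: 12, end: 18},
    {name: 'd', start: 20, end: 30},
    {name: 'e', start: 25, end: 30}
];
var rangeStart = 10, rangeEnd = 20;

// A record survives only if it passes both filters, as it would when two
// dimensions on the same crossfilter are filtered at once.
var selected = records.filter(function(d) {
    return inRange(d.start, [-Infinity, rangeEnd]) &&
           inRange(d.end, [rangeStart, Infinity]);
}).map(function(d) { return d.name; });
// selected is ['b', 'c']; 'd' starts exactly at rangeEnd and is excluded
```

If intervals that touch rangeEnd should be included, a custom filterFunction on the start dimension would be one way to make the upper bound inclusive.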

chemicalX