2

In the icCube Builder ETL, I want to group the data on more than one field. Also, as aggregation function, I would like to make use of MAX and MIN.

Example data:

sample data to group by on groupId and phase

(same data in text)

groupId phase startDate endDate 100 start 1-May-2018 5-May-2018 100 start 4-May-2018 7-May-2018 100 start 28-Apr-2018 1-May-2018 100 middle 4-May-2018 11-May-2018 100 middle 1-May-2018 10-May-2018 100 end 12-May-2018 15-May-2018 100 end 11-May-2018 13-May-2018 100 end 13-May-2018 14-May-2018 100 end 9-May-2018 12-May-2018 200 start 4-Apr-2018 2-May-2018 200 middle 18-Apr-2018 3-May-2018 200 middle 1-May-2018 1-May-2018 300 end 21-Apr-2018 24-Apr-2018

I would like to group this data on groupId and phase and get the minimum startDate and the maximum endDate:

enter image description here

How to best do that in the icCube ETL?

ic3
  • 7,917
  • 14
  • 67
  • 115
Arthur
  • 1,692
  • 10
  • 14

1 Answers1

1

We're adding a new version of groupBy View in the ETL layer to support this. However you can create a Java view to perform the groupBy.

Something like :

package iccube.pub;
import java.util.*;
import java.lang.*;
import org.joda.time.*;
import crazydev.iccube.pub.view.*;

public class CustomJavaView implements IOlapBuilderViewLogic
{
private Map<List<Comparable>,List<Agg>> cached;

public CustomJavaView()
{
}

public void onInitMainTable(Map<String, IOlapCachedTable> cachedTables, IOlapDataTableDef mainTable)
{
    cached = new HashMap();
}

public boolean onNewRow(IOlapViewContext context, Map<String, IOlapCachedTable> cachedTables, IOlapDataTableDef mainTable, IOlapReadOnlyDataRow mainTableRow)
{
    // create the groupby key (list of values) 
    final List<Comparable> groupBy = Arrays.asList(mainTableRow.get("phase"), mainTableRow.get("groupId"));

    // get the aggregators for values for the keys, build them if not already there
    final List<Agg> aggs = cached.computeIfAbsent(groupBy, key -> Arrays.asList(new Agg(true), new Agg(false)));
    // add values
    aggs.get(0).add(mainTableRow.getAsDateTime("startDate"));
    aggs.get(1).add(mainTableRow.getAsDateTime("endDate"));

    return true; // false to stop
}

public void onProcessingCompleted(IOlapViewContext context, Map<String, IOlapCachedTable> cachedTables)
{
    // now we can fire rows
    for (Map.Entry<List<Comparable>, List<Agg>> entry : cached.entrySet())
    {
        final List<Comparable> groupByKey = entry.getKey();
        final List<Agg> aggs = entry.getValue();

        // create empty row
        final IOlapDataTableRow row = context.newRow();
        row.set("phase",groupByKey.get(0));
        row.set("groupId",groupByKey.get(1));
        row.set("startDate",aggs.get(0).date);
        row.set("endDate",aggs.get(1).date);

        context.fireRow(row);
    }
}

// this is the Aggregator, you could implement something more complicated
static class Agg
{
    final int isMin;

    LocalDateTime date;

    Agg(boolean isMin)
    {
        this.isMin = isMin ? -1 : 1;
    }

    void add(LocalDateTime ndate)
    {
        if (ndate != null)
        {
            date = ( date!= null && ((date.compareTo(ndate) * isMin) > 0)) ? date : ndate;
        }
    }

}
}
ic3
  • 7,917
  • 14
  • 67
  • 115
  • Will check it out. Thanks – Arthur May 16 '18 at 08:30
  • It runs (compiles), but it does not produce the correct results. For example: groupId->start gives incorrect results for both MIN and MAX. I will try to find the solution, but I am no Java expert. – Arthur May 16 '18 at 08:53
  • Arthur, that should be really an easy fix :-) – ic3 May 16 '18 at 09:12
  • To spot colors one must first.see, and I am pretty blind with regard to Java ... ;-) I try to find the logic in the code, and I understand (I think) what is happening, but I can not spot where it goes wrong. My bet it is in the Agg.add routine, last line. But I can not debug or change it in the dataview. I guess I need to step into Java (the land of the seeing) first, which is a big jump for me ... if you can spell it to me where it needs to be changed, you would be my hero. – Arthur May 16 '18 at 09:31
  • you've a new one ;-) – ic3 May 16 '18 at 09:35