I want to implement a simple date filter, and it is turning out to be harder than I expected.

    DateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss");
    Date date = new Date();
    String datestring = dateFormat.format(date);

    ExpressionFilter dateFilter = new ExpressionFilter("datefield1 <= datestring", String.class);
    inputPipe = new Each(inputPipe,dateFilter);

datefield1 is a field in inputPipe that I want to filter against the current date. The problem with the code above is that ExpressionFilter expects every name used in the expression to be a field in inputPipe. datestring is a local Java variable, not a field in inputPipe, so the filter fails there.

I also tried it this way, but it throws a compile error. I'm new to Cascading and Java, so please excuse me if I have missed anything.

    ExpressionFilter dateFilter = new ExpressionFilter("datefield1 <= "+datestring, String.class);
Vinay
  • Try using index of the field instead of name in inputPipe. Although I do not understand how string comparison would be similar to date comparison. – Amit Sep 01 '16 at 15:13
  • Look at the following solution: http://stackoverflow.com/a/36351176/2421561 – Ambrish Sep 04 '16 at 17:24

3 Answers

You can use a custom Cascading Filter instead of ExpressionFilter.

Create a filter like the following:

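import cascading.flow.FlowProcess;
import cascading.operation.BaseOperation;
import cascading.operation.Filter;
import cascading.operation.FilterCall;
import cascading.tuple.TupleEntry;

public class DateFilter extends BaseOperation implements Filter {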
    private String dateStr;
    public DateFilter(String dateStr) {
        this.dateStr = dateStr;
    }

    public boolean isRemove( FlowProcess flowProcess, FilterCall filterCall ) {
        // get the arguments TupleEntry
        TupleEntry arguments = filterCall.getArguments();

        // initialize the return result
        boolean isRemove = false;

        String inputStr = arguments.getString("datefield1"); // Get the date from the datefield1 field

        isRemove = compareDate(inputStr, dateStr);

        return isRemove;
    }

    private boolean compareDate(String inputStr, String dateStr) {
        // Add your logic to compare the dates here, e.g. with Joda-Time (http://www.joda.org/joda-time/)
        return false;
    }
}

Once you have the filter, use it in your code like:

DateFormat dateFormat = new SimpleDateFormat("yyyy/MM/dd HH:mm:ss");
Date date = new Date();
String datestring = dateFormat.format(date);

inputPipe = new Each(inputPipe, new DateFilter(datestring));

This should help you.
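
The body of compareDate is left open above. A minimal sketch of one way to fill it in, using the JDK's java.time classes instead of Joda-Time, might look like the following. The class name DateCompareSketch is hypothetical, and two policies are assumptions, not something the question specifies: a row is removed when its date is later than the reference date, and a value that cannot be parsed is also removed.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper class; only the compareDate signature comes from the filter above.
class DateCompareSketch {

    // Same pattern the question uses to build datestring.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy/MM/dd HH:mm:ss");

    // Returns true when the row should be removed.
    // Assumption: rows dated after the reference date are removed.
    static boolean compareDate(String inputStr, String dateStr) {
        LocalDateTime reference = LocalDateTime.parse(dateStr, FORMAT);
        if (inputStr == null) {
            return true; // Assumption: a missing date counts as invalid.
        }
        try {
            LocalDateTime input = LocalDateTime.parse(inputStr.trim(), FORMAT);
            return input.isAfter(reference);
        } catch (DateTimeParseException e) {
            return true; // Assumption: an unparseable date counts as invalid.
        }
    }

    public static void main(String[] args) {
        String now = "2016/09/01 15:13:00";
        System.out.println(compareDate("2016/08/31 10:00:00", now)); // false (kept)
        System.out.println(compareDate("2016/09/02 00:00:00", now)); // true (removed)
        System.out.println(compareDate("not a date", now));          // true (removed)
    }
}
```

Comparing parsed values avoids depending on how the strings happen to sort. If the opposite rule is wanted, swap isAfter for isBefore.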

Ambrish

Here is a simple example that may help. The input file contains two fields: "name" and a date of birth (dob), which the code reads as "dateField1". The program filters out every record whose dob is an invalid future date. The input contains the following data:

ABC, 2010-01-01
DEF, 2012-04-05
GHI, 2016-12-13
JKL, 2017-04-05
MNO, 2015-12-03
PQR, 2016-05-03

Here is the Filter

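import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

import cascading.flow.FlowProcess;
import cascading.operation.BaseOperation;
import cascading.operation.Filter;
import cascading.operation.FilterCall;
import cascading.tuple.TupleEntry;

class DateFilter extends BaseOperation implements Filter {

    private final SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd");

    @Override
    public boolean isRemove(FlowProcess flowProcess, FilterCall filterCall) {
        TupleEntry tupleEntry = filterCall.getArguments();
        String date = tupleEntry.getString("dateField1");
        Date dateField1;
        try {
            dateField1 = f.parse(date);
        } catch (ParseException e) {
            // Assumption: a value that cannot be parsed is treated as invalid and removed
            return true;
        }
        // Returning true removes the tuple: keep dates before now, drop the rest
        return !dateField1.before(new Date());
    }
}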

You can use it as

Pipe pipe = new Pipe("Pipe");
pipe = new Each(pipe, new DateFilter());

And the output is

name,dateField1
ABC, 2010-01-01
DEF, 2012-04-05
MNO, 2015-12-03
PQR, 2016-05-03
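
The same rule can be tried without a Cascading flow. The following is a rough sketch in plain Java that applies an isRemove-style predicate (true means the row is dropped) to the sample rows above. The class and method names are hypothetical, and "today" is pinned to 2016-09-01 as an assumption so that the result is repeatable.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-alone class; it only imitates the Filter contract.
class IsRemoveSketch {

    // Mirrors the contract of isRemove: returning true drops the row.
    static boolean isRemove(String dob, LocalDate today) {
        try {
            // LocalDate.parse expects ISO yyyy-MM-dd, which matches the sample data.
            return !LocalDate.parse(dob.trim()).isBefore(today);
        } catch (DateTimeParseException e) {
            return true; // Assumption: unparseable values are removed.
        }
    }

    // Returns the names of the rows that survive the filter, in input order.
    static List<String> keptNames(Map<String, String> rows, LocalDate today) {
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, String> row : rows.entrySet()) {
            if (!isRemove(row.getValue(), today)) {
                kept.add(row.getKey());
            }
        }
        return kept;
    }

    // The sample input from this answer, as name -> dob.
    static Map<String, String> sampleRows() {
        Map<String, String> rows = new LinkedHashMap<>();
        rows.put("ABC", "2010-01-01");
        rows.put("DEF", "2012-04-05");
        rows.put("GHI", "2016-12-13");
        rows.put("JKL", "2017-04-05");
        rows.put("MNO", "2015-12-03");
        rows.put("PQR", "2016-05-03");
        return rows;
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2016, 9, 1); // Fixed date, assumed for the example.
        System.out.println(keptNames(sampleRows(), today)); // prints [ABC, DEF, MNO, PQR]
    }
}
```

Passing the reference date in as a parameter, instead of reading the clock inside the predicate, keeps the check easy to test.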
koiralo

All you need to do is return true from the isRemove function for the rows you want to remove. It's up to you how you extract the values. There is a pretty nice explanation in this link.

pramesh