I'm trying to integrate Calcite with Kafka, using CsvStreamableTable as a reference.

Each ConsumerRecord is converted to an Object[] using the following code:

static class ArrayRowConverter extends RowConverter<Object[]> {
    private List<Schema.Field> fields;

    public ArrayRowConverter(List<Schema.Field> fields) {
        this.fields = fields;
    }

    @Override
    Object[] convertRow(ConsumerRecord<String, GenericRecord> consumerRecord) {
        Object[] objects = new Object[fields.size()+1];
        int i = 0 ;
        objects[i++] = consumerRecord.timestamp();
        for(Schema.Field field : this.fields) {
            Object obj = consumerRecord.value().get(field.name());
            if( obj instanceof Utf8 ){
                objects[i ++] = obj.toString();
            }else {
                objects[i ++] = obj;
            }
        }
        return objects;
    }
}

The Enumerator is implemented as follows. One thread constantly polls records from Kafka and puts them into a queue; the getRecord() method polls from that queue:

public E current() {
    return current;
}

public boolean moveNext() {
    for (;;) {
        if (cancelFlag.get()) {
            return false;
        }
        ConsumerRecord<String, GenericRecord> record = getRecord();
        if (record == null) {
            try {
                // Queue is empty: wait briefly before polling again.
                Thread.sleep(200L);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop enumerating.
                Thread.currentThread().interrupt();
                return false;
            }
            continue;
        }
        current = rowConvert.convertRow(record);
        return true;
    }
}
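
As an aside, the sleep-and-retry loop can be avoided if the hand-off queue is a BlockingQueue. The following is only a minimal sketch under that assumption: `QueueEnumerator`, `offer` and `cancel` are hypothetical names, and the generic type `R` stands in for the converted row. It is not the Calcite Enumerator interface itself.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for the enumerator above; R is the row type. */
class QueueEnumerator<R> {
    private final BlockingQueue<R> queue = new LinkedBlockingQueue<>();
    private final AtomicBoolean cancelFlag = new AtomicBoolean(false);
    private R current;

    /** Called by the polling thread for every record fetched from Kafka. */
    void offer(R record) {
        queue.add(record);
    }

    /** Asks moveNext() to stop at its next check of the flag. */
    void cancel() {
        cancelFlag.set(true);
    }

    R current() {
        return current;
    }

    boolean moveNext() {
        while (!cancelFlag.get()) {
            try {
                // Wait up to 200 ms for a record instead of sleeping blindly.
                R record = queue.poll(200L, TimeUnit.MILLISECONDS);
                if (record != null) {
                    current = record;
                    return true;
                }
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop enumerating.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }
}
```

With this shape, moveNext() returns as soon as a record arrives, and the cancel flag is still checked at least every 200 ms.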

I tested SELECT STREAM * FROM Kafka.clicks and it works fine. rowtime is the first column, added explicitly, and its value is the Kafka record timestamp.

But when I tried

SELECT STREAM FLOOR(rowtime TO HOUR) 
AS rowtime,ip,COUNT(*) AS c FROM KAFKA.clicks  GROUP BY FLOOR(rowtime TO HOUR), ip

it threw an exception:

java.sql.SQLException: Error while executing SQL "SELECT STREAM FLOOR(rowtime TO HOUR) AS rowtime,ip,COUNT(*) AS c FROM KAFKA.clicks  GROUP BY FLOOR(rowtime TO HOUR), ip": From line 1, column 85 to line 1, column 119: Streaming aggregation requires at least one monotonic expression in GROUP BY clause
    at org.apache.calcite.avatica.Helper.createException(Helper.java:56)
    at org.apache.calcite.avatica.Helper.createException(Helper.java:41)
user2283216

1 Answer

You need to declare that the "ROWTIME" column is monotonic. In MockCatalogReader, note how "ROWTIME" is declared monotonic in the "ORDERS" and "SHIPMENTS" streams. That's why some queries in SqlValidatorTest.testStreamGroupBy() are valid and others are not. The key method relied upon by the validator is SqlValidatorTable.getMonotonicity(String columnName).
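
To illustrate why the validator insists on a monotonic expression, here is a rough sketch in plain Java (not Calcite code; `HourlyCounter` and its method are hypothetical names). It mimics GROUP BY FLOOR(rowtime TO HOUR), ip: a group can only be emitted once no more rows can arrive for it, and that is only knowable if rowtime never goes backwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java illustration of a streaming hourly count per ip. */
class HourlyCounter {
    private static final long HOUR_MS = 3_600_000L;
    private long currentHour = Long.MIN_VALUE;
    private final Map<String, Long> counts = new TreeMap<>();

    /** Feeds one row; returns "hour,ip,count" lines for the hour that just closed. */
    List<String> add(long rowtime, String ip) {
        long hour = Math.floorDiv(rowtime, HOUR_MS) * HOUR_MS;
        if (hour < currentHour) {
            // Without monotonicity an hour could never safely be closed.
            throw new IllegalStateException("rowtime went backwards");
        }
        List<String> emitted = new ArrayList<>();
        if (hour > currentHour) {
            // rowtime is monotonic, so the previous hour is complete: emit it.
            for (Map.Entry<String, Long> e : counts.entrySet()) {
                emitted.add(currentHour + "," + e.getKey() + "," + e.getValue());
            }
            counts.clear();
            currentHour = hour;
        }
        counts.merge(ip, 1L, Long::sum);
        return emitted;
    }
}
```

If rowtime were not declared monotonic, there would be no point at which the totals for an hour are known to be final, which is the situation the error message describes.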

Julian Hyde
  • Thanks Julian, is there a simple way to declare a column monotonic, or should I just implement it as MockTable did? – user2283216 Feb 24 '17 at 03:45
  • @user2283216, according to this [code snippet](https://github.com/apache/calcite/blob/53e09688c71b85817a9c382edd573dbcc7e48aa5/core/src/main/java/org/apache/calcite/prepare/RelOptTableImpl.java#L362-L365) it should be enough to define `rowtime` as the first column (i.e. index 0) to ensure its monotonicity. – tzolov Oct 12 '21 at 20:40