
The aggregate function and data source:

from datetime import datetime, timedelta

from pyflink.common import Row, Types
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.table import AggregateFunction, Schema, StreamTableEnvironment


class Sum0(AggregateFunction):

    def get_value(self, accumulator):
        return accumulator[0]

    def create_accumulator(self):
        # single-field Row holding the running sum
        return Row(0)

    def accumulate(self, accumulator, *args):
        # add non-null input values to the sum
        if args[0] is not None:
            accumulator[0] += args[0]

    def retract(self, accumulator, *args):
        # remove retracted values from the sum
        if args[0] is not None:
            accumulator[0] -= args[0]

    def merge(self, accumulator, accumulators):
        # combine partial accumulators
        for acc in accumulators:
            accumulator[0] += acc[0]

    def get_result_type(self):
        return "BIGINT"

    def get_accumulator_type(self):
        return "ROW<f0 BIGINT>"


# environment setup (implied by the original snippet)
env = StreamExecutionEnvironment.get_execution_environment()
table_env = StreamTableEnvironment.create(env)

ds = env.from_collection(
    collection=[(1, 2, "Lee", datetime.now() - timedelta(hours=4)),
                (2, 3, "Lee", datetime.now() - timedelta(hours=4)),
                (3, 4, "Jay", datetime.now() - timedelta(hours=4)),
                (5, 6, "Jay", datetime.now() - timedelta(hours=2)),
                (7, 8, "Lee", datetime.now())],
    type_info=Types.ROW([Types.INT(),
                         Types.INT(),
                         Types.STRING(),
                         Types.SQL_TIMESTAMP()]))

table_schema = Schema.new_builder() \
    .column("f0", "INT") \
    .column("f1", "INT") \
    .column("f2", "STRING") \
    .column_by_expression("rowtime", "CAST(f3 AS TIMESTAMP(3))") \
    .watermark("rowtime", "rowtime - INTERVAL '1' SECOND") \
    .build()

ts = table_env.from_data_stream(ds, table_schema) \
    .alias("value", "count", "name", "rowtime")
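The registration of the UDAF and of the source view is not shown in the snippet; presumably it looks something like this (a minimal sketch, assuming the names sum_udf_agg and source from the SQL below):

from pyflink.table.udf import udaf

# register the Python UDAF under the name used in the SQL below (assumed)
table_env.create_temporary_function("sum_udf_agg", udaf(Sum0()))

# register the converted table under the name referenced as `TABLE source` (assumed)
table_env.create_temporary_view("source", ts)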

This is the main SQL:

insert into kafka_sink
select name, sum_udf_agg(value) as agg_data
from TABLE(TUMBLE(TABLE source, DESCRIPTOR(rowtime), INTERVAL '1' HOURS))
group by window_start, window_end, name
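Presumably the statement is submitted from Python along these lines (a sketch; main_sql is a hypothetical variable holding the statement above):

# submit the INSERT statement and wait for it to finish
table_env.execute_sql(main_sql).wait()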

Error detail:

pyflink.util.exceptions.TableException: org.apache.flink.table.api.TableException: Table sink 'default_catalog.default_database.kafka_test' doesn't support consuming update changes which is produced by node PythonGroupAggregate(groupBy=[window_start, window_end, name], select=[window_start, window_end, name, sum_udf(value) AS agg_data])

If I use Flink's built-in sum instead, it works fine. I don't understand why.
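For reference, a minimal sketch of the working variant described above, with the built-in sum swapped in for the UDAF (value is backquoted here because it is a reserved word; whether the original query quoted it is not shown):

# identical statement, but with Flink's built-in sum instead of the Python UDAF
table_env.execute_sql("""
    insert into kafka_sink
    select name, sum(`value`) as agg_data
    from TABLE(TUMBLE(TABLE source, DESCRIPTOR(rowtime), INTERVAL '1' HOURS))
    group by window_start, window_end, name
""").wait()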

faronzz
  • Please share more about the error message you see, the structure of the data, etc. This will also help make the title clearer to attract more answers. – opeonikute Aug 23 '23 at 10:11

0 Answers