I am using Flink CEP with SQL to process events. The schema is derived from the JSON event, and we have defined DTOs for it (including nested DTOs). When I execute the SQL query below, it throws the exception shown in the stack trace. A basic query such as SELECT message.event.size FROM Log works fine.

Am I doing something wrong, or does Flink not allow referencing nested objects inside MATCH_RECOGNIZE?

Event Example

{
  "message": {
    "@timestamp": "2023-08-02T12:12:16Z",
    "customFields": "some string",
    "cloud": {
      "region": "us-east-1"
    },
    "ecs": {
      "version": "8.0.0"
    },
    "event": {
      "action": "GetBucketAcl",
      "id": "bb7ee0ed-f273-4da8-a75d-53e10e7d1515",
      "kind": "event",
      "outcome": "success",
      "provider": "s3.amazonaws.com",
      "type": "info",
      "size": 10
    },
    "source": {
      "address": "cloudtrail.amazonaws.com"
    }
  },
  "id": "pxX_35all1pkteGtxXxOi",
  "timestamp": "2023-08-02 12:17:31.526356",
  "integration": "cloudtrail",
  "configuration": "conf-1"
}

SQL Query

SELECT * FROM Log
MATCH_RECOGNIZE (
    PARTITION BY integration
    ORDER BY time_ltz
    MEASURES
        FIRST(A.time_ltz) AS start_tstamp,
        LAST(A.time_ltz) AS end_tstamp,
        SUM(message.event.size) AS size,
        A.id AS aID,
        LAST(B.id) AS bID,
        C.id AS cID
    AFTER MATCH SKIP PAST LAST ROW
    PATTERN (A B+ C) WITHIN INTERVAL '5' SECOND
    DEFINE
        A AS A.message.event.action = 'login',
        B AS SUM(message.event.size) > 30
) T;

Generated Schema

(
  `message` *org.example.Dto.Message<`cloud` *org.example.Dto.Cloud<`region` STRING>*, `customField` STRING, `ecs` *org.example.Dto.Ecs<`version` STRING>*, `event` *org.example.Dto.Event<`action` STRING, `id` STRING, `kind` STRING, `outcome` STRING, `provider` STRING, `type` STRING, `size` BIGINT NOT NULL>*, `source` *org.example.Dto.Source<`address` STRING>*, `timestamp` STRING>*,
  `id` STRING,
  `timestamp` STRING,
  `integration` STRING,
  `configuration` STRING,
  `time_ltz` TIMESTAMP(3)
)

StackTrace

Exception in thread "main" org.apache.flink.table.api.ValidationException: SQL validation failed. From line 15, column 22 to line 15, column 26: Table 'event.message.event' not found
    at org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:187)
    at org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:113)
    at org.apache.flink.table.planner.operations.SqlToOperationConverter.convert(SqlToOperationConverter.java:281)
    at org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:106)
    at org.apache.flink.table.api.internal.TableEnvironmentImpl.sqlQuery(TableEnvironmentImpl.java:738)
    at org.example.StaticRuleEvaluator.main(StaticRuleEvaluator.java:115)
Caused by: org.apache.calcite.runtime.CalciteContextException: From line 15, column 22 to line 15, column 26: Table 'event.message.event' not found
    at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
