Questions tagged [amazon-kinesis-analytics]

Amazon Kinesis Data Analytics is a service for analyzing streaming data, gaining actionable insights, and responding in real time. SQL users can query streaming data or build entire streaming applications using templates and an interactive SQL editor. Java developers can build streaming applications using open source Java libraries and AWS integrations to transform and analyze data in real time.

133 questions
2
votes
0 answers

Flink processing events too slow

I am using a Kinesis data stream as a source and Elasticsearch as a sink, running a Flink job in an AWS Kinesis Data Analytics application. Sample event: {"area":"sessions","userId":4450,"date":"2021-12-03T11:00:00","videoDuration":5} I am collecting…
2
votes
0 answers

Changing CloudWatch log output from a "Kinesis Data Analytics for Apache Flink" app

Does anyone know how to change the CloudWatch log output from a "Kinesis Data Analytics for Apache Flink" app? There are two things I'd like to change: the fields in the JSON written to CloudWatch, and the contents/format of the "message" field (i.e.,…
2
votes
1 answer

Flink checkpoint size is growing over 20 GB and checkpoints take over 1 minute

First and foremost: I'm kind of new to Flink (I understand the principles and am able to create any basic streaming job I need). I'm using Kinesis Analytics to run my Flink job, and by default it's using incremental checkpointing with a 1 minute…
2
votes
0 answers

AWS Kinesis concatenating fields on aggregation

When using AWS Kinesis Data Analytics, is it possible to concatenate a field from the input rows when aggregating events in a stagger window into a single output row? And if so, how can we go about implementing this concatenation during the…
2
votes
0 answers

AppSync integration with Kinesis

I have a use case to process and aggregate real-time data using Kinesis Data Analytics. Is it possible to publish data to Kinesis streams directly from AppSync (without an intermediate Lambda)? And also to trigger a subscription back from Kinesis…
2
votes
3 answers

How to merge Kinesis data streams into one for Kinesis data analytics?

I have multiple AWS Kinesis data streams/Firehose streams with structured data in CSV format. I need to perform analytics on that data with Kinesis Data Analytics. But how can I merge multiple streams into one? Because Kinesis Data Analytics gets data only…
2
votes
1 answer

How to share a cache in a Flink Kinesis stream

I've been using Flink and Kinesis Analytics recently. I have a stream of data, and I also need a cache to be shared with the stream. To share the cache data with the Kinesis stream, it's connected to a broadcast stream. The cache source extends…
2
votes
0 answers

Kinesis Analytics Flink write Parquet file

Using Amazon Kinesis Analytics with a Java Flink application, I am taking data from a Firehose and trying to write it to an S3 bucket as a series of Parquet files. I am hitting the following exception in my CloudWatch logs, which is the only error I…
J T
2
votes
1 answer

Adding constant value to stream in Kinesis Analytics application

In my Kinesis Analytics application I want to add a constant string to my output stream. For example: CREATE OR REPLACE STREAM "DESTINATION_SQL_STREAM" ( "constant_column" varchar(100), "feature" varchar(246) …
Farseer
1
vote
0 answers

AWS Kinesis Firehose, Transformation Lambda and OpenSearch log processing

I'm having some issues publishing the logs from CloudWatch -> Kinesis Stream -> Kinesis Delivery Stream -> Transformation Lambda -> AWS OpenSearch. While the documentation is straightforward, I struggle with the transformation Lambda and insertion to…
1
vote
1 answer

Flink operation does not distribute the incoming messages equally to all subtasks

I have a Java Flink (version 1.15) application with an Async I/O operation running in AWS Kinesis Flink Runtime with parallelism set to 12. The operation reads a stream of messages from a FlinkKinesisConsumer and processes it in an async…
Shankar
1
vote
1 answer

Can I avoid network shuffle when creating a KeyedStream from a pre-partitioned Kinesis Data Stream in Apache Flink?

Is it possible to create a KeyedStream from a pre-sharded/pre-partitioned Kinesis Data Stream without the need for a network shuffle (i.e. using reinterpretAsKeyedStream or something similar)? If that is not possible (i.e. the only reliable way is to…
r_g_s_
1
vote
0 answers

Ingesting data from Kinesis Streams using Apache Flink (AWS KDA)

I am getting this error while reading data from Kinesis Streams (4 shards) with Flink. Apparently, by default, Flink subscribes itself to the topic every 5 minutes, and I set parallelism to 3. I also checked this metric: Read throughput exceeded - average…
1
vote
1 answer

Is it possible to consume Firehose as input to Apache Flink on AWS Kinesis Data Analytics?

As with AWS Kinesis Data Analytics SQL (legacy), is it possible for a Flink application on KDA to consume a Firehose stream as input?
1
vote
0 answers

How to slow down reads in Kinesis Consumer Library?

We have an aggregation system where the aggregator is a KDA application running Flink, which aggregates the data over a 6-hour time window and puts all the data into an AWS Kinesis data stream. We also have a consumer application that uses the KCL 2.x library…