Questions tagged [amazon-kinesis-analytics]

Amazon Kinesis Data Analytics is a way to analyze streaming data, gain actionable insights, and respond in real time. SQL users can query streaming data or build entire streaming applications using templates and an interactive SQL editor. Java developers can build streaming applications using open source Java libraries and AWS integrations to transform and analyze data in real time.

133 questions
0
votes
0 answers

AWS Kinesis Data Analytics (Apache Flink java app) crashing randomly on Kinesis inputstream RuntimeException

I have three different Apache Flink applications running on three separate AWS Kinesis Data Analytics applications, all consuming data from the same Kinesis input stream. After a while, all of the Flink applications randomly crash and restart with the error…
0
votes
1 answer

Kinesis data stream skipping records with exception at downstream

I have an application with the following setup. Kinesis data stream (retention period: 1 day) -> StreamExecutionEnvironment.getExecutionEnvironment().addSource(new FlinkKinesisConsumer).map(new MapFunction()).addSink(); When the MapFunction…
0
votes
1 answer

kinesis data stream performance testing with partition key

I am using the Kinesis Data Generator tool and I was wondering how to define the partition key in the test data so that the data is distributed evenly across all the shards. https://awslabs.github.io/amazon-kinesis-data-generator/web/producer.html
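As background for this question: Kinesis Data Streams picks a shard by taking the MD5 hash of the partition key as a 128-bit integer and finding the shard whose hash key range contains it. The sketch below only models that mapping, assuming the shards split the hash range evenly (true for a freshly created stream, not necessarily after resharding). The helper names `shard_for_key` and `distribution` are hypothetical, and nothing here describes how the Kinesis Data Generator itself sets the key.

```python
import hashlib
import uuid
from collections import Counter

HASH_SPACE = 2 ** 128  # Kinesis hash keys are 128-bit integers


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard that would receive this key.

    Assumption: the shards divide the 128-bit hash key range evenly.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * shard_count // HASH_SPACE


def distribution(keys, shard_count: int) -> list:
    """Count how many of the given keys land on each shard index."""
    counts = Counter(shard_for_key(k, shard_count) for k in keys)
    return [counts.get(i, 0) for i in range(shard_count)]


if __name__ == "__main__":
    # High-cardinality random keys spread roughly evenly over the shards.
    random_keys = [str(uuid.uuid4()) for _ in range(10_000)]
    print("random keys:  ", distribution(random_keys, 4))

    # A single constant key sends every record to the same shard.
    print("constant key: ", distribution(["fixed-key"] * 10_000, 4))
```

Under these assumptions, any key with many distinct values (for example a random UUID per record) should spread records roughly evenly, while a constant or low-cardinality key concentrates them on a few shards.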
zimmerdimmer
0
votes
1 answer

Cannot connect Flink to Elasticache Redis cluster - FlinkJedisClusterConfig unable to parse cport in CLUSTER NODES response

How can I use an Elasticache Redis Replication Group as a data sink in Flink for Kinesis Analytics? I have created an Elasticache Redis Replication Group, and would like to compute something in Flink and store the results in this group. My Java…
0
votes
1 answer

ClassNotFoundException while running jar on Amazon Kinesis Streaming Analytics app

I have created a Kinesis Analytics Streaming Application in Spring Boot which will consume messages from the Amazon Kinesis input stream and will do some operations on top of it using the Apache Flink DataStream library. When I am uploading the…
Jay
0
votes
2 answers

how to join two data streams along with sliding window function in Flink Table API?

I have two streaming tables from two Kafka topics, and I want to join these streams and perform an aggregate function on the joined data. The streams need to be joined using a sliding window. On joining and windowing the data, I am getting an error Rowtime…
0
votes
1 answer

Why does my watermark not advance in my Apache Flink keyed stream?

I am currently using Apache Flink 1.13.2 with Java for my streaming application. I am using a keyed function with no window function. I have implemented a watermark strategy and autoWatermarkInterval config per the documentation, although my…
Ryan
0
votes
1 answer

Recalculate historical data using Apache Beam

I have an Apache Beam streaming project that calculates data and writes it to the database. What is the best way to reprocess all historical records, without a big delay, after a bug fix or after changing the way it processes data?
0
votes
1 answer

Combine two keys in Apache Beam

I have an Apache Beam streaming project that uses Combine.perKey(). I need to be able to merge entities from my admin tool (to point one entity to another one). How can I combine two keys with calculated data in Beam? It's easy to do it for the new…
0
votes
1 answer

I have configured my Flink Application using PyFlink, but I want to change the Job Name

I have configured Amazon Kinesis Data Analytics using PyFlink, but I want to change the Job Name to whatever I want. How can I do this?
0
votes
1 answer

Kinesis Firehose Lambda Transformation and Dynamic partition

The following data is from the Faker library. I am trying to learn and implement dynamic partitioning in Kinesis Firehose. Sample payload input: { "name":"Dr. Nancy Mcmillan", "phone_numbers":"8XXXXX", "city":"Priscillaport", …
0
votes
1 answer

Apache Flink StreamingFileSink making several HEAD requests while writing to S3 which causes ratelimiting

I have an Apache Flink application that I have deployed on Kinesis Data Analytics. This application reads from Kafka and writes to S3. The S3 bucket structure it writes to is computed using a BucketAssigner. A stripped down version of the…
Vinod Mohanan
0
votes
0 answers

Heavy back pressure and huge checkpoint size

I have an Apache Flink application that I have deployed on Kinesis Data Analytics. Payload schema processed by the application (simplified version): { id:String= uuid (each request gets one), category:string= uuid (we have 10 of…
Vinod Mohanan
0
votes
2 answers

How to update/refresh a parameter in Flink application

I have a Flink application on the AWS Kinesis Analytics service. I need to filter some values on a data stream based on a threshold. I'm passing the threshold parameter using the AWS Systems Manager Parameter Store service. For now, I have this: In my…
0
votes
1 answer

Flink - DynamoDB source

I'm new to working with real-time applications. Currently, I'm using AWS Kinesis/Flink and Scala. I have the following architecture: (image: old architecture) As you can see, I consume a CSV file using CSVTableSource. Unfortunately, the CSV file became too big…