Questions tagged [amazon-kinesis]

Amazon Kinesis is a fully managed service for real-time processing of streaming data at massive scale.

Amazon Kinesis can collect and process hundreds of terabytes of data per hour from hundreds of thousands of sources, allowing you to easily write applications that process information in real time, from sources such as website click-streams, marketing and financial information, manufacturing instrumentation, social media, operational logs, and metering data.

With Amazon Kinesis applications, you can build real-time dashboards, capture exceptions and generate alerts, drive recommendations, and make other real-time business or operational decisions. You can also easily send data to a variety of other services such as Amazon Simple Storage Service (Amazon S3), Amazon DynamoDB, or Amazon Redshift. In a few clicks and a couple of lines of code, you can start building applications which respond to changes in your data stream in seconds, at any scale, while only paying for the resources you use.
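As a concrete illustration of the "couple of lines of code" above, here is a minimal producer sketch using the Python SDK (boto3); the stream name and the sample event are placeholders, not taken from any question below.

```python
import json
import boto3

kinesis = boto3.client("kinesis", region_name="us-east-1")

# Send one click-stream event; records with the same PartitionKey
# are routed to the same shard and kept in order there.
kinesis.put_record(
    StreamName="my-click-stream",  # placeholder stream name
    Data=json.dumps({"page": "/home", "user": "u-123"}).encode("utf-8"),
    PartitionKey="u-123",
)
```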


1802 questions
13 votes, 1 answer

Reliability issues with Checkpointing/WAL in Spark Streaming 1.6.0

We have a Spark Streaming 1.5.2 application in Scala that reads JSON events from a Kinesis stream, does some transformations/aggregations, and writes the results to different S3 prefixes. The current batch interval is 60 seconds. We have…
13 votes, 2 answers

How to decide total number of partition keys in AWS kinesis stream?

In a producer-consumer web application, what should be the thought process for creating a partition key for a Kinesis stream shard? Suppose I have a Kinesis stream with 16 shards; how many partition keys should I create? Is it really dependent on the…
shivba • 181 • 1 • 2 • 6
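For the 16-shard question above, one way to reason about partition keys is to look at how Kinesis maps a key to a shard: the key is MD5-hashed to a 128-bit integer and matched against each shard's hash key range. A rough boto3 sketch (the stream name is a placeholder):

```python
import hashlib
import boto3

kinesis = boto3.client("kinesis")

def shard_for_key(stream_name, partition_key):
    """Return the shard whose hash key range contains MD5(partition_key)."""
    hashed = int(hashlib.md5(partition_key.encode("utf-8")).hexdigest(), 16)
    # Note: list_shards paginates via NextToken on large streams; omitted here.
    for shard in kinesis.list_shards(StreamName=stream_name)["Shards"]:
        rng = shard["HashKeyRange"]
        if int(rng["StartingHashKey"]) <= hashed <= int(rng["EndingHashKey"]):
            return shard["ShardId"]

# e.g. shard_for_key("my-stream", "user-42")
```

The practical upshot is that the number of distinct keys does not need to equal the shard count; it only needs to be large enough, and evenly distributed enough, to spread records across all 16 hash ranges.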
12 votes, 1 answer

Amazon Kinesis vs AWS Managed Streaming for Kafka (MSK) - (Connect from on-prem)

I'm evaluating AWS Kinesis vs Managed Streaming for Kafka (MSK). Our requirement is to send some messages (JSON) to AWS from an on-prem system (developed in C++). Then we need to persist the above messages into a relational database like…
HASH • 123 • 1 • 5
12 votes, 2 answers

Boto3 Kinesis Video GetMedia and OpenCV

I'm trying to use Boto3 to get a video stream from Kinesis and then use OpenCV to display the feed and save it to a file at the same time. The process of getting the signed URL and then the GetMedia request seems to work perfectly; it's just when I'm…
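For reference, a rough sketch of the GetMedia flow the question describes, using boto3 and OpenCV. A common pattern is to write the MKV fragments from the `Payload` stream to a file and let `cv2.VideoCapture` decode that, since OpenCV cannot read the raw fragment stream object directly; the stream name and file path are placeholders.

```python
import boto3
import cv2

STREAM = "my-kvs-stream"  # placeholder

kvs = boto3.client("kinesisvideo", region_name="us-east-1")
endpoint = kvs.get_data_endpoint(StreamName=STREAM, APIName="GET_MEDIA")["DataEndpoint"]

media = boto3.client("kinesis-video-media", endpoint_url=endpoint, region_name="us-east-1")
resp = media.get_media(StreamName=STREAM, StartSelector={"StartSelectorType": "NOW"})

# Dump the MKV fragments to a local file, then decode with OpenCV.
with open("/tmp/feed.mkv", "wb") as f:
    # For a live stream, GetMedia never ends; grab ~50 MB for this sketch.
    for _ in range(50):
        chunk = resp["Payload"].read(1024 * 1024)
        if not chunk:
            break
        f.write(chunk)

cap = cv2.VideoCapture("/tmp/feed.mkv")
while True:
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imshow("feed", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
```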
12 votes, 2 answers

Multiple Destinations for Kinesis

Can we have multiple destinations from a single Kinesis Firehose? I saw this picture. From this, it looks like it is possible to add S3, Redshift, and Elasticsearch from a single Firehose. That is exactly what I want to do, but when I do it from the AWS console,…
hatellla • 4,796 • 8 • 49 • 101
12 votes, 4 answers

Processing DynamoDB streams using the AWS Java DynamoDB streams Kinesis adapter

I'm attempting to capture DynamoDB table changes using DynamoDB streams and the AWS provided Java DynamoDB streams Kinesis adapter. I'm working with the AWS Java SDKs in a Scala app. I started by following the AWS guide and by going through the AWS…
francis • 5,889 • 3 • 27 • 51
12 votes, 3 answers

Stream data from MySQL Binary Log to Kinesis

We have a write-intensive table (on AWS RDS MySQL) from a legacy system and we'd like to stream every write event (insert or update) from that table to Kinesis. The idea is to create a pipe to warm up caches and update search engines. Currently we…
David Lojudice Sb. • 1,302 • 1 • 15 • 26
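One approach often suggested for this kind of pipeline (sketched below; not necessarily what the asker ended up with) is to tail the binlog with the `python-mysql-replication` library and forward each row event to Kinesis. Hostnames, credentials, and the stream name are placeholders.

```python
import json
import boto3
from pymysqlreplication import BinLogStreamReader
from pymysqlreplication.row_event import WriteRowsEvent, UpdateRowsEvent

kinesis = boto3.client("kinesis")

# Requires binlog_format=ROW on the MySQL/RDS instance.
stream = BinLogStreamReader(
    connection_settings={"host": "my-rds-host", "port": 3306,
                         "user": "repl", "passwd": "secret"},  # placeholders
    server_id=100,                 # must be unique among replication clients
    only_events=[WriteRowsEvent, UpdateRowsEvent],
    blocking=True,
    resume_stream=True,
)

for event in stream:
    for row in event.rows:
        kinesis.put_record(
            StreamName="table-changes",                        # placeholder
            Data=json.dumps(row, default=str).encode("utf-8"),
            PartitionKey=f"{event.schema}.{event.table}",
        )
```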
12 votes, 2 answers

Explain Kinesis Shard Iterator - AWS Java SDK

OK, I'll start with an elaborated use case and then explain my question: I use a third-party web analytics platform which utilizes AWS Kinesis streams in order to pass data from the client into the final destination, a Kinesis stream. The web…
Yuval Herziger • 1,145 • 2 • 16 • 28
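In boto3 terms, a shard iterator is just a cursor into one shard; where it starts is controlled by `ShardIteratorType`. A minimal consumer sketch (stream and shard names are placeholders):

```python
import time
import boto3

kinesis = boto3.client("kinesis")

# TRIM_HORIZON = oldest available record, LATEST = only new records,
# AT_SEQUENCE_NUMBER / AFTER_SEQUENCE_NUMBER = resume from a known position.
iterator = kinesis.get_shard_iterator(
    StreamName="my-stream",
    ShardId="shardId-000000000000",
    ShardIteratorType="TRIM_HORIZON",
)["ShardIterator"]

while iterator:
    resp = kinesis.get_records(ShardIterator=iterator, Limit=1000)
    for record in resp["Records"]:
        print(record["SequenceNumber"], record["Data"])
    iterator = resp.get("NextShardIterator")  # pass this to the next call
    time.sleep(1)  # stay under the 5 GetRecords calls/sec per-shard limit
```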
11 votes, 3 answers

Schema registry on AWS

I'm evaluating Kinesis as a replacement for Kafka. One of the things I'm missing is a Schema Registry equivalent. In particular I need: schema upgrades (validating compatibility with the previous version); versioning Avro schemas in a similar way as…
czajek • 714 • 1 • 9 • 23
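One AWS-native option that has since appeared for this is the AWS Glue Schema Registry. The sketch below only illustrates its boto3 surface (registry and schema names are placeholders); it is not a claim that it matches every Confluent Schema Registry feature the question has in mind.

```python
import boto3

glue = boto3.client("glue")

# Create a registry and an Avro schema with a BACKWARD compatibility mode.
glue.create_registry(RegistryName="my-registry")
glue.create_schema(
    RegistryId={"RegistryName": "my-registry"},
    SchemaName="click-event",
    DataFormat="AVRO",
    Compatibility="BACKWARD",
    SchemaDefinition='{"type":"record","name":"Click","fields":[{"name":"page","type":"string"}]}',
)

# A new version is checked against the configured compatibility mode.
glue.register_schema_version(
    SchemaId={"RegistryName": "my-registry", "SchemaName": "click-event"},
    SchemaDefinition='{"type":"record","name":"Click","fields":['
                     '{"name":"page","type":"string"},'
                     '{"name":"user","type":["null","string"],"default":null}]}',
)
```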
11 votes, 1 answer

Using the partition key in Kinesis to guarantee that records with the same key are processed by the same record processor (lambda)

I am working on a real-time data pipeline with AWS Kinesis and Lambda, and I am trying to figure out how I can guarantee that records from the same data producer are processed by the same shard and ultimately by the same Lambda function instance. My…
oneschilling • 463 • 5 • 11
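Sketching the producer side of the guarantee the question is after: if every record from a given producer uses that producer's ID as the `PartitionKey`, all of its records hash to the same shard, and (at the default parallelization factor) a shard's batches are handed to one Lambda invocation at a time. Names below are placeholders.

```python
import json
import boto3

kinesis = boto3.client("kinesis")

def publish(producer_id, events):
    """Send one producer's events to a single shard (same PartitionKey)."""
    resp = kinesis.put_records(
        StreamName="pipeline-stream",                    # placeholder
        Records=[
            {
                "Data": json.dumps(e).encode("utf-8"),
                "PartitionKey": producer_id,             # same key -> same shard
            }
            for e in events
        ],
    )
    # If strict ordering matters, check FailedRecordCount and retry failures
    # before sending the next batch for this producer.
    return resp["FailedRecordCount"]

publish("sensor-42", [{"t": 1}, {"t": 2}, {"t": 3}])
```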
11 votes, 3 answers

Amazon Kinesis: Caught exception while sync'ing Kinesis shards and leases

I am trying to make Snowplow work on AWS. When I try to run the stream-enrich service on an instance, I get this exception: [main] INFO com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker - Syncing Kinesis shard info [main] ERROR…
Prakhar Mishra • 1,586 • 4 • 28 • 52
11 votes, 6 answers

Partition Kinesis firehose S3 records by event time

Firehose->S3 uses the current date as a prefix for creating keys in S3, so this partitions the data by the time the record is written. My Firehose stream contains events which have a specific event time. Is there a way to create S3 keys containing…
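Firehose later added dynamic partitioning, where a transformation Lambda can attach partition keys derived from the event time. The sketch below assumes that feature is enabled on the delivery stream and that each record is a JSON object with an epoch-seconds `event_time` field; both are assumptions, not details from the question.

```python
import base64
import json
from datetime import datetime

def handler(event, context):
    """Firehose transformation Lambda: tag each record with event-time partition keys."""
    output = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))
        ts = datetime.utcfromtimestamp(payload["event_time"])   # assumed field
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": record["data"],
            "metadata": {               # only used when dynamic partitioning is enabled
                "partitionKeys": {
                    "year": ts.strftime("%Y"),
                    "month": ts.strftime("%m"),
                    "day": ts.strftime("%d"),
                }
            },
        })
    return {"records": output}
```

The delivery stream's S3 prefix would then reference these keys with something like `!{partitionKeyFromLambda:year}/!{partitionKeyFromLambda:month}/!{partitionKeyFromLambda:day}/`.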
11 votes, 1 answer

AWS Lambda Limits when processing Kinesis Stream

Can someone explain what happens to events when a Lambda is subscribed to Kinesis item-create events? There is a limit of 100 concurrent requests for an account in AWS, so if 1,000,000 items are added to Kinesis, how are the events handled? Are they…
Ryan Fisch • 2,614 • 5 • 36 • 57
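What bounds the concurrency here is the event source mapping: Lambda polls each shard and, by default, processes one batch per shard at a time, so records queue on the shards rather than being dropped. A hedged sketch of the knobs involved, with placeholder ARNs and names:

```python
import boto3

lam = boto3.client("lambda")

lam.create_event_source_mapping(
    EventSourceArn="arn:aws:kinesis:us-east-1:123456789012:stream/my-stream",  # placeholder
    FunctionName="process-kinesis-records",                                    # placeholder
    StartingPosition="TRIM_HORIZON",
    BatchSize=500,                 # records handed to each invocation
    ParallelizationFactor=1,       # concurrent batches per shard (1-10)
    MaximumRetryAttempts=3,        # eventually give up on a poison batch
    BisectBatchOnFunctionError=True,
)
```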
10 votes, 1 answer

What is the difference between AWS Kinesis and EventBridge

I'm an AWS noob trying to figure out the difference between Amazon's Kinesis Data Streams and EventBridge products. Can someone explain this for someone not familiar with the AWS tech stack?
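A concrete way to see the difference is the write path of each service: Kinesis Data Streams takes opaque records on a sharded stream that you read yourself, while EventBridge takes structured events that it routes to targets via rules. Both calls below use placeholder names.

```python
import json
import boto3

# Kinesis Data Streams: you own the stream, its shards, and the consumers.
boto3.client("kinesis").put_record(
    StreamName="orders-stream",                           # placeholder
    Data=json.dumps({"order_id": 1, "total": 9.99}).encode("utf-8"),
    PartitionKey="order-1",
)

# EventBridge: you publish an event and rules decide which targets receive it.
boto3.client("events").put_events(
    Entries=[{
        "EventBusName": "default",
        "Source": "my.app.orders",                        # placeholder
        "DetailType": "OrderPlaced",
        "Detail": json.dumps({"order_id": 1, "total": 9.99}),
    }]
)
```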
10 votes, 1 answer

How to build and use flink-connector-kinesis?

I'm trying to use Apache Flink with AWS Kinesis. The documentation says that I have to build the connector on my own, so I built the connector, added the jar file to my project, and also put the dependency in my pom.xml file.