Questions tagged [amazon-kinesis]

Amazon Kinesis is a fully managed service for real-time processing of streaming data at massive scale. It can collect and process hundreds of terabytes of data per hour from hundreds of thousands of sources, so you can write applications that process information in real time from sources such as website clickstreams, marketing and financial information, manufacturing instrumentation, social media, operational logs, and metering data.

With Amazon Kinesis applications, you can build real-time dashboards, capture exceptions and generate alerts, drive recommendations, and make other real-time business or operational decisions. You can also send data to other services such as Amazon Simple Storage Service (Amazon S3), Amazon DynamoDB, or Amazon Redshift. With a few clicks and a couple of lines of code, you can start building applications that respond to changes in your data stream within seconds, at any scale, while paying only for the resources you use.

1802 questions
7 votes, 1 answer

Kinesis Shards VS Partition Key

If, when creating a Kinesis data stream, I set the number of shards to, say, 10, and every time I put a record I assign it a random partition key like this: var putRecord = new PutRecord { Data = data…
Sameed
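For context on how a partition key relates to a shard: Kinesis Data Streams hashes each partition key with MD5 into a 128-bit integer and routes the record to the shard whose hash key range contains that value. The following is a minimal Python sketch of that mapping, not the service's implementation. It assumes the shards split the hash space evenly, which is only approximately true for a newly created stream and no longer holds after uneven resharding; `hash_key` and `shard_index` are hypothetical helper names.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 digests are 128 bits wide


def hash_key(partition_key: str) -> int:
    """Interpret the MD5 digest of the partition key as a big-endian integer."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_index(partition_key: str, shard_count: int) -> int:
    """Approximate shard position, ASSUMING an even split of the hash space."""
    return hash_key(partition_key) * shard_count // HASH_SPACE


if __name__ == "__main__":
    import uuid
    from collections import Counter

    # Random partition keys should spread roughly evenly over 10 shards.
    counts = Counter(shard_index(str(uuid.uuid4()), 10) for _ in range(10_000))
    for index in sorted(counts):
        print(index, counts[index])
```

For a real stream, the actual ranges come from the shard descriptions rather than from an assumed even split.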
7 votes, 1 answer

How to handle reprocessing scenarios in AWS Kinesis?

I am exploring AWS Kinesis for a data processing requirement that replaces old batch ETL processing with a stream-based approach. One of the key requirements for this project is the ability to reprocess data in cases when: a bug is discovered and…
Rahul
7 votes, 2 answers

How can I join a spark live stream with all the data collected by another stream during its entire life cycle?

I have two Spark streams. The first carries data related to products: their price to the supplier, the currency, their description, and the supplier ID. This data is enriched with the category, guessed from an analysis of the description and the price…
7 votes, 0 answers

Kinesis Shard GetRecords.IteratorAgeMilliseconds reached maximum 86.4M (1 day) and does not decrease even though consuming

I am consuming a Kinesis stream with Spark Streaming 2.2.0 using spark-streaming-kinesis-asl_2.11. The Kinesis stream has 150 shards, and I am monitoring the GetRecords.IteratorAgeMilliseconds CloudWatch metric to see whether the consumer is keeping up with…
Grega Kešpret
7 votes, 1 answer

Spark Streaming Guarantee Specific Start Window Time

I'm using Spark Streaming to read data from Kinesis with the Structured Streaming framework. My connection is as follows: val kinesis = spark.readStream.format("kinesis").option("streams", streamName).option("endpointUrl", endpointUrl) …
7 votes, 2 answers

Kinesis Lambda Consumer Minimum Batch Size

I'm using AWS Lambda (Node.js) as an AWS Kinesis consumer. I can see that you can set a maximum batch size, but I'm wondering whether I can set a minimum batch size, so that I can ensure that each Lambda will handle at least 50 (or any number)…
7 votes, 1 answer

What exactly does sequenceNumberForOrdering do when putting records into a Kinesis stream with the Java SDK?

I'm a bit confused about the AWS docs for putting records into a Kinesis stream (https://docs.aws.amazon.com/kinesis/latest/APIReference/API_PutRecord.html#API_PutRecord_RequestSyntax). It says that setting sequenceNumberForOrdering should be used…
EagleBeak
7 votes, 1 answer

How to use ExplicitHashKey for round robin stream assignment in AWS Kinesis

I am trying to pump a lot of data through Amazon Kinesis (on the order of 10,000 points per second). To maximize records per second through my shards, I'd like to round-robin my requests over the shards (my application logic doesn't care what shard…
deadcode
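One way the round-robin idea in this question could be sketched in Python is shown below. It relies on the `ExplicitHashKey` request parameter, a decimal string that overrides the hash of the partition key, and cycles through one hash key per shard. This is a sketch under assumptions, not a recommended implementation: `even_starting_hash_keys` and `put_round_robin` are hypothetical helpers, the even split of the hash space is assumed, and `client` is expected to be any object exposing a boto3-style `put_record` method.

```python
from itertools import cycle

HASH_SPACE = 2 ** 128  # size of the Kinesis hash key space


def even_starting_hash_keys(shard_count):
    """First hash key of each shard, ASSUMING an even split of the hash space."""
    return [str(i * HASH_SPACE // shard_count) for i in range(shard_count)]


def put_round_robin(client, stream_name, payloads, starting_hash_keys):
    """Send each payload to the next shard in turn via ExplicitHashKey."""
    keys = cycle(starting_hash_keys)
    for payload in payloads:
        client.put_record(
            StreamName=stream_name,
            Data=payload,
            PartitionKey="round-robin",  # still required, but overridden below
            ExplicitHashKey=next(keys),
        )
```

With a real stream, the starting hash keys would more safely be read from the `HashKeyRange` of each shard returned by `list_shards` than computed from an assumed even split.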
7 votes, 1 answer

Kafka like offset on Kinesis Stream?

I have worked a bit with Kafka in the past, and lately there is a requirement to port part of the data pipeline to an AWS Kinesis stream. I have read that Kinesis is effectively a fork of Kafka and shares many similarities. However, I have failed to…
Mangat Rai Modi
7 votes, 2 answers

put_records() only accepts keyword arguments in Kinesis boto3 Python API

from __future__ import print_function  # Python 2/3 compatibility
import boto3
import json
import decimal
#kinesis = boto3.resource('kinesis', region_name='eu-west-1')
client = boto3.client('kinesis')
with open("questions.json") as json_file:
    …
Anshuman Ranjan
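As the title's error message suggests, boto3 client methods take keyword arguments only, so `put_records` needs `StreamName=` and `Records=` spelled out. The sketch below shows one way to build the record list and send it in batches of at most 500, the per-request limit of PutRecords. It is an illustration only: `build_records`, `put_all`, and the `key_field` parameter are hypothetical names, and `client` is any object with a boto3-style `put_records` method.

```python
import json

MAX_RECORDS_PER_CALL = 500  # PutRecords accepts at most 500 records per request


def build_records(items, key_field):
    """Turn dicts into PutRecords entries; key_field picks the partition key."""
    return [
        {
            "Data": json.dumps(item).encode("utf-8"),
            "PartitionKey": str(item[key_field]),
        }
        for item in items
    ]


def put_all(client, stream_name, items, key_field):
    """Send all items in batches, passing every argument by keyword."""
    records = build_records(items, key_field)
    responses = []
    for start in range(0, len(records), MAX_RECORDS_PER_CALL):
        batch = records[start:start + MAX_RECORDS_PER_CALL]
        responses.append(
            client.put_records(StreamName=stream_name, Records=batch)
        )
    return responses
```

Each response includes a `FailedRecordCount` field, so a caller would normally inspect it and retry the entries that failed.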
7 votes, 1 answer

Auto wire kinesis stream to kinesis firehose?

I'm publishing data to a Kinesis stream that is processed by some consumers. I'd like the raw data published to the stream to also be stored in S3. Is it possible to auto-wire a Kinesis stream to a Kinesis Firehose, or do I need to directly publish…
devshorts
7 votes, 2 answers

Writing to Kinesis stream using AWS Lambda Function

Can we create a Lambda function that is executed when we write a record to a DynamoDB table, and have that record written to a Kinesis stream? Basically, can we write to a Kinesis stream using a Lambda function? If yes, please share sample code for…
7 votes, 1 answer

If a AWS Lambda function has event sources from multiple Kinesis streams, will the batch of incoming records be from a single Kinesis stream or a mix?

The title might be a bit confusing, so I'll try my best to make it clearer. Suppose I have an AWS Lambda function that has two different Kinesis streams, A and B, as input event sources. So, for the below, since a KinesisEvent instance contains a batch of…
Gordon Tai
7 votes, 2 answers

Amazon Kinesis emulator

We are evaluating real-time event processing engines (like Twitter Storm). One of the options is the recently released Amazon Kinesis. I'm wondering if there is any sort of emulator or sandbox environment available that would allow us to play around with…
diy
6 votes, 0 answers

AWS (GovCloud) Lambda Destination Not Triggering

I am working in AWS GovCloud and have the following configuration in AWS Lambda: a Lambda function which decodes a payload; a Kinesis stream set as a trigger for the aforementioned function; a Lambda Destination (we have tried Lambda functions as well…
FantasticSponge