Questions tagged [amazon-kinesis]

Amazon Kinesis is a fully managed service for real-time processing of streaming data at massive scale.

Amazon Kinesis can collect and process hundreds of terabytes of data per hour from hundreds of thousands of sources, allowing you to easily write applications that process information in real time from sources such as website clickstreams, marketing and financial information, manufacturing instrumentation, social media, operational logs, and metering data.

With Amazon Kinesis applications, you can build real-time dashboards, capture exceptions and generate alerts, drive recommendations, and make other real-time business or operational decisions. You can also easily send data to a variety of other services such as Amazon Simple Storage Service (Amazon S3), Amazon DynamoDB, or Amazon Redshift. In a few clicks and a couple of lines of code, you can start building applications that respond to changes in your data stream in seconds, at any scale, while paying only for the resources you use.

1802 questions
10
votes
1 answer

Controlling the number of spawned futures to create backpressure

I am using a futures-rs powered version of the Rusoto AWS Kinesis library. I need to spawn a deep pipeline of AWS Kinesis requests to achieve high throughput because Kinesis has a limit of 500 records per HTTP request. Combined with the 50 ms latency…
xrl
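
One way to picture the backpressure idea in the question above is the sketch below. It is written in Python rather than Rust/futures-rs and does not use Rusoto; the `send` callable and the `max_in_flight` parameter are hypothetical stand-ins for whatever client call and pipeline depth you actually use. It groups records into batches of at most 500 (the per-request limit cited in the question) and blocks the producer whenever `max_in_flight` requests are still pending.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")

MAX_RECORDS_PER_REQUEST = 500  # per-request record limit cited in the question


def batch_records(records: Iterable[T],
                  batch_size: int = MAX_RECORDS_PER_REQUEST) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def send_with_backpressure(batches: Iterable[List[T]],
                           send: Callable[[List[T]], R],
                           max_in_flight: int = 4) -> List[R]:
    """Submit batches to `send`, never allowing more than max_in_flight pending calls.

    `send` is a hypothetical callable standing in for a real Kinesis request.
    """
    slots = threading.BoundedSemaphore(max_in_flight)
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        futures = []
        for batch in batches:
            slots.acquire()  # blocks the producer while the pipeline is full
            future = pool.submit(send, batch)
            # Free a slot as soon as this request finishes (success or failure).
            future.add_done_callback(lambda _f: slots.release())
            futures.append(future)
        return [f.result() for f in futures]


# Usage with a stand-in sender that just reports the batch size.
sizes = send_with_backpressure(batch_records(range(1200)), send=len)
```

Because `batch_records` is a generator, the source is consumed only as fast as slots free up, which is the backpressure effect; a futures-rs version would express the same bound differently.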
10
votes
1 answer

Event Sourcing with Kinesis - Replaying and Persistence

I am trying to implement an event-driven architecture using Amazon Kinesis as the central event log of the platform. The idea is pretty much the same as the one presented by Nordstrom with the Hello-Retail project. I have done similar things with…
10
votes
3 answers

Storing Firehose transferred files in S3 under custom directory names

We primarily do bulk transfer of incoming clickstream data through the Kinesis Firehose service. Our system is a multi-tenant SaaS platform. The incoming clickstream data is stored in S3 through Firehose. By default, all the files are stored under…
Sriram V
10
votes
2 answers

Is there a way to specify a file extension for the file saved to S3 by Kinesis Firehose?

I am setting up a Kinesis Firehose stream, and everything works well, with delimited files getting created on S3. But I was wondering if there is a way to specify an extension for this file, since the consumer of this file requires it to be…
arjunj
10
votes
3 answers

How to deploy and run an Amazon Kinesis application on the Amazon Kinesis service

I am trying to understand how to deploy an Amazon Kinesis client application that was built using the Kinesis Client Library (KCL). I found this, but it only states: You can follow your own best practices for deploying code to an Amazon EC2 instance…
Sam
9
votes
1 answer

Exception on SubscribeToShard Command

I'm trying to subscribe to events from a Kinesis shard. But the execution of SubscribeToShardCommand hangs for 5 minutes (the timeout for subscribe) and then throws an error: (node:2667) UnhandledPromiseRejectionWarning: SyntaxError: Unexpected token in…
mjpolak
9
votes
4 answers

AWS CLI v2 "aws firehose put-record" complaining about Invalid base64

I used to be able to send a record to Firehose without any problem like this: aws firehose put-record --delivery-stream-name my-stream --record='Data="{\"foor\":\"bar\"}"' But since I updated my CLI to version 2, I am getting this…
Am1rr3zA
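
As background for the question above: AWS CLI version 2 by default expects binary (blob) parameters such as a record's `Data` to be base64-encoded, which is a common cause of an "Invalid base64" complaint when raw JSON is passed. The sketch below only illustrates producing a base64 string from a JSON payload in Python; the helper name `encode_record_data` is hypothetical, and the payload mirrors the one in the question.

```python
import base64
import json


def encode_record_data(payload: dict) -> str:
    """Serialize a payload to JSON and return it as a base64 string."""
    raw = json.dumps(payload).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


# The payload from the question, encoded for use as the record's Data value.
encoded = encode_record_data({"foor": "bar"})

# Decoding restores the original JSON document.
decoded = json.loads(base64.b64decode(encoded))
```

An alternative that avoids manual encoding is the CLI option `--cli-binary-format raw-in-base64-out`, which tells CLI v2 to accept raw input for blob parameters; check the CLI documentation for your version before relying on it.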
9
votes
3 answers

What is the difference between Kinesis and SQS?

I know there is a lot of material online about this question; however, I have not found any that explains it clearly to a rookie like me... I would appreciate it if someone could help me understand the key differences between these two…
wtian
9
votes
3 answers

AWS Firehose newline Character

I've read a lot of similar questions about adding newline characters to Firehose, but they're all about adding the newline character at the source. The problem is that I don't have access to the source, and a third party is piping data to our…
9
votes
1 answer

What is the difference between Kinesis Streams and Kinesis Firehose?

Firehose is fully managed, whereas Streams is manually managed. If other people are aware of other major differences, please add them. I'm just learning. Thanks.
9
votes
3 answers

Write to a specific folder in S3 bucket using AWS Kinesis Firehose

I would like to be able to route data sent to Kinesis Firehose based on the content inside the data. For example, if I sent this JSON data: { "name": "John", "id": 345 } I would like to filter the data based on id and send it to a subfolder of…
9
votes
1 answer

How does Kinesis achieve Kafka-style consumer groups?

In Kafka, I can split my topic into many partitions. I cannot have more consumers than partitions in Kafka, because the partition is used as a way to scale out a topic. If I have more load, I can increase the number of partitions, which will allow…
CBP
9
votes
1 answer

How do checkpoints of the Kinesis Spark Streaming receiver work?

We're using Spark Streaming connected to an AWS Kinesis stream in order to aggregate (per minute) the metrics that we're receiving and write the aggregations to InfluxDB in order to make them available to a real-time dashboard. Everything is working…
pVilaca
9
votes
3 answers

How to store a Kinesis stream in S3 in a specific folder structure within the S3 bucket

I have events captured by a Kinesis stream. I want to put all events in a specific folder structure on S3. I want to make a folder with a date stamp, so that all events of 15th June go in that folder, and from 16th June onwards a new folder should come to…
Sam
9
votes
0 answers

What are the cons of a purely stream-based architecture compared with a Lambda architecture?

Disclaimer: I'm not an expert in real-time architectures; I'd only like to throw out a couple of personal considerations and see what others would suggest or point out. Let's imagine we'd like to design a real-time analytics system. Following, Lambda…