Questions tagged [apache-storm]

Storm is a free, open source distributed realtime computation system. Similar to how Hadoop provides a set of general primitives for doing batch processing, Storm provides a set of general primitives for doing realtime computation. Storm is simple and can be used with any programming language.

Apache Storm is a free and open source project licensed under the Apache License, Version 2.0, and a top level project at The Apache Software Foundation (ASF).

Storm was announced during the Strange Loop 2011 conference and was open sourced during the announcement presentation. A recording of the announcement presentation can be viewed on InfoQ and the slides (pdf) have been published.

Alternate meanings

There are at least three different storms on Stack Overflow:

  • the realtime computation framework
  • a range of BlackBerry handsets
  • an Object-Relational mapper for Python (homepage)

Questions concerning the BlackBerry phone have been retagged [blackberry-storm]. There is currently no distinct tag for the OR mapper.

2571 questions
163 votes, 4 answers

What is/are the main difference(s) between Flink and Storm?

Flink has been compared to Spark, which, as I see it, is the wrong comparison because it compares a windowed event processing system against micro-batching. Similarly, it does not make much sense to me to compare Flink to Samza. In both cases…
fnl
  • 4,861
  • 4
  • 27
  • 32
112 votes, 7 answers

Apache Kafka vs Apache Storm

Apache Kafka: distributed messaging system. Apache Storm: real-time message processing. How can we use both technologies in a real-time data pipeline for processing event data? In terms of a real-time data pipeline, both seem to me to do the job…
Ananth Duari
  • 2,859
  • 11
  • 35
  • 42
55 votes, 9 answers

How can I serialize a numpy array while preserving matrix dimensions?

numpy.array.tostring doesn't seem to preserve information about matrix dimensions (see this question), requiring the user to issue a call to numpy.array.reshape. Is there a way to serialize a numpy array to JSON format while preserving this…
Louis Thibault
  • 20,240
  • 25
  • 83
  • 152
51 votes, 1 answer

What is the "task" in Storm parallelism

I'm trying to learn Twitter Storm by following the great article "Understanding the parallelism of a Storm topology". However, I'm a bit confused by the concept of "task". Is a task a running instance of the component (spout or bolt)? An executor…
John Wang
  • 4,562
  • 9
  • 37
  • 54
48 votes, 4 answers

Testing Storm Bolts and Spouts

This is a general question regarding Unit Testing Bolts and Spouts in a Storm Topology written in Java. What is the recommended practice and guideline for unit-testing (JUnit?) Bolts and Spouts? For instance, I could write a JUnit test for a Bolt,…
Jack
  • 1,250
  • 1
  • 14
  • 26
39 votes, 5 answers

Storm vs. Trident: When not to use Trident?

I'm working with Storm and it is fine for a lot of use cases. Recently I had a look at Trident, which is a high-level abstraction of Storm. It supports exactly-once processing and makes stateful processing easier. But now I'm wondering: why can't I…
Christian Strempfer
  • 7,291
  • 6
  • 50
  • 75
35 votes, 4 answers

difference between exactly-once and at-least-once guarantees

I'm studying distributed systems and referring to this old question: stackoverflow link. I really can't understand the difference between exactly-once, at-least-once and at-most-once guarantees. I read about these concepts in Kafka, Flink and Storm and…
Akinn
  • 1,896
  • 4
  • 23
  • 36
32 votes, 3 answers

Where do Apache Samza and Apache Storm differ in their use cases?

I've stumbled upon this article that purports to contrast Samza with Storm, but it seems only to address implementation details. Where do these two distributed computation engines differ in their use cases? What kind of job is each tool good for?
Louis Thibault
  • 20,240
  • 25
  • 83
  • 152
29 votes, 6 answers

Apache Storm compared to Hadoop

How does Storm compare to Hadoop? Hadoop seems to be the de facto standard for open-source large-scale batch processing. Does Storm have any advantages over Hadoop, or are they completely different?
18bytes
  • 5,951
  • 7
  • 42
  • 69
28 votes, 4 answers

java.lang.NoSuchFieldError: INSTANCE

When trying to submit my topology through StormSubmitter, I am getting: Caused by: java.lang.NoSuchFieldError: INSTANCE at org.apache.http.impl.io.DefaultHttpRequestWriterFactory.<init>(DefaultHttpRequestWriterFactory.java:52) I am using…
Harsh Moorjani
  • 648
  • 1
  • 5
  • 16
28 votes, 2 answers

How would I split a stream in Apache Storm?

I don't understand how I would split a stream in Apache Storm. For example, I have bolt A that, after some computation, has somevalue1, somevalue2, and somevalue3. It wants to send somevalue1 to bolt B, somevalue2 to bolt C, and…
james
  • 451
  • 2
  • 7
  • 12
24 votes, 3 answers

How to find which dependency is pulling in a particular class file?

My project has several dependencies that pull in the same common dependency. The common dependency, storm-kafka, has a new version 1.0.2 and an old version 0.10.0. On building a shaded jar, I see classes from both versions in my fat jar…
user2250246
  • 3,807
  • 5
  • 43
  • 71
24 votes, 3 answers

Proper way to ACK in Storm in a chain of bolts

Just want to make sure I understand how acking works in Storm. I have 1 spout and 2 bolts chained together. The spout emits a tuple to Bolt 1, which in turn will emit a tuple to Bolt 2. I want Bolt 2 to ack the initial tuple sent from the spout and I'm not sure…
Adrian
  • 5,603
  • 8
  • 53
  • 85
24 votes, 1 answer

What's causing these ParseError exceptions when reading off an AWS SQS queue in my Storm cluster

I'm using Storm 0.8.1 to read incoming messages off an Amazon SQS queue and am getting consistent exceptions when doing so: 2013-12-02 02:21:38 executor [ERROR] java.lang.RuntimeException: com.amazonaws.AmazonClientException: Unable to unmarshall…
Joel Rosenberg
  • 1,432
  • 3
  • 13
  • 15
18 votes, 3 answers

Storm max spout pending

This is a question regarding how Storm's max spout pending works. I currently have a spout that reads a file and emits a tuple for each line in the file (I know Storm is not the best solution for dealing with files, but I do not have a choice for…
Naresh
  • 610
  • 1
  • 4
  • 14