Questions tagged [trident]

Abstraction on top of Apache Storm for doing realtime computation.

Trident is a high-level abstraction for doing realtime computing on top of Apache Storm. It allows you to seamlessly intermix high throughput (millions of messages per second), stateful stream processing with low latency distributed querying. If you're familiar with high level batch processing tools like Pig or Cascading, the concepts of Trident will be very familiar – Trident has joins, aggregations, grouping, functions, and filters. In addition to these, Trident adds primitives for doing stateful, incremental processing on top of any database or persistence store. Trident has consistent, exactly-once semantics, so it is easy to reason about Trident topologies. (source)

122 questions
1
vote
1 answer

Storm multi-fields grouping

What I am supposed to do is group the stream by two fields ("remote-client-ip", "request-params"), count the number of tuples in each group, and combine the counts into a map. Here is my topology: topology.newStream("kafka-spout-stream-1", repeatSpout) …
Eageon
  • 23
  • 3
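
As a rough illustration of what the question above describes (grouping by two fields and counting the tuples in each group into a map), here is a minimal plain-Java sketch. It does not use the Storm or Trident API; the class name `MultiFieldCountSketch`, the method `countByIpAndParams`, and the sample values are hypothetical. In Trident itself, the grouping would typically be expressed with `groupBy(new Fields("remote-client-ip", "request-params"))` followed by a counting aggregator.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; this is NOT the Trident API.
// All names here are hypothetical and chosen for the example.
final class MultiFieldCountSketch {

    // Each input tuple is {remoteClientIp, requestParams}.
    // The composite key is an immutable two-element list, which has
    // value-based equals/hashCode and so works as a map key.
    static Map<List<String>, Long> countByIpAndParams(List<String[]> tuples) {
        Map<List<String>, Long> counts = new HashMap<>();
        for (String[] t : tuples) {
            counts.merge(List.of(t[0], t[1]), 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String[]> tuples = List.of(
                new String[] {"10.0.0.1", "a=1"},
                new String[] {"10.0.0.1", "a=1"},
                new String[] {"10.0.0.2", "a=1"});
        System.out.println(countByIpAndParams(tuples));
    }
}
```

The key point is that both fields together form the group key, so two requests share a count only when the IP and the parameters both match.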
1
vote
1 answer

Word Count using Storm or Trident

For a simple word count program in the storm-starter, the logic is fairly straightforward: 1) split the sentence into words, 2) emit each word, 3) aggregate the count (store the count in a map). However, there are two problems here: 1) the program…
Leo Li
  • 19
  • 3
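
The three steps listed in the question above (split, emit, aggregate into a map) can be sketched in plain Java as follows. This is only an illustration of the logic, not the storm-starter code and not the Storm or Trident API; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; this is NOT the storm-starter topology.
// All names here are hypothetical and chosen for the example.
final class WordCountSketch {

    // Steps 1 and 2: split a sentence on whitespace and "emit" each word.
    static List<String> splitSentence(String sentence) {
        String trimmed = sentence.trim();
        if (trimmed.isEmpty()) {
            return List.of();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    // Step 3: aggregate the per-word counts in a map.
    static Map<String, Long> countWords(List<String> sentences) {
        Map<String, Long> counts = new TreeMap<>();
        for (String sentence : sentences) {
            for (String word : splitSentence(sentence)) {
                counts.merge(word, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords(List.of("the cow jumped over the moon", "the moon")));
    }
}
```

In a distributed topology the map would be partitioned across tasks by word, which is where grouping and state handling come into play; this single-process sketch deliberately leaves that out.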
1
vote
1 answer

How to find out if a transaction is successfully committed in Apache Storm Trident

I'm trying to get started with Storm Trident and have the topology set up and running with IOpaquePartitionedTridentSpout, backed by OpaqueMap. However, I'm struggling to find a way to let my spout/function know if a transaction is…
Sicong
  • 315
  • 2
  • 6
1
vote
1 answer

Storm Trident and Spark Streaming: distributed batch locking

After doing lots of reading and building a POC we are still unsure as to whether Storm Trident or Spark Streaming can handle our use case: We have an inbound stream of sensor data for millions of devices (which have unique identifiers). We need to…
Richard
  • 11
  • 2
1
vote
1 answer

Are storm trident batches simultaneously processed?

I would like to know whether Trident batches are executed in parallel, i.e. whether multiple batches can run at a time. Apart from this, I have a few questions which are too small to be posted individually. If they are large enough, feel free to comment…
JavaTechnical
  • 8,846
  • 8
  • 61
  • 97
1
vote
1 answer

DRPC Server error in storm

I am trying to execute the code below and am getting an error. I am not sure if I am missing something here. Also, where would I see the output? Error: java.lang.RuntimeException: No DRPC servers configured for topology at…
user3072054
  • 339
  • 2
  • 6
  • 17
1
vote
0 answers

Does Storm Trident newValueStream after persistentAggregate maintain partition from groupBy

I am currently trying to scale a trident topology that does some post processing after a groupBy and persistentAggregate, using newValueStream to stream values after the aggregate step. I was wondering if the tuples remained partitioned as they were…
rysloan
  • 707
  • 5
  • 16
1
vote
0 answers

Storm : How to get a sentiment indicator with Trident and a Redis State?

I am trying to create a sentiment indicator with a Redis state. Here is my code: TweetSpout twitterSpout = new TweetSpout(); Stream textStream = topology.newStream("tweetSpout", twitterSpout); textStream.each(new Fields("texto"), new…
1
vote
1 answer

Rebalance in Trident

I am wondering what the best practice is for rebalancing a Trident topology. A Storm Trident topology seems to set the number of tasks according to the parallelism hints of the stream. When I run the rebalance command I cannot increase the number of…
ybensimhon
  • 113
  • 1
  • 9
1
vote
1 answer

Trident Storm-Cassandra, writing to a table with multiple primary keys

I'm learning how to use Storm's Trident with Cassandra 2.0.5, Storm version 0.9.0.1. I'm also using com.hmsonline storm-cassandra 0.4.0-rc4 contrib. My goal is simply to insert some text rows to a table with id (int), name (text) and a sentence…
Guy Wald
  • 599
  • 1
  • 10
  • 25
1
vote
1 answer

Dynamic Pivot using Storm

I have rows in a big data DB (Cassandra in my case) with column names col1, col2, col3, val1, val2. With the SQL approach I can group by col1, col2 or col2, col1 or any other possible combination. This way I can form a tree hierarchy easily. But now we are using…
Koneti
  • 106
  • 4
1
vote
1 answer

Storm Trident 'merge' function that preserves time order

Say I have two streams: Stream 1: [1,3],[2,4] Stream 2: [2,5],[3,2] A regular merge would produce a Stream 3, like this: [1,3],[2,4],[2,5],[3,2] I would like to merge the stream whilst preserving the order in which the tuple was emitted, so if…
E Shindler
  • 425
  • 7
  • 27
1
vote
1 answer

To pass a value to DRPC request from spout output collector?

I am trying to implement Trident+DRPC. I have designed the topology in a way that it does not run indefinitely. I have two separate classes, one for the spout implementation and another one to implement DRPC and Trident. My spout class (a spout that…
Ezhil
  • 261
  • 2
  • 10
  • 31
1
vote
0 answers

Storm spout stops emitting when second supervisor node added

I am using TridentTopology to read from a file and emit aggregates using a single spout. When I have a single supervisor node, the topology works fine and the spout emits as expected. However, when a second supervisor node is added, the spout stops…
0
votes
0 answers

Storm Trident group by selected fields before windowing

I have a simple case study: I want to use Trident Storm to calculate the average temperature from multiple sensors. Kafka is used as the source and sink data. The data from one sensor looks like this: { "uuid":…
Savitar
  • 1
  • 1