Questions tagged [stream-processing]

272 questions
7
votes
1 answer

Flink Windows Boundaries, Watermark, Event Timestamp & Processing Time

Problem Definition & Establishing Concepts: Let’s say we have a TumblingEventTimeWindow with a size of 5 minutes, and we have events containing 2 basic pieces of information: a number and an event timestamp. In this example, we kick off our Flink topology at…
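For readers skimming this entry, here is a minimal sketch of how a 5-minute tumbling event-time window could map an event timestamp to window boundaries. It assumes windows are aligned to the epoch with no offset, and that a window is complete once the watermark reaches the window's last timestamp (end minus one millisecond). The names `window_bounds` and `window_complete` are hypothetical and are not taken from the question or from Flink's API.

```python
WINDOW_SIZE_MS = 5 * 60 * 1000  # 5-minute tumbling window, in milliseconds


def window_bounds(event_ts_ms, size_ms=WINDOW_SIZE_MS):
    """Return (start, end) of the tumbling window containing event_ts_ms.

    The window covers [start, end): start is inclusive, end is exclusive.
    Assumption: windows are aligned to the epoch, with no offset.
    """
    start = event_ts_ms - (event_ts_ms % size_ms)
    return start, start + size_ms


def window_complete(event_ts_ms, watermark_ms, size_ms=WINDOW_SIZE_MS):
    """Assumed rule: the window is complete once the watermark has reached
    the last timestamp that still belongs to it (end - 1)."""
    _, end = window_bounds(event_ts_ms, size_ms)
    return watermark_ms >= end - 1


# An event at 00:07:30 (450,000 ms) falls into the window [00:05, 00:10).
start, end = window_bounds(450_000)
print(start, end)
```

Note that the boundaries depend only on the event timestamp and the window size, not on when the topology was started or when the event arrived.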
7
votes
1 answer

Flink window state size and state management

After reading Flink's documentation and searching around, I couldn't entirely understand how Flink handles state in its windows. Let's say I have an hourly tumbling window with an aggregation function that accumulates msgs into some Java POJO or…
yaarix
7
votes
2 answers

Stream processing architecture

I am in the process of designing a system where there's a main stream of objects and there are multiple workers which produce some result from each object. Finally, there is some special/unique worker (sort of a "sink", in terms of graph theory)…
7
votes
1 answer

Kafka Streams Sort Within Processing Time Window

I wonder if there's any way to sort records within a window using Kafka Streams DSL or Processor API. Imagine the following situation as an example (arbitrary one, but similar to what I need): There is a Kafka topic of some events, let's say user…
burdiyan
7
votes
1 answer

Apache Apex vs Apache Flink

Both are streaming frameworks that process one event at a time. What are the core architectural differences between these two technologies/streaming frameworks? Also, what are some particular use cases where one is more appropriate than the other?
Biplob Biswas
7
votes
2 answers

Lazily extract lines from large file

I'm trying to grab 5 lines by their line numbers from a large (> 1GB) file with Clojure. I'm almost there but am seeing some strange things, and I want to understand what's going on. So far I've got: (defn multi-nth [values indices] (map (partial…
David J.
6
votes
1 answer

efficiently store a result stream in multiple tables with optimistic locking per item

Given a result stream with a lot of items, I want to store them and handle potential concurrency conflicts: public void onTriggerEvent(/* params */) { Stream results = customThreadPool.submit(/*...complex parallel computation on multiple…
Stuck
5
votes
1 answer

Unable connect to node with id 1: [Worker]: Error: ConnectionError('No connection to node with id')

I am trying to use robinhood/faust, but without success! I have already created a producer that successfully inserts into the original topic in my confluent-kafka localhost instance, but faust is unable to connect to localhost. My…
FelipeAgger
5
votes
2 answers

Streaming: tumbling window vs microbatching

How is a tumbling window of 5 secs in stream processing different from a microbatch of 5 secs when microbatching? Both have a non-overlapping window of 5 secs during which they process the records and then move on. I understand that there is this notion…
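One way to picture the comparison, as a sketch only: it assumes the tumbling window is keyed on each record's event time while the microbatch is cut by arrival (processing) time. That assumption is not stated in the excerpt, and tumbling windows can also be defined on processing time. The `bucket` helper and the sample records are hypothetical.

```python
from collections import defaultdict


def bucket(records, key_index, size_s=5):
    """Group record values into non-overlapping buckets of size_s seconds.

    records: tuples of (event_time_s, arrival_time_s, value)
    key_index: 0 to bucket by event time, 1 to bucket by arrival time
    """
    groups = defaultdict(list)
    for rec in records:
        start = rec[key_index] - (rec[key_index] % size_s)
        groups[start].append(rec[2])
    return dict(groups)


# "b" was produced at t=4 but only arrived at t=6.
records = [(1, 2, "a"), (4, 6, "b"), (7, 8, "c")]

by_event_time = bucket(records, 0)    # assumed tumbling event-time window
by_arrival_time = bucket(records, 1)  # assumed microbatch by arrival time

print(by_event_time)    # {0: ['a', 'b'], 5: ['c']}
print(by_arrival_time)  # {0: ['a'], 5: ['b', 'c']}
```

Under these assumptions the delayed record "b" lands in different groups, even though both schemes use non-overlapping 5-second buckets.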
5
votes
1 answer

Share state among operators in Flink

I wonder if it is possible in Flink to share the state among operators. Say, for instance, that I have partitioning by key on an operator and I need a piece of state of partition A inside partition C (for any reason) (fig 1.a), or I need the state…
affo
5
votes
0 answers

Akka-stream UnsupportedOperationException by creating a Source from Graph

I am trying to connect a stream with a n * subFlows. Therefore I build a source from the outlet of a broadcast. But it throws an UnsupportedOperationException: cannot replace the shape of the EmptyModule. I tried to google this exception, but I…
5
votes
2 answers

How can I write a custom stream transformation in C++?

I'm learning C++ after having worked a lot with Haskell and functional languages in general, and I found that I'm constantly trying to solve the same problem: read some data from an input stream; tokenize it based on a specific algorithm; process…
Jakub Arnold
5
votes
3 answers

Lamina vs Storm

I am designing a prototype realtime monitor for processing fairly large amounts (>30G/day) of streaming numeric data. I would like to write this in Clojure, as the language seems to be well suited to the kind of "Observer + state machine" system…
4
votes
1 answer

Best Design pattern to create a rules engine

Suppose I have to design a rules engine where, depending on a static configuration rule, the chain of responsibility changes at runtime. What is the best design pattern for implementing this? For example, depending on some configurations, a set…
Vignesh
4
votes
2 answers

How do I handle out-of-order events with Apache Flink?

To test out stream processing and Flink, I have given myself a seemingly simple problem. My data stream consists of x and y coordinates for a particle along with the time t at which the position was recorded. My objective is to annotate this data with…
Optimus