
I have an EventHub stream as input and a Service Bus Queue as output.

My query uses TUMBLINGWINDOW to aggregate events from the stream and write the results to the queue.

If I pause the Stream Analytics job and later resume it from the point where it stopped, will my data windows be delayed from then on? Or will it catch up in the first window with the events that arrived between the last stop and now?

E.g. I stop the job for an hour and resume from the stopped point, and from then on it always processes new events one hour late.

or

I stop the job for an hour and resume from the stopped point, the first output contains the data from one hour ago until now, and the following windows process new data in real time.

Stefano d'Antonio

2 Answers


Stream Analytics continues reading from where it left off, so after a resume it ingests the backlog as fast as it can. Once it has worked through everything that arrived while it was paused, it goes back to producing output in near real time.
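To illustrate the catch-up: as far as I know, Stream Analytics assigns events to windows by their event timestamp (arrival time by default, or the field named in TIMESTAMP BY), not by the moment the job happens to read them. Under that assumption the backlog is not lumped into one big first window; each missed window is emitted with its own events. Below is a minimal Python sketch of the idea, not the service's actual implementation. The 10-minute window size, the 5-minute event spacing, and the names `window_end` and `aggregate` are made up for illustration, and boundary handling is simplified.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=10)  # hypothetical window size
EPOCH = datetime(1970, 1, 1)


def window_end(ts, size=WINDOW):
    """Return the end time of the tumbling window that contains ts."""
    size_s = size.total_seconds()
    index = int((ts - EPOCH).total_seconds() // size_s)
    return EPOCH + timedelta(seconds=(index + 1) * size_s)


def aggregate(events, size=WINDOW):
    """Count events per tumbling window, keyed by window end time."""
    counts = defaultdict(int)
    for ts in events:
        counts[window_end(ts, size)] += 1
    return dict(sorted(counts.items()))


# Hypothetical backlog: one event every 5 minutes while the job was
# stopped between 12:00 and 13:00.
stopped_at = datetime(2017, 10, 6, 12, 0)
backlog = [stopped_at + timedelta(minutes=5 * i + 1) for i in range(12)]

# On resume the whole backlog is read at once, but each event still
# lands in the window that matches its own timestamp.
result = aggregate(backlog)
for end, count in result.items():
    print(end.isoformat(), count)
```

In this sketch the one-hour backlog yields six separate 10-minute windows rather than a single one-hour window, which is the behaviour I would expect from the real job as well.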

juunas

I was wondering the same thing when I created my Stream Analytics job. When you start the job for the first time, you get only two options for the job output start time: Now and Custom.

But when you start the job again after it has already run, you get a third option: When last stopped.


To resume a stopped job without losing data, choose When last stopped (note that this option isn't available if you're running the job for the first time).

Alex Albu
  • Don't take my word for it, but I recall that on the first start it processes historical data as well (depending on the input, I guess). Easy to test with Event Hubs anyway. – Stefano d'Antonio Oct 06 '17 at 10:34