
I need to translate the C# code below into something that can be used by Azure Stream Analytics.

I've got a C# application that is similar to the following:

var inputEvents = new List<Event>();
foreach (var file in files){
    (List<Event> events, DateTime maxDate) = ProcessEvents(file, inputEvents);
    inputEvents = events.Where(e => e.Duration == null).ToList();
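    // keep only the events that are still open (no Duration yet) so the next
    // file's ProcessEvents call can use them as carried-over state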
}

Where ProcessEvents() passes inputEvents on to other helper methods.

I need to implement this whole piece of code using Stream Analytics. The foreach (var file in files) part is implemented by gathering a batch of events using Collect(). Each batch is sent to a UDF which acts as ProcessEvents(). However, ProcessEvents() returns events that are needed for the next iteration, and since UDFs are stateless, the next batch won't be able to use the events returned by the previous batch.

How do I rewrite the C# code above in Stream Analytics?

I've tried the following:

  • Use a UDA to store the returned events. This failed because, for some reason, a UDA can't hold a JSON array and keep modifying it across batches.
  • Use a Reference Data input to store the returned events. This failed because reference data can only be consumed in Stream Analytics through a JOIN in the query, not inside a UDF.
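For illustration, here is a minimal sketch of that limitation, assuming a reference data input named PreviousEvents with a hypothetical DeviceId key column; the reference rows only ever appear in the query through the JOIN, so the UDF has no way to look them up on its own:

SELECT s.DeviceId, s.Event, r.Duration
FROM EventHubStreamMessage s
JOIN PreviousEvents r           -- reference data input (hypothetical name)
    ON s.DeviceId = r.DeviceId  -- hypothetical key column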

Stream Analytics T-SQL Query:

WITH eventsCollection AS (
    SELECT COLLECT() AS allEvents
    FROM EventHubStreamMessage
    GROUP BY SessionWindow(minute, 2, 4)
),

step1 AS (
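    -- each windowed batch is processed on its own here; the UDF cannot see
    -- the events returned for the previous batch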
    SELECT UDF.SampleProcessEvents(allEvents) as Source
    FROM eventsCollection
)

SELECT *
INTO [StorageTable]
FROM step1 

Stream Analytics UDF Code (short version):

function main(allEvents){
    allEvents = JSON.stringify(allEvents);
    var inputEvents = new Array ();
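    // inputEvents is recreated empty on every call: the UDF is stateless,
    // so events returned by the previous batch are not available here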

    return processEvents(JSON.parse(allEvents), inputEvents)
}

function processEvents(allEvents, inputEvents){
    for (var i = 0; i < allEvents.length; i++){
        if (allEvents[i].Event =="ON"){
            powerOn(inputEvents);
        }
    }
}

function powerOn(inputEvents){
    return true;
}
  • Hi, UDFs don't allow saving state. However, looking at your query, I was thinking you could do this in 2 steps: one step to filter events according to your condition, another step to return the list with COLLECT. Will that work for you? – Jean-Sébastien Jan 25 '19 at 18:06
  • Hi @Jean-Sébastien, what I'm really trying to do is collect a batch of events, then send it to a UDF/UDA to do some complex processing, which returns a batch of events (in a different format from the previous one). This will then be used by the next batch of events that comes in. Do you have any suggestions on how I can approach this? Unfortunately, I think the logic is too complex for ASA's T-SQL, which is why I'm using JavaScript instead. – CompilationError Jan 28 '19 at 19:41
  • 1
    Unfortunately, this is not supported today. One workaround would be to model this as stream-to-stream join. The first aggregation will write into Event Hub and second one will JOIN with it as a stream. – Jean-Sébastien Jan 29 '19 at 22:32
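A minimal sketch of that suggested workaround, assuming the processed batch is written to an Event Hub that is also wired up as a stream input of the second query (here called [FeedbackEventHub] as an output and FeedbackStream as an input, possibly in a separate job), and assuming a hypothetical SessionId column as the join key:

-- query 1: collect and process each batch, then feed the result back through an Event Hub
WITH eventsCollection AS (
    SELECT COLLECT() AS allEvents
    FROM EventHubStreamMessage
    GROUP BY SessionWindow(minute, 2, 4)
)
SELECT UDF.SampleProcessEvents(allEvents) AS Source
INTO [FeedbackEventHub]
FROM eventsCollection

-- query 2: join the live stream with the events fed back from the previous batch
SELECT e.Event, f.Source
INTO [StorageTable]
FROM EventHubStreamMessage e
JOIN FeedbackStream f
    ON e.SessionId = f.SessionId
    AND DATEDIFF(minute, f, e) BETWEEN 0 AND 4

The DATEDIFF bound is required for stream-to-stream joins, and it would have to be wide enough to cover the session window plus the feedback delay.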
