6

I want to get the event count from a Microsoft Azure Event Hub. I can use EventHubReceiver.Receive(maxcount), but it is slow when there is a large number of large events.

There is a NamespaceManager.GetEventHubPartition(..).EndSequenceNumber property that seems to do the trick, but I am not sure whether it is the correct approach.

Sreeram Garlapati
Val
  • 2
    I'd like to clarify what you mean by "events count". Is that the total number of messages in the hub? in the partition? sent to the hub over a certain period of time? received from the hub over a certain period of time? – Christopher Bennage Sep 17 '14 at 15:41
  • What would you want to do with the count? Remember that the count is likely going to change immediately after getting it since you're talking about a service bus that should be processing lots of messages per second... – Peter Ritchie Sep 17 '14 at 17:29
  • Yes, I want to get total number of messages in the hub. As long as I don't send new messages total number should be constant for a while - this is good enough for me. – Val Sep 18 '14 at 16:15

2 Answers

12

Event Hubs doesn't have a notion of message count. An Event Hub is a high-throughput, low-latency, durable stream of events in the cloud, so a count that is correct at a given point in time could be wrong the very next millisecond, and hence it wasn't provided :)

Hmm, we should have named Event Hubs something like StreamHub, which would make this obvious!

If what you are looking for is how far the receiver is lagging behind, then EventHubClient.GetPartitionRuntimeInformation().LastEnqueuedSequenceNumber is your best bet.

As long as no messages are sent to the partition, this value remains constant :)

On the receiver side, when a message is received, receivedEventData.SequenceNumber indicates the sequence number you are currently processing. The difference between EventHubClient.GetPartitionRuntimeInformation().LastEnqueuedSequenceNumber and EventData.SequenceNumber indicates how far the receiver of a partition is lagging behind. Based on that difference, the receiver process can scale the number of workers up or down (work distribution logic).
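The lag calculation described above is plain arithmetic on two sequence numbers per partition, so here is a minimal sketch of it in Python. It does not call the Event Hubs SDK. The `PartitionState` type, the function names, and the `events_per_worker` and `max_workers` thresholds are all hypothetical and only illustrate the idea. The two numbers are assumed to come from the partition runtime information and from the last received event's sequence number, respectively.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class PartitionState:
    """Hypothetical holder for the two numbers discussed above."""
    partition_id: str
    # Assumed to be read from the partition runtime information.
    last_enqueued_sequence_number: int
    # Assumed to be the SequenceNumber of the last event the receiver processed.
    last_processed_sequence_number: int


def partition_lag(state: PartitionState) -> int:
    """Number of events enqueued after the last processed one (never negative)."""
    lag = state.last_enqueued_sequence_number - state.last_processed_sequence_number
    return max(0, lag)


def total_lag(states: Iterable[PartitionState]) -> int:
    """Sum of the per-partition lags."""
    return sum(partition_lag(s) for s in states)


def suggested_workers(lag: int, events_per_worker: int = 1000, max_workers: int = 8) -> int:
    """Illustrative scaling rule: one worker per `events_per_worker` events of lag."""
    needed = -(-lag // events_per_worker)  # ceiling division
    return min(max_workers, max(1, needed))


if __name__ == "__main__":
    states = [
        PartitionState("0", last_enqueued_sequence_number=120, last_processed_sequence_number=100),
        PartitionState("1", last_enqueued_sequence_number=50, last_processed_sequence_number=50),
    ]
    lag = total_lag(states)
    print(lag, suggested_workers(lag))
```

The lag is only a snapshot: as soon as senders enqueue more events, the value changes, which is the same reason a total message count is not exposed.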

more on Event Hubs...

Sreeram Garlapati
0

You can use Stream Analytics, with a simple query:

SELECT
    COUNT(*)
FROM
    YourEventHub
GROUP BY
    TUMBLINGWINDOW(DURATION(hh, <Number of hours in which the events happened>))

Of course you will need to specify a time window, but you can potentially run it from when you started collecting data to now.

You will be able to output to SQL/Blob/Service Bus et cetera.

Then you can read the result from the output in code and process it. This is quite complicated for a one-off count, but if you need the count frequently and have to write some code around it anyway, it could be the solution for you.

Stefano d'Antonio
  • 1
    Refrain from using this method. 1) It will potentially increase your Event Hubs bill, as you would read all events from the Event Hub (paying for events, network, etc.) and end up with just the "count". 2) It is less performant and will take a long time. – Sreeram Garlapati May 08 '18 at 21:52