
I am using the EventProcessorHost (EPH) class of the Azure Python SDK to receive events from an Event Hub. It uses AzureStorageCheckpointLeaseManager for checkpointing and partition leases in a storage account. However, I cannot see where to specify a full path within the storage account: it creates the files directly inside the specified container. I would like to give a full path inside the container. Where can I do that?

Ivan Glasenberg
Nipun
  • Hi, may I know why you need to provide a complete path of the blob(like https://yy3.blob.core.windows.net/test2/$Default/0, right)? – Ivan Glasenberg Dec 06 '19 at 06:21
  • Right now, if I have 2 partitions, 2 files are created directly in the container `eph-leases`. Since I will have a lot of consumer groups and event hubs, I would like to create a folder like `eph-leases/eventhub-namespace/eventhub-name/consumer-group/` – Nipun Dec 06 '19 at 06:23
  • You mean you want each consumer to have its own checkpoint, right? – Ivan Glasenberg Dec 06 '19 at 06:25
  • No, but each consumer group – Nipun Dec 06 '19 at 06:27
  • I will have a lot of consumer groups and event hubs. I would like to use the same container for all of them – Nipun Dec 06 '19 at 06:29
  • If you would like to use the same container, you can simply specify the same container but a different consumer group. The folder structure would then look like `eph-leases/consumer-group`. Is that OK, or do you insist on a structure like `eph-leases/eventhub-namespace/eventhub-name/consumer-group/`? – Ivan Glasenberg Dec 06 '19 at 06:33
  • I am not seeing even the consumer group folder being created. I can live with the folder being just consumer-group – Nipun Dec 06 '19 at 06:49
  • ok, I see. Never mind, I'll try to work it out:) – Ivan Glasenberg Dec 06 '19 at 06:50
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/203733/discussion-between-ivan-yang-and-nipun). – Ivan Glasenberg Dec 06 '19 at 07:57
  • I tried your test changes and it worked. Let me use this for now. Also looked at the changes. Thank you, very helpful – Nipun Dec 06 '19 at 11:01
  • It worked like a charm.... you can post this as answer and I will mark it – Nipun Dec 07 '19 at 13:51
  • OK, and I just got feedback from the support team: they reported this as an issue to the product team, and they will let me know if there is any update. – Ivan Glasenberg Dec 09 '19 at 01:39
  • Great thanks. Can you also help me with this problem : https://stackoverflow.com/questions/59279601/hoe-to-get-the-consumer-lag-in-eventhub – Nipun Dec 11 '19 at 05:52
  • I'll take a look today. I was out of office recently and sorry for the late response. – Ivan Glasenberg Dec 17 '19 at 00:35

1 Answer


Here is my research:

In `AzureStorageCheckpointLeaseManager`, there is a parameter `storage_blob_prefix`, which is meant to set the blob prefix (that is, the directory for the checkpoint blob). In practice, however, it has no effect.

After going through the source code of `azure_storage_checkpoint_manager.py`, I can see that `storage_blob_prefix` is assigned to `consumer_group_directory`, but `consumer_group_directory` is never used when creating the checkpoint blob. Instead, the blob is created directly inside the container.

So the fix is to modify `azure_storage_checkpoint_manager.py` so that it uses `lease_container_name` + `consumer_group_directory` when creating the checkpoint blob. I made that change and uploaded it to GitHub. With it, the checkpoint blob is created inside a directory as expected.
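To illustrate the idea behind the fix, here is a minimal sketch. It is not the actual SDK code: the helper name `checkpoint_blob_name` is hypothetical, and it assumes one blob per partition named after the partition id, as described in the question. The point is only that a directory inside a container is nothing more than a `/`-separated prefix in the blob name, so the prefix has to be prepended when the blob is created.

```python
from typing import Optional


def checkpoint_blob_name(consumer_group_directory: Optional[str], partition_id: str) -> str:
    """Hypothetical helper (not part of the SDK): build a checkpoint blob name.

    Blob storage has no real folders; a "/" in the blob name is what appears
    as a directory inside the container.
    """
    # Treat a missing prefix as empty and drop stray leading/trailing slashes.
    directory = (consumer_group_directory or "").strip("/")
    if not directory:
        # No prefix: the blob sits directly in the container,
        # which is the behaviour described in the question.
        return partition_id
    return f"{directory}/{partition_id}"


# Example values below are made up for illustration.
print(checkpoint_blob_name(None, "0"))
print(checkpoint_blob_name("my-namespace/my-hub/my-group/", "0"))
```

With a prefix such as `my-namespace/my-hub/my-group`, each partition's blob would then show up under that path inside the lease container instead of at the container root.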

Ivan Glasenberg