
I want to save a Kedro memory dataset to Azure as a file and still have it in memory, as my pipeline will use it later. Is this possible in Kedro? I looked at transcoding datasets, but that doesn't seem to make it possible. Is there any other way to achieve this?

  • I haven't used Azure with Kedro. Is it working for you as a dataset in the catalog? Is it just a matter of saving it *and* keeping it in memory? – Tashus Jan 18 '22 at 14:53
  • Yes, it is working as a dataset (memory dataset). I want to save it to an Azure container, and at the same time other nodes should use the memory dataset rather than read from Azure. – DataEnthusiast Jan 18 '22 at 15:02

2 Answers


This may be a good opportunity to use CachedDataSet. It allows you to wrap any other dataset; once the data has been read into memory, it is made available to downstream nodes without re-performing the IO operations.
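As a sketch of what that could look like (the catalog entry name, Azure container/path, credential keys, and the choice of a pandas CSVDataSet are my assumptions, not part of the original answer):

```python
from kedro.io import CachedDataSet, DataCatalog
from kedro.extras.datasets.pandas import CSVDataSet  # kedro_datasets.pandas in newer releases

# Wrap an Azure-backed dataset: saves write through to blob storage,
# and the data is also cached in memory for downstream nodes.
catalog = DataCatalog(
    {
        "model_input": CachedDataSet(
            dataset=CSVDataSet(
                filepath="abfs://my-container/data/model_input.csv",  # assumed container/path
                credentials={"account_name": "...", "account_key": "..."},  # assumed credential keys
            )
        )
    }
)
```

When a node's output is saved to this entry, it is persisted to Azure and kept in memory at the same time, so later nodes in the same run read the cached copy rather than going back to blob storage.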

datajoely

I would try explicitly saving the dataset to Azure as part of your node logic, i.e. with catalog.save(). Then you can feed the dataset to downstream nodes in memory using the standard node inputs and outputs.
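A rough sketch of that approach (the dataset name, filepath, and credentials below are placeholders I've assumed):

```python
import pandas as pd
from kedro.io import DataCatalog
from kedro.extras.datasets.pandas import CSVDataSet  # kedro_datasets.pandas in newer releases

catalog = DataCatalog(
    {
        "azure_copy": CSVDataSet(
            filepath="abfs://my-container/data/output.csv",  # assumed container/path
            credentials={"account_name": "...", "account_key": "..."},  # assumed credential keys
        )
    }
)

df = pd.DataFrame({"a": [1, 2, 3]})  # stand-in for the node's in-memory result
catalog.save("azure_copy", df)  # explicit write to Azure Blob Storage

# df is untouched by the save; return it from the node as usual so
# downstream nodes receive it as a MemoryDataSet, not a read from Azure.
```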

Tashus