
I want to save a Kedro memory dataset to Azure as a file and still have it in memory, as my pipeline will use it later. Is this possible in Kedro? I looked at transcoding datasets, but that doesn't seem to make it possible. Is there any other way to achieve this?

  • I haven't used Azure with Kedro. Is it working for you as a dataset in the catalog? Is it just a matter of saving it *and* keeping it in memory? – Tashus Jan 18 '22 at 14:53
  • Yes, it is working as a dataset (memory dataset). I want to save it to an Azure container, and at the same time other nodes should use the memory dataset rather than read from Azure. – DataEnthusiast Jan 18 '22 at 15:02

2 Answers


This may be a good opportunity to use CachedDataSet. It allows you to wrap any other dataset; once the data has been read into memory, it is made available to downstream nodes without re-performing the IO operations.
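As a sketch of what that could look like (the catalog entry name, Azure container/path, credential keys, and the choice of a pandas CSVDataSet are my assumptions, not part of the original answer):

```python
from kedro.io import CachedDataSet, DataCatalog
from kedro.extras.datasets.pandas import CSVDataSet  # kedro_datasets.pandas in newer releases

# Wrap an Azure-backed dataset: saves write through to blob storage,
# and the data is also cached in memory for downstream nodes.
catalog = DataCatalog(
    {
        "model_input": CachedDataSet(
            dataset=CSVDataSet(
                filepath="abfs://my-container/data/model_input.csv",  # assumed container/path
                credentials={"account_name": "...", "account_key": "..."},  # assumed credential keys
            )
        )
    }
)
```

When a node's output is saved to this entry, it is persisted to Azure and kept in memory at the same time, so later nodes in the same run read the cached copy rather than going back to blob storage.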

datajoely

I would try explicitly saving the dataset to Azure as part of your node logic, i.e. with catalog.save(). Then you can feed the dataset to downstream nodes in memory using the standard node inputs and outputs.
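A rough sketch of that approach (the dataset name, filepath, and credentials below are placeholders I've assumed):

```python
import pandas as pd
from kedro.io import DataCatalog
from kedro.extras.datasets.pandas import CSVDataSet  # kedro_datasets.pandas in newer releases

catalog = DataCatalog(
    {
        "azure_copy": CSVDataSet(
            filepath="abfs://my-container/data/output.csv",  # assumed container/path
            credentials={"account_name": "...", "account_key": "..."},  # assumed credential keys
        )
    }
)

df = pd.DataFrame({"a": [1, 2, 3]})  # stand-in for the node's in-memory result
catalog.save("azure_copy", df)  # explicit write to Azure Blob Storage

# df is untouched by the save; return it from the node as usual so
# downstream nodes receive it as a MemoryDataSet, not a read from Azure.
```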

Tashus